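Notes on a statistical plot I made today (gorgeous, although the data is about cancer, which is not so lovely).
Until now I always used plt.plot, and I was never very good with it,
but I have now found that seaborn really delivers.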
palette = sns.color_palette("ocean", 2)
sns.kdeplot(data=cancer_data, x='Radius (worst)', fill=True, hue='Diagnosis', palette=palette)  # fill replaces the deprecated shade
Adjusting colors: color link
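Since picking a palette is half the fun, here is a minimal sketch of how palettes can be created and previewed. The palette names are only examples, and the variable names are my own placeholders.

```python
import seaborn as sns

# Any seaborn palette name or matplotlib colormap name can be passed here.
ocean_pair = sns.color_palette("ocean", 2)   # 2 colors sampled from the "ocean" colormap
blues_five = sns.color_palette("Blues", 5)   # 5 shades from a sequential palette

# A palette behaves like a list of RGB tuples; as_hex() gives hex strings.
print(ocean_pair.as_hex())

# In a notebook, palplot draws the swatches so palettes can be compared by eye.
sns.palplot(blues_five)
```

The resulting list can be handed to the `palette=` argument of most seaborn plotting functions, as in the kdeplot call above.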
Trends - A trend is defined as a pattern of change.
sns.lineplot - Line charts are best to show trends over a period of time, and multiple lines can be used to show trends in more than one group.
Example 1: plotting multiple lines
# Line chart showing the number of visitors to each museum over time
plt.figure(figsize=(16,6))
plt.title("asd")
sns.lineplot(data=museum_data)
plt.xlabel('data')
Example 2: plotting a single line:
plt.figure(figsize=(16,6))
sns.lineplot(data=museum_data['Avila Adobe'])
Relationship - There are many different chart types that you can use to understand relationships between variables in your data.
sns.barplot - Bar charts are useful for comparing quantities corresponding to different groups.
Example 1:
plt.figure(figsize=(8,6))
# Bar chart showing average score for racing games by platform
sns.barplot(x=ign_data['Racing'], y=ign_data.index)
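I can't praise this enough, it really looks great. I tried swapping the palette: a chart with this many categories does not suit a single-hue palette, and the default one already looks good:
When I drew it with my favorite palette, Blues: (ugh, so ugly I could cry)
One thing to watch here: it is best to put the numeric values on the horizontal axis; otherwise the category names pile up along the horizontal axis and look bad.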
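A small sketch of that layout tip, using a made-up table in place of ign_data (the platform names and scores below are placeholders, not real values):

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in for ign_data: platforms as the index, one score column.
scores = pd.DataFrame(
    {"Racing": [6.0, 7.0, 7.5, 6.5, 8.0]},
    index=["Platform A", "Platform B", "Platform C", "Platform D", "Platform E"],
)

fig, ax = plt.subplots(figsize=(8, 6))
# Numeric values on x, category names on y: bars run horizontally
# and the names stay readable instead of overlapping.
sns.barplot(x=scores["Racing"], y=scores.index, ax=ax)
ax.set_xlabel("Average racing score")
ax.set_ylabel("")
```

Swapping the x and y arguments gives the vertical version, which is where long category names start to collide.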
sns.heatmap - Heatmaps can be used to find color-coded patterns in tables of numbers.
plt.figure(figsize=(10,10))
sns.heatmap(data=ign_data, annot=True)
A note on this plot: if the numbers are hard to read, enlarge the figure.
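Besides enlarging the figure, shortening the annotations can also help. A minimal sketch on a made-up table (the `table` frame and its labels are hypothetical; `fmt` and `annot_kws` are regular `sns.heatmap` parameters):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical table of scores (rows: platforms, columns: genres).
rng = np.random.default_rng(0)
table = pd.DataFrame(
    rng.uniform(4, 9, size=(8, 6)),
    index=[f"Platform {i}" for i in range(1, 9)],
    columns=[f"Genre {j}" for j in range(1, 7)],
)

# A larger figure gives each cell more room for its annotation.
fig, ax = plt.subplots(figsize=(12, 10))
sns.heatmap(
    data=table,
    annot=True,
    fmt=".1f",               # one decimal keeps the labels short
    annot_kws={"size": 9},   # shrink the annotation font if cells are still tight
    ax=ax,
)
```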
sns.scatterplot - Scatter plots show the relationship between two continuous variables; if color-coded, we can also show the relationship with a third categorical variable.
sns.scatterplot(x='pricepercent',
y='winpercent',
hue='chocolate',
data=candy_data)
Adding the hue parameter colors the points according to the categories in that column.
sns.regplot - Including a regression line in the scatter plot makes it easier to see any linear relationship between two variables.
This draws the regression line directly on the scatter plot, and the line it draws looks great too.
sns.regplot(x='sugarpercent',
y='winpercent',
data=candy_data)
sns.lmplot - This command is useful for drawing multiple regression lines, if the scatter plot contains multiple, color-coded groups.
If you want multiple regression lines:
p=sns.color_palette('winter',2)
sns.lmplot(x='pricepercent',
y='winpercent',
hue='chocolate',
data=candy_data,
palette=p)
I really can't praise this plot enough.
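One thing worth knowing about lmplot: it creates its own figure (it returns a FacetGrid), so a preceding plt.figure(figsize=...) does not size it. A rough sketch, with a synthetic frame standing in for candy_data (the values are randomly generated, only the column names follow the example above):

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for candy_data, generated at random.
rng = np.random.default_rng(0)
n = 60
candy = pd.DataFrame({
    "pricepercent": rng.uniform(0, 1, n),
    "chocolate": rng.choice(["Yes", "No"], n),
})
candy["winpercent"] = 40 + 20 * candy["pricepercent"] + rng.normal(0, 8, n)

# Size the plot with height/aspect instead of plt.figure(figsize=...).
grid = sns.lmplot(
    x="pricepercent",
    y="winpercent",
    hue="chocolate",
    data=candy,
    palette=sns.color_palette("winter", 2),
    height=5,     # height of the facet in inches
    aspect=1.4,   # width = height * aspect
)
grid.set_axis_labels("Price percentile", "Win percentage")
```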
sns.swarmplot - Categorical scatter plots show the relationship between a continuous variable and a categorical variable.
sns.swarmplot(x='chocolate',
y='winpercent',
data=candy_data)
This plot describes two features at once: a categorical one on one axis and a continuous one on the other.
Distribution - We visualize distributions to show the possible values that we can expect to see in a variable, along with how likely they are.
sns.histplot - Histograms show the distribution of a single numerical variable.
palette=sns.color_palette('Blues',2)
sns.histplot(data=cancer_data, x='Area (mean)', hue='Diagnosis', palette=palette)  # color= dropped: it is ignored once hue is set
plt.legend( [' benign ','malignant'])
If you want a KDE curve as well:
palette=sns.color_palette('PuBu',2)
sns.histplot(data=cancer_data, x='Area (mean)', hue='Diagnosis', palette=palette, kde=True)  # color= dropped: it is ignored once hue is set
plt.legend( [' benign ','malignant'])
sns.kdeplot - KDE plots (or 2D KDE plots) show an estimated, smooth distribution of a single numerical variable (or two numerical variables).
A KDE can be seen as a smoothed version of a histogram.
palette = sns.color_palette("ocean", 2)
sns.kdeplot(data=cancer_data, x='Radius (worst)', fill=True, hue='Diagnosis', palette=palette)  # fill replaces the deprecated shade
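Note that a bar chart shows the relationship between two discrete series (a set of categories and one value per category), whereas a histogram counts how many times values of a single variable occur.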
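The difference is easier to see in the numbers behind each chart. A tiny sketch with a made-up frame (the `group` and `value` columns are hypothetical); it relies on the fact that sns.barplot aggregates with the mean by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one categorical column, one numeric column.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b", "c"],
    "value": [1.0, 3.0, 2.0, 2.0, 5.0, 4.0],
})

# What a bar chart draws: one summary value (here the mean) per category.
bar_heights = df.groupby("group")["value"].mean()
print(bar_heights)

# What a histogram draws: how many observations fall into each numeric bin.
counts, bin_edges = np.histogram(df["value"], bins=[1, 2, 3, 4, 5])
print(counts, bin_edges)
```

The bar chart needs both columns, while the histogram only ever looks at `value`.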
sns.jointplot - This command is useful for simultaneously displaying a 2D KDE plot with the corresponding KDE plots for each individual variable.