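Table of Contents
- IV. Examples
- 4.1 Violin plots with Plotly Express
- 4.1.1 Basic violin plot with Plotly Express
- 4.1.2 Violin plot with box and data points
- 4.1.3 Multiple violin plots
- 4.1.4 Overlaid violin plots
- 4.2 Violin plots with graph_objects
- 4.2.1 Basic violin plot
- 4.2.2 Multiple violin traces
- 4.2.3 Grouped violin plot
- 4.2.4 Split violin plot
- 4.2.5 Advanced violin plot
- 4.2.6 Ridgeline plot
- 4.2.7 Violin plot with only points
- 4.2.8 Usage in Dash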
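IV. Examples
4.1 Violin plots with Plotly Express
A violin plot is a statistical representation of numerical data. It is similar to a box plot, with the addition of a rotated kernel density plot on each side.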
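To make the "kernel density" part concrete, the following is a minimal NumPy-only sketch of a Gaussian kernel density estimate, the kind of curve that a violin mirrors on both sides. The helper name `gaussian_kde_curve`, the synthetic sample, and the use of Silverman's rule of thumb for the bandwidth are assumptions made for illustration; Plotly computes its own density internally and may differ in detail.

```python
import numpy as np


def gaussian_kde_curve(values, grid_points=100):
    """Return (grid, density) for a Gaussian KDE of `values` (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    n = values.size
    std = values.std(ddof=1)
    q75, q25 = np.percentile(values, [75, 25])
    # Silverman's rule of thumb for the bandwidth (an assumption for this sketch)
    bandwidth = 0.9 * min(std, (q75 - q25) / 1.34) * n ** (-1 / 5)
    # Evaluate the density slightly beyond the data range so the tails close
    grid = np.linspace(values.min() - 2 * bandwidth,
                       values.max() + 2 * bandwidth, grid_points)
    z = (grid[:, None] - values[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * bandwidth * np.sqrt(2 * np.pi))
    return grid, density


# Synthetic data standing in for a numeric column such as total_bill
rng = np.random.default_rng(0)
sample = rng.normal(loc=20, scale=8, size=244)
grid, density = gaussian_kde_curve(sample)

# Trapezoidal integration: the area under a density curve should be close to 1
area = float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid)))
print(round(area, 2))
```

Plotting `density` against `grid`, mirrored around a vertical axis, gives the outline of a violin for this sample.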
Alternatives to violin plots for visualizing distributions include histograms, box plots, ECDF plots and strip charts.
4.1.1 Basic violin plot with Plotly Express
import plotly.express as px
df = px.data.tips()
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = px.violin(df, y="total_bill")
fig.show()
4.1.2 Violin plot with box and data points
import plotly.express as px
df = px.data.tips()
fig = px.violin(df, y="total_bill",
                box=True,      # draw a box plot inside the violin
                points='all',  # can also be 'outliers' or False
                )
fig.show()
4.1.3 Multiple violin plots
import plotly.express as px
df = px.data.tips()
fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all",
                hover_data=df.columns)
fig.show()
4.1.4 Overlaid violin plots
import plotly.express as px
df = px.data.tips()
fig = px.violin(df, y="tip", color="sex",
                # the default violinmode is 'group'
                violinmode='overlay',  # draw the violins on top of each other
                hover_data=df.columns)
fig.show()
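4.2 Violin plots with graph_objects
If Plotly Express does not provide a good starting point, you can use the more generic go.Violin class from plotly.graph_objects. All the options of go.Violin are documented in the reference at https://plotly.com/python/reference/violin/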
4.2.1 Basic violin plot
import plotly.graph_objects as go
import pandas as pd
# Read the dataset straight from its URL (the original used a local copy at "f:/violin_data.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
print(df)
'''
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
..          ...   ...     ...    ...   ...     ...   ...
239       29.03  5.92    Male     No   Sat  Dinner     3
240       27.18  2.00  Female    Yes   Sat  Dinner     2
241       22.67  2.00    Male    Yes   Sat  Dinner     2
242       17.82  1.75    Male     No   Sat  Dinner     2
243       18.78  3.00  Female     No  Thur  Dinner     2

[244 rows x 7 columns]
'''
fig = go.Figure(data=go.Violin(y=df['total_bill'], box_visible=True, line_color='black',
                               meanline_visible=True, fillcolor='lightseagreen', opacity=0.6,
                               x0='Total Bill'))
fig.update_layout(yaxis_zeroline=False)
fig.show()
4.2.2 Multiple violin traces
import plotly.graph_objects as go
import pandas as pd
# Read the dataset straight from its URL (the original used a local copy at "f:/violin_data.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
fig = go.Figure()
days = ['Thur', 'Fri', 'Sat', 'Sun']
for day in days:
    # one trace per day, selecting that day's rows with a boolean mask
    fig.add_trace(go.Violin(x=df['day'][df['day'] == day],
                            y=df['total_bill'][df['day'] == day],
                            name=day,
                            box_visible=True,
                            meanline_visible=True))
fig.show()
4.2.3 Grouped violin plot
import plotly.graph_objects as go
import pandas as pd
# Read the dataset straight from its URL (the original used a local copy at "f:/violin_data.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
fig = go.Figure()
fig.add_trace(go.Violin(x=df['day'][df['sex'] == 'Male'],
                        y=df['total_bill'][df['sex'] == 'Male'],
                        legendgroup='M', scalegroup='M', name='M',
                        line_color='blue'))
fig.add_trace(go.Violin(x=df['day'][df['sex'] == 'Female'],
                        y=df['total_bill'][df['sex'] == 'Female'],
                        legendgroup='F', scalegroup='F', name='F',
                        line_color='orange'))
fig.update_traces(box_visible=True, meanline_visible=True)
fig.update_layout(violinmode='group')
fig.show()
4.2.4 Split violin plot
import plotly.graph_objects as go
import pandas as pd
# Read the dataset straight from its URL (the original used a local copy at "f:/violin_data.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
fig = go.Figure()
fig.add_trace(go.Violin(x=df['day'][df['smoker'] == 'Yes'],
                        y=df['total_bill'][df['smoker'] == 'Yes'],
                        legendgroup='Yes', scalegroup='Yes', name='Yes',
                        side='negative',
                        line_color='blue'))
fig.add_trace(go.Violin(x=df['day'][df['smoker'] == 'No'],
                        y=df['total_bill'][df['smoker'] == 'No'],
                        legendgroup='No', scalegroup='No', name='No',
                        side='positive',
                        line_color='orange'))
fig.update_traces(meanline_visible=True)
fig.update_layout(violingap=0, violinmode='overlay')
fig.show()
4.2.5 Advanced violin plot
import plotly.graph_objects as go
import pandas as pd
# Read the dataset straight from its URL (the original used a local copy at "f:/violin_data.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
# one pointpos value per day, in the order returned by pd.unique(df['day'])
pointpos_male = [-0.9, -1.1, -0.6, -0.3]
pointpos_female = [0.45, 0.55, 1, 0.4]
show_legend = [True, False, False, False]  # show each legend entry only once
days = pd.unique(df['day'])
fig = go.Figure()
for i, day in enumerate(days):
    male = (df['sex'] == 'Male') & (df['day'] == day)
    female = (df['sex'] == 'Female') & (df['day'] == day)
    fig.add_trace(go.Violin(x=df['day'][male],
                            y=df['total_bill'][male],
                            legendgroup='M', scalegroup='M', name='M',
                            side='negative',
                            pointpos=pointpos_male[i],  # where to position the points
                            line_color='lightseagreen',
                            showlegend=show_legend[i]))
    fig.add_trace(go.Violin(x=df['day'][female],
                            y=df['total_bill'][female],
                            legendgroup='F', scalegroup='F', name='F',
                            side='positive',
                            pointpos=pointpos_female[i],
                            line_color='mediumpurple',
                            showlegend=show_legend[i]))
# update the characteristics shared by all traces
fig.update_traces(meanline_visible=True,
                  points='all',       # show all points
                  jitter=0.05,        # add some jitter on the points for better visibility
                  scalemode='count')  # scale the violin area with the total count
fig.update_layout(
    title_text="Total bill distribution<br><i>scaled by number of bills per gender</i>",
    violingap=0, violingroupgap=0, violinmode='overlay')
fig.show()
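4.2.6 Ridgeline plot
A ridgeline plot (previously known as a Joy Plot) shows the distribution of a numerical value for several groups. Ridgeline plots can be used to visualize changes in distributions over time or space.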
import plotly.graph_objects as go
from plotly.colors import n_colors
import numpy as np
np.random.seed(1)
# 12 sets of normally distributed random data, with increasing mean and standard deviation
data = (np.linspace(1, 2, 12)[:, np.newaxis] * np.random.randn(12, 200) +
        (np.arange(12) + 2 * np.random.random(12))[:, np.newaxis])
colors = n_colors('rgb(5, 200, 200)', 'rgb(200, 10, 10)', 12, colortype='rgb')
fig = go.Figure()
for data_line, color in zip(data, colors):
    fig.add_trace(go.Violin(x=data_line, line_color=color))
fig.update_traces(orientation='h', side='positive', width=3, points=False)
fig.update_layout(xaxis_showgrid=False, xaxis_zeroline=False)
fig.show()
4.2.7 Violin plot with only points
A strip chart is like a violin plot with the points showing and no violin:
import plotly.express as px
df = px.data.tips()
fig = px.strip(df, x='day', y='tip')
fig.show()
4.2.8 Usage in Dash
import plotly.graph_objects as go  # or plotly.express as px
fig = go.Figure()  # or any Plotly Express function e.g. px.bar(...)
# fig.add_trace( ... )
# fig.update_layout( ... )
# Dash 2 and later ship dcc and html inside the dash package; the separate
# dash_core_components and dash_html_components packages are deprecated.
from dash import Dash, dcc, html
app = Dash()
app.layout = html.Div([
    dcc.Graph(figure=fig)
])
# Turn off reloader if inside Jupyter; older Dash versions use app.run_server(...) instead
app.run(debug=True, use_reloader=False)