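Scatter plots and regression lines
- A scatter plot uses points or circles to show the values of two different numeric variables. The position of each point along the horizontal and vertical axes gives the values of a single data point.
- Scatter plots are useful for observing the relationship between variables.
- A regression line is the straight line that best fits the data. With ordinary least squares, it is the line that minimizes the sum of the squared vertical distances between the line and the plotted points.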
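To make the "best fit" idea concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function name fit_line and the sample values are made up for illustration; this is not how Altair computes the line internally, only the standard closed-form formula for a straight-line fit.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line.

    fit_line is a hypothetical helper written for this illustration.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data that lies exactly on y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Because the sample points lie exactly on a line, the fit recovers that line. With noisy data the same formula returns the line with the smallest sum of squared vertical residuals.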
Installation
pip install altair
The datasets in this article come from the vega_datasets package. To install it, run the following command at the command prompt.
pip install vega_datasets
Example 1: Default scatter plot with a regression line on the airports dataset
# importing libraries
import altair as alt
from vega_datasets import data
# importing airports dataset from
# vega_datasets package
airport = data.airports()
# making the scatter plot on latitude and longitude
fig = alt.Chart(airport).mark_point().encode(x='latitude',y='longitude')
# making the regression line using transform_regression
# function and add with the scatter plot
final_plot = fig + fig.transform_regression('latitude','longitude').mark_line()
# saving the scatter plot with regression line
final_plot.save('output1.html')
Output
Example 2: Scatter plot with a regression line on the airports dataset, colored by country
# importing libraries
import altair as alt
from vega_datasets import data
# importing airports dataset from vega_datasets package
airport = data.airports()
# making the scatter plot on latitude and longitude
# setting color on the basis of country
fig = alt.Chart(airport).mark_point().encode(
x='latitude',y='longitude',color='country')
# making the regression line using transform_regression
# function and add with the scatter plot
final_plot = fig + fig.transform_regression('latitude','longitude').mark_line()
# saving the scatter plot with regression line
final_plot.save('output2.html')
Output
Example 3: Default scatter plot with a regression line on the seattle_weather dataset
# importing libraries
import altair as alt
from vega_datasets import data
# importing weather dataset from vega_datasets package
weather_data = data.seattle_weather()
# making the scatter plot on temp_max and temp_min
fig = alt.Chart(weather_data).mark_point().encode(x='temp_max',y='temp_min')
# making the regression line using transform_regression
# function and add with the scatter plot
final_plot = fig + fig.transform_regression('temp_max','temp_min').mark_line()
# saving the scatter plot with regression line
final_plot.save('output3.html')
Output
Example 4: Scatter plot with a regression line on the seattle_weather dataset, colored by weather type
# importing libraries
import altair as alt
from vega_datasets import data
# importing weather dataset from vega_datasets package
weather_data = data.seattle_weather()
# making the scatter plot on temp_max and temp_min
fig = alt.Chart(weather_data).mark_point().encode(
x='temp_max',y='temp_min',color='weather')
# making the regression line using transform_regression
# function and add with the scatter plot
final_plot = fig + fig.transform_regression('temp_max','temp_min').mark_line()
# saving the scatter plot with regression line
final_plot.save('output4.html')
Output