[Data Analysis and Visualization] Drawing polished tables with matplotlib and plottable

news2024/9/22 11:41:05

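plottable is a Python library for drawing polished, customizable figure tables in matplotlib. Its official repository is: plottable. This article mainly follows the official documentation, available at: plottable-doc. Install plottable with the following command: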

pip install plottable

All code for this article is available at: Python-Study-Notes

# Suppress warnings in the Jupyter notebook environment
import warnings
warnings.filterwarnings("ignore")
import plottable
# Print the plottable version
print('plottable version:', plottable.__version__)
# Print the matplotlib version
import matplotlib
print('matplotlib version:', matplotlib.__version__)
plottable version: 0.1.5
matplotlib version: 3.5.3

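Contents

  • 1 Usage
    • 1.1 Basic usage
    • 1.2 Customizing column styles
    • 1.3 Customizing rows and columns
  • 2 Plotting examples
    • 2.1 Multi-row styling
    • 2.2 Custom cell effects
    • 2.3 Heatmap
    • 2.4 Women's World Cup forecast table
    • 2.5 Bundesliga standings table
  • 3 References

1 Usage

1.1 Basic usage

The code below shows a simple example of drawing a figure table. plottable provides the Table class for creating and displaying figure tables.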

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from plottable import Table

# Build a DataFrame of random data
d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Create and draw the figure table from the pandas DataFrame
tab = Table(d)

# Save the figure
plt.savefig("table.jpg", dpi=300,bbox_inches='tight')
plt.show()

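The constructor parameters of plottable's Table class are:

  • df: pd.DataFrame, the DataFrame to display as a table
  • ax: mpl.axes.Axes, the Axes to draw the table on; defaults to None
  • index_col: str, name of the DataFrame column to use as the index; defaults to None
  • columns: List[str], the columns to plot; None means all columns are used
  • column_definitions: List[ColumnDefinition], style definitions for the columns that need styling; defaults to None
  • textprops: Dict[str, Any], dictionary of text properties; defaults to an empty dict
  • cell_kw: Dict[str, Any], dictionary of cell properties; defaults to an empty dict
  • col_label_cell_kw: Dict[str, Any], dictionary of properties for the column-label cells; defaults to an empty dict
  • col_label_divider: bool, whether to draw a divider line below the column labels; defaults to True
  • footer_divider: bool, whether to draw a divider line below the table; defaults to False
  • row_dividers: bool, whether to show row divider lines; defaults to True
  • row_divider_kw: Dict[str, Any], dictionary of row divider properties; defaults to an empty dict
  • col_label_divider_kw: Dict[str, Any], dictionary of column-label divider properties; defaults to an empty dict
  • footer_divider_kw: Dict[str, Any], dictionary of footer divider properties; defaults to an empty dict
  • column_border_kw: Dict[str, Any], dictionary of column border properties; defaults to an empty dict
  • even_row_color: str | Tuple, fill color of the cells in even rows; defaults to None
  • odd_row_color: str | Tuple, fill color of the cells in odd rows; defaults to None

Among these parameters, the ones that control the table's appearance fall into the following groups:

  • column_definitions: custom column styles
  • textprops: custom text styles
  • cell_kw: custom table cell styles
  • the remaining style parameters

The most important of these is column_definitions, because it can control almost every aspect of the rendering. The rest of this section focuses on how to use column_definitions.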

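1.2 Customizing column styles

plottable provides the ColumnDefinition class (alias ColDef) for customizing the style of a single column of the figure table. The constructor parameters of ColumnDefinition are:

  • name: str, name of the column to style
  • title: str = None, plot title that overrides the column name
  • width: float = 1, width of the column; by default each column's width is the axes width divided by the number of columns
  • textprops: Dict[str, Any] = field(default_factory=dict), text properties applied to every text cell
  • formatter: Callable = None, a callable that formats how the text is displayed
  • cmap: Callable = None, a callable that returns the cell background color based on the cell's value
  • text_cmap: Callable = None, a callable that returns the text color based on the cell's value
  • group: str = None, group name; columns that share a group get a grouped column label displayed above their column labels
  • plot_fn: Callable = None, a callable that takes the cell's value as input; a subplot is created on each cell and the function draws on it. To pass additional arguments to it, use plot_kw
  • plot_kw: Dict[str, Any] = field(default_factory=dict), additional keyword arguments passed to plot_fn
  • border: str | List = None, draws vertical border lines; can be "left" / "l", "right" / "r", or "both"

Passing ColumnDefinition objects to the Table class's column_definitions parameter styles individual table columns. To style several columns at once, pass a list of the form [ColumnDefinition, ColumnDefinition]. Some usage examples follow.

Setting column titles and widths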

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from plottable import ColumnDefinition, ColDef, Table

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# name selects which column to style
tab = Table(d, column_definitions=[ColumnDefinition(name="A", title="Title A"),
                                   ColumnDefinition(name="D", width=2)])

plt.show()

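As a rough illustration of what width=2 means in the example above, the sketch below assumes that widths are relative units, so a column's share of the table is its width divided by the sum of all widths. This reading follows from the parameter description (default width 1, each column getting an equal share) and is an assumption, not a statement about plottable's internals. The widths dictionary is hypothetical and mirrors the example: the index column plus A to E, with D set to 2.

```python
# Hypothetical widths mirroring the example: index column plus A-E, with D widened to 2.
widths = {"index": 1, "A": 1, "B": 1, "C": 1, "D": 2, "E": 1}

# Assumption: each column's share of the table is width / sum of all widths.
total = sum(widths.values())
shares = {name: w / total for name, w in widths.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the table width")
```

Under this assumption, column D takes 2/7 of the table and every other column takes 1/7.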

Setting a column's text properties and text format

from plottable.formatters import decimal_to_percent

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Center and bold the text of the index column
# Display the values of column A as percentages
tab = Table(d, column_definitions=[ColumnDefinition(name="index", textprops={"ha": "center", "weight": "bold"}),
                                   ColumnDefinition(name="A", formatter=decimal_to_percent)])

plt.show()

Setting column cell background and font colors

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# cmap sets the cell background color; text_cmap sets the text color
tab = Table(d, column_definitions=[ColumnDefinition(name="A", cmap=matplotlib.cm.tab20, text_cmap=matplotlib.cm.Reds),
                                   ColumnDefinition(name="B", cmap=matplotlib.cm.tab20b),
                                   ColumnDefinition(name="C", text_cmap=matplotlib.cm.tab20c)])

plt.show()

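The cmap and text_cmap arguments are described above as callables that map a cell value to a color. A matplotlib colormap is such a callable: given a number in [0, 1] it returns an RGBA tuple. The sketch below only uses matplotlib and shows what the colormap returns for a few sample values; the sample values and the explicit Normalize step are illustrative assumptions, not part of the original example.

```python
import matplotlib
from matplotlib.colors import Normalize, to_hex

# Hypothetical cell values in [0, 1], like the random data used in this article.
values = [0.12, 0.5, 0.93]

reds = matplotlib.colormaps["Reds"]      # a Colormap object is callable
norm = Normalize(vmin=0.0, vmax=1.0)     # rescales raw values to [0, 1] before lookup

for v in values:
    rgba = reds(norm(v))                 # (r, g, b, a) tuple with floats in [0, 1]
    print(v, to_hex(rgba))
```

For data outside [0, 1], the normalization step matters: a callable such as `lambda v: reds(norm(v))` with suitable vmin and vmax can be passed wherever a value-to-color callable is expected.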

Setting column group names

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Treat columns B and C as one group, named group_name
tab = Table(d, column_definitions=[ColumnDefinition(name="B", group="group_name"), 
                                   ColumnDefinition(name="C", group="group_name")])

plt.show()

Setting column borders

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Draw a left border on column A and borders on both sides of column C
tab = Table(d, column_definitions=[ColumnDefinition(name="A", border="l"), 
                                   ColumnDefinition(name="C",  border="both")])

plt.show()

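Using plot functions

The plot_fn and plot_kw parameters of ColumnDefinition let you draw cell content with a custom function: plot_fn is the function to call, and plot_kw holds the keyword arguments passed to it. plottable.plots also ships several built-in plot functions, which can serve as references when writing your own. The built-in plot functions are used as follows: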

from pathlib import Path
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import LinearSegmentedColormap
from plottable import ColumnDefinition, Table
# Import the built-in plot functions
from plottable.plots import image,monochrome_image,circled_image,bar,percentile_bars,percentile_stars,progress_donut

cmap = matplotlib.cm.tab20
# Bar plot
fig, ax = plt.subplots(figsize=(1, 1))
# 0.7 is the value to plot; lw is the border line width
b = bar(ax, 0.7, plot_bg_bar=True, cmap=cmap, annotate=True, lw=2, height=0.35)
plt.show()

# Star rating plot
fig, ax = plt.subplots(figsize=(2, 1))
stars = percentile_stars(ax, 85, background_color="#f0f0f0")
plt.show()

# Donut plot
fig, ax = plt.subplots(figsize=(1, 1))
donut = progress_donut(ax, 73, textprops={"fontsize": 14})
plt.show()

You can use the help function to see what the parameters of these plot functions mean.

help(progress_donut)
Help on function progress_donut in module plottable.plots:

progress_donut(ax: matplotlib.axes._axes.Axes, val: float, radius: float = 0.45, color: str = None, background_color: str = None, width: float = 0.05, is_pct: bool = False, textprops: Dict[str, Any] = {}, formatter: Callable = None, **kwargs) -> List[matplotlib.patches.Wedge]
    Plots a Progress Donut on the axes.
    
    Args:
        ax (matplotlib.axes.Axes): Axes
        val (float): value
        radius (float, optional):
            radius of the progress donut. Defaults to 0.45.
        color (str, optional):
            color of the progress donut. Defaults to None.
        background_color (str, optional):
            background_color of the progress donut where the value is not reached. Defaults to None.
        width (float, optional):
            width of the donut wedge. Defaults to 0.05.
        is_pct (bool, optional):
            whether the value is given not as a decimal, but as a value between 0 and 100.
            Defaults to False.
        textprops (Dict[str, Any], optional):
            textprops passed to ax.text. Defaults to {}.
        formatter (Callable, optional):
            a string formatter.
            Can either be a string format, ie "{:2f}" for 2 decimal places.
            Or a Callable that is applied to the value. Defaults to None.
    
    Returns:
        List[matplotlib.patches.Wedge]

Setting plot_fn and plot_kw selects the plot function and its input arguments, which produces different cell renderings, as shown below:

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Set plot_fn and plot_kw for columns B and D
tab = Table(d, textprops={"ha": "center"},
            column_definitions=[ColumnDefinition(name="B", plot_fn=bar,plot_kw={'plot_bg_bar':True,'cmap':cmap, 
                                'annotate':True, 'height':0.5}),
                                ColumnDefinition(name="D", plot_fn=progress_donut,plot_kw={'is_pct':True,})])

plt.show()

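Since the built-in functions are meant as templates for your own, here is a minimal sketch of a custom plot function. It assumes, based on the signature of progress_donut shown by help above, that a plot function receives the Axes first, the cell value second, and the plot_kw entries as keyword arguments. The function name status_dot and its threshold and radius parameters are hypothetical.

```python
from matplotlib.figure import Figure
from matplotlib.patches import Circle


def status_dot(ax, val, threshold=0.5, radius=0.3, **kwargs):
    """Hypothetical plot function: a green dot if val >= threshold, else a red one."""
    color = "tab:green" if val >= threshold else "tab:red"
    dot = Circle((0.5, 0.5), radius, color=color, **kwargs)
    ax.add_patch(dot)
    # Fix the view so the dot is centred and round, and hide the axis frame.
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_aspect("equal")
    ax.axis("off")
    return dot


# Stand-alone check on a plain Figure, the same way bar() is tried out above.
fig = Figure(figsize=(1, 1))
ax = fig.subplots()
dot = status_dot(ax, 0.7)
print(dot.get_facecolor())
```

Under the same assumption, such a function would be attached to a column with ColumnDefinition(name="B", plot_fn=status_dot, plot_kw={"threshold": 0.5}).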

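Custom text formats

plottable provides the following three formatter functions for common text formats:

  • decimal_to_percent: converts a numeric value to a percentage
  • tickcross: formats a value as ✔ or ✖
  • signed_integer: adds a plus or minus sign

The text format is set through the formatter parameter of ColumnDefinition, as shown below: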

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from plottable import ColumnDefinition, Table
from plottable.formatters import decimal_to_percent,tickcross,signed_integer

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
tab = Table(d, column_definitions=[ColumnDefinition(name="A", formatter=decimal_to_percent),
                                   ColumnDefinition(name="C", formatter=tickcross),
                                   ColumnDefinition(name="D", formatter=signed_integer)])

plt.show()

You can also define your own function to set the text format, as shown below:

def setformat(x):
    # Format the value in scientific notation with str.format
    return "{:.2e}".format(x)

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
tab = Table(d, textprops={"ha": "center"},column_definitions=[ColumnDefinition(name="B", formatter=setformat),
                                   ColumnDefinition(name="D", formatter=lambda x: round(x, 2))])

plt.show()

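This article uses two kinds of formatter: callables such as setformat, and plain format strings such as "{:.0%}" and "{:+}" in the examples of section 2. The sketch below shows, in plain Python, what each kind produces for a sample value. The helper apply_formatter is hypothetical and only mimics the "format string or callable" behavior described in the help text of progress_donut; it is not plottable's own code.

```python
def setformat(x):
    # Scientific notation with two decimals, as in the example above.
    return "{:.2e}".format(x)


def apply_formatter(formatter, value):
    """Hypothetical helper: accept either a format string or a callable."""
    if isinstance(formatter, str):
        return formatter.format(value)
    return formatter(value)


print(apply_formatter("{:+}", 54))        # +54
print(apply_formatter("{:.0%}", 0.73))    # 73%
print(apply_formatter(setformat, 0.04))   # 4.00e-02
```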

1.3 Customizing rows and columns

Accessing row and column cells

plottable lets you access a given row or column of a Table instance directly, as shown below:

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Instantiate the Table object
tab = Table(d)

# Get a whole column by its column name
tab.columns['A']
Column(cells=[TextCell(xy=(1, 0), content=0.0, row_idx=0, col_idx=1), TextCell(xy=(1, 1), content=0.09, row_idx=1, col_idx=1), TextCell(xy=(1, 2), content=0.95, row_idx=2, col_idx=1), TextCell(xy=(1, 3), content=0.08, row_idx=3, col_idx=1), TextCell(xy=(1, 4), content=0.92, row_idx=4, col_idx=1)], index=1)
# Read the content of the cell at row index 1 of a column
tab.columns['B'].cells[1].content
0.04
# Get a whole row by its row index
tab.rows[1]
Row(cells=[TextCell(xy=(0, 1), content=1, row_idx=1, col_idx=0), TextCell(xy=(1, 1), content=0.09, row_idx=1, col_idx=1), TextCell(xy=(2, 1), content=0.04, row_idx=1, col_idx=2), TextCell(xy=(3, 1), content=0.42, row_idx=1, col_idx=3), TextCell(xy=(4, 1), content=0.64, row_idx=1, col_idx=4), TextCell(xy=(5, 1), content=0.26, row_idx=1, col_idx=5)], index=1)
# Get the header row that holds the column labels
tab.col_label_row
Row(cells=[TextCell(xy=(0, -1), content=index, row_idx=-1, col_idx=0), TextCell(xy=(1, -1), content=A, row_idx=-1, col_idx=1), TextCell(xy=(2, -1), content=B, row_idx=-1, col_idx=2), TextCell(xy=(3, -1), content=C, row_idx=-1, col_idx=3), TextCell(xy=(4, -1), content=D, row_idx=-1, col_idx=4), TextCell(xy=(5, -1), content=E, row_idx=-1, col_idx=5)], index=-1)

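Setting cell styles

The example above shows that plottable gives direct access to the table's row and column objects, so you can change their cell or text appearance directly by calling setters on them. The supported setters are:

  • Cell properties
    • set_alpha: sets the transparency of the cell.
    • set_color: sets the color of the cell.
    • set_edgecolor: sets the color of the cell edge.
    • set_facecolor: sets the fill color of the cell.
    • set_fill: sets whether the cell is filled.
    • set_hatch: sets the hatch pattern of the cell.
    • set_linestyle: sets the line style of the cell edge.
    • set_linewidth: sets the line width of the cell edge.
  • Font properties
    • set_fontcolor: sets the font color.
    • set_fontfamily: sets the font family.
    • set_fontsize: sets the font size.
    • set_ha: sets the horizontal alignment of the text.
    • set_ma: sets the alignment of multi-line text (matplotlib's multialignment).

Example code: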

from plottable.cmap import normed_cmap
import matplotlib.cm

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)
fig, ax = plt.subplots(figsize=(6, 5))
# Instantiate the Table object
tab = Table(d)
# Set the background color of the row with index 1
tab.rows[1].set_facecolor("grey")
# Set the font color of column B
tab.columns['B'].set_fontcolor("red")
Column(cells=[TextCell(xy=(2, 0), content=0.38, row_idx=0, col_idx=2), TextCell(xy=(2, 1), content=0.69, row_idx=1, col_idx=2), TextCell(xy=(2, 2), content=0.15, row_idx=2, col_idx=2), TextCell(xy=(2, 3), content=0.74, row_idx=3, col_idx=2), TextCell(xy=(2, 4), content=0.41, row_idx=4, col_idx=2)], index=2)

2 Plotting examples

2.1 Multi-row styling

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from plottable import Table

d = pd.DataFrame(np.random.random((5, 5)), columns=["A", "B", "C", "D", "E"]).round(2)

fig, ax = plt.subplots(figsize=(6, 3))

# row_dividers: whether to draw row divider lines
# odd_row_color: fill color of odd rows
# even_row_color: fill color of even rows
tab = Table(d, row_dividers=False, odd_row_color="#f0f0f0", even_row_color="#e0f6ff")

plt.show()

fig.savefig("table.jpg",dpi=300,bbox_inches='tight')

2.2 Custom cell effects

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import LinearSegmentedColormap

from plottable import ColumnDefinition, Table
from plottable.formatters import decimal_to_percent
from plottable.plots import bar, percentile_bars, percentile_stars, progress_donut

# Custom colormap
cmap = LinearSegmentedColormap.from_list(
    name="BuYl", colors=["#01a6ff", "#eafedb", "#fffdbb", "#ffc834"], N=256
)

fig, ax = plt.subplots(figsize=(6, 6))

d = pd.DataFrame(np.random.random((5, 4)), columns=["A", "B", "C", "D"]).round(2)

tab = Table(
    d,
    cell_kw={
        "linewidth": 0,
        "edgecolor": "k",
    },
    textprops={"ha": "center"},
    column_definitions=[
        ColumnDefinition("index", textprops={"ha": "left"}),
        ColumnDefinition("A", plot_fn=percentile_bars, plot_kw={"is_pct": True}),
        ColumnDefinition(
            "B", width=1.5, plot_fn=percentile_stars, plot_kw={"is_pct": True}
        ),
        ColumnDefinition(
            "C",
            plot_fn=progress_donut,
            plot_kw={
                "is_pct": True,
                "formatter": "{:.0%}"
                },
            ),
        ColumnDefinition(
            "D",
            width=1.25,
            plot_fn=bar,
            plot_kw={
                "cmap": cmap,
                "plot_bg_bar": True,
                "annotate": True,
                "height": 0.5,
                "lw": 0.5,
                "formatter": decimal_to_percent,
            },
        ),
    ],
)

plt.show()

2.3 Heatmap

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import LinearSegmentedColormap
# ColDef is an alias of ColumnDefinition
from plottable import ColDef, Table

# Custom colormap
cmap = LinearSegmentedColormap.from_list(
    name="bugw", colors=["#ffffff", "#f2fbd2", "#c9ecb4", "#93d3ab", "#35b0ab"], N=256
)
# Create the data
cities = [
    "TORONTO",
    "VANCOUVER",
    "HALIFAX",
    "CALGARY",
    "OTTAWA",
    "MONTREAL",
    "WINNIPEG",
    "EDMONTON",
    "LONDON",
    "ST. JONES",
]
months = [
    "JAN",
    "FEB",
    "MAR",
    "APR",
    "MAY",
    "JUN",
    "JUL",
    "AUG",
    "SEP",
    "OCT",
    "NOV",
    "DEC",
]

data = np.random.random((10, 12)) + np.abs(np.arange(12) - 5.5)
data = (1 - data / (np.max(data)))
data.shape
(10, 12)
# Plot
d = pd.DataFrame(data, columns=months, index=cities).round(2)
fig, ax = plt.subplots(figsize=(14, 5))

# Define the plot style of each column
column_definitions = [
    ColDef(name, cmap=cmap, formatter=lambda x: "") for name in months
] + [ColDef("index", title="", width=1.5, textprops={"ha": "right"})]

tab = Table(
    d,
    column_definitions=column_definitions,
    row_dividers=False,
    col_label_divider=False,
    textprops={"ha": "center", "fontname": "Roboto"},
    # Style applied to every cell
    cell_kw={
        "edgecolor": "black",
        "linewidth": 0,
    },
)


# Set the text and background colors of the column labels
tab.col_label_row.set_facecolor("white")
tab.col_label_row.set_fontcolor("black")
# Set the text and background colors of the row labels
tab.columns["index"].set_facecolor("black")
tab.columns["index"].set_fontcolor("white")
tab.columns["index"].set_linewidth(0)

plt.show()

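The two lines that build the heatmap data are compact, so here is a small NumPy-only sketch of what they do: a (10, 12) block of noise plus a 12-element "distance from mid-year" profile that broadcasts across all rows, then a rescaling into [0, 1) so that the middle months get the highest values. The seeded generator is an assumption added to make the sketch reproducible; the original uses np.random.random.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only so the sketch is reproducible

noise = rng.random((10, 12))              # one row per city, one column per month
distance = np.abs(np.arange(12) - 5.5)    # 5.5, 4.5, ..., 0.5, 0.5, ..., 4.5, 5.5
data = noise + distance                   # broadcasting adds the same profile to every row
data = 1 - data / np.max(data)            # rescale: the largest raw value becomes 0

# Column means rise towards the middle of the year and fall again.
print(data.mean(axis=0).round(2))
```

Because the monthly profile differs by a full unit between neighboring months while the noise stays below 1, the mid-year columns (JUN and JUL) always end up with the highest averages, which is what gives the heatmap its seasonal look.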

2.4 Women's World Cup forecast table

step1 Prepare the data

Download the example data; all example data is in plottable-example_notebooks.

# Download the dataset
# !wget https://raw.githubusercontent.com/znstrider/plottable/master/docs/example_notebooks/data/wwc_forecasts.csv
from pathlib import Path

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import LinearSegmentedColormap

from plottable import ColumnDefinition, Table
from plottable.cmap import normed_cmap
from plottable.formatters import decimal_to_percent
from plottable.plots import circled_image # image
cols = [
    "team",
    "points",
    "group",
    "spi",
    "global_o",
    "global_d",
    "group_1",
    "group_2",
    "group_3",
    "make_round_of_16",
    "make_quarters",
    "make_semis",
    "make_final",
    "win_league",
]

# Read the data
df = pd.read_csv(
    "data/wwc_forecasts.csv",
    usecols=cols,
)

# Show the data
df.head()
   team     group  spi       global_o  global_d  group_1  group_2  group_3  make_round_of_16  make_quarters  make_semis  make_final  win_league  points
0  USA      F      98.32748  5.52561   0.58179   0.82956  0.17044  0.00000  1.0               0.78079        0.47307     0.35076     0.23618     6
1  France   A      96.29671  4.31375   0.52137   0.99483  0.00515  0.00002  1.0               0.78367        0.42052     0.30038     0.19428     6
2  Germany  B      93.76549  3.96791   0.67818   0.98483  0.01517  0.00000  1.0               0.89280        0.48039     0.27710     0.12256     6
3  Canada   E      93.51599  3.67537   0.56980   0.38830  0.61170  0.00000  1.0               0.59192        0.36140     0.20157     0.09031     6
4  England  D      91.92311  3.51560   0.63717   0.70570  0.29430  0.00000  1.0               0.68510        0.43053     0.16465     0.08003     6

We also need a flag image for each country; these files are also under plottable-example_notebooks.

# Collect the image paths and index them by country name
flag_paths = list(Path("data/country_flags").glob("*.png"))
country_to_flagpath = {p.stem: p for p in flag_paths}
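The lookup table above relies on Path.stem, which is the file name without its final suffix, so each flag file must be named exactly like the team it belongs to. The sketch below illustrates this with two hypothetical paths standing in for the glob result; PurePosixPath is used only so that the sketch does not touch the file system.

```python
from pathlib import PurePosixPath

# Hypothetical file list standing in for Path("data/country_flags").glob("*.png").
flag_paths = [
    PurePosixPath("data/country_flags/USA.png"),
    PurePosixPath("data/country_flags/France.png"),
]

# Same dictionary comprehension as above: file name without ".png" -> path.
country_to_flagpath = {p.stem: p for p in flag_paths}

print(country_to_flagpath.get("USA"))        # data/country_flags/USA.png
print(country_to_flagpath.get("Atlantis"))   # None: no file with that name
```

A team whose name has no matching file gets None from .get, so it is worth checking the resulting Flag column for missing values before plotting.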

step2 Process the data

This step merges the data and converts it into a structure that plottable can use.

# New column names
colnames = [
    "Team",
    "Points",
    "Group",
    "SPI",
    "OFF",
    "DEF",
    "1st Place",
    "2nd Place",
    "3rd Place",
    "Make Rd Of 16",
    "Make Quarters",
    "Make Semis",
    "Make Finals",
    "Win World Cup",
]

col_to_name = dict(zip(cols, colnames))
col_to_name
{'team': 'Team',
 'points': 'Points',
 'group': 'Group',
 'spi': 'SPI',
 'global_o': 'OFF',
 'global_d': 'DEF',
 'group_1': '1st Place',
 'group_2': '2nd Place',
 'group_3': '3rd Place',
 'make_round_of_16': 'Make Rd Of 16',
 'make_quarters': 'Make Quarters',
 'make_semis': 'Make Semis',
 'make_final': 'Make Finals',
 'win_league': 'Win World Cup'}
df[["spi", "global_o", "global_d"]] = df[["spi", "global_o", "global_d"]].round(1)

df = df.rename(col_to_name, axis=1)
# Drop the Points column
df = df.drop("Points", axis=1)
# Insert the Flag column
df.insert(0, "Flag", df["Team"].apply(lambda x: country_to_flagpath.get(x)))
df = df.set_index("Team")
df.head()
         Flag                            Group  SPI   OFF  DEF  1st Place  2nd Place  3rd Place  Make Rd Of 16  Make Quarters  Make Semis  Make Finals  Win World Cup
Team
USA      data/country_flags/USA.png      F      98.3  5.5  0.6  0.82956    0.17044    0.00000    1.0            0.78079        0.47307     0.35076      0.23618
France   data/country_flags/France.png   A      96.3  4.3  0.5  0.99483    0.00515    0.00002    1.0            0.78367        0.42052     0.30038      0.19428
Germany  data/country_flags/Germany.png  B      93.8  4.0  0.7  0.98483    0.01517    0.00000    1.0            0.89280        0.48039     0.27710      0.12256
Canada   data/country_flags/Canada.png   E      93.5  3.7  0.6  0.38830    0.61170    0.00000    1.0            0.59192        0.36140     0.20157      0.09031
England  data/country_flags/England.png  D      91.9  3.5  0.6  0.70570    0.29430    0.00000    1.0            0.68510        0.43053     0.16465      0.08003

step3 Plot

# Set the colormap
cmap = LinearSegmentedColormap.from_list(
    name="bugw", colors=["#ffffff", "#f2fbd2", "#c9ecb4", "#93d3ab", "#35b0ab"], N=256
)
team_rating_cols = ["SPI", "OFF", "DEF"]
group_stage_cols = ["1st Place", "2nd Place", "3rd Place"]
knockout_stage_cols = list(df.columns[-5:])

# Define the plot parameters of each column individually
col_defs = (
    # Part 1: flag, team, group, and team rating columns
    [
        ColumnDefinition(
            name="Flag",
            title="",
            textprops={"ha": "center"},
            width=0.5,
            # Plot function used to render this column
            plot_fn=circled_image,
        ),
        ColumnDefinition(
            name="Team",
            textprops={"ha": "left", "weight": "bold"},
            width=1.5,
        ),
        ColumnDefinition(
            name="Group",
            textprops={"ha": "center"},
            width=0.75,
        ),
        ColumnDefinition(
            name="SPI",
            group="Team Rating",
            textprops={"ha": "center"},
            width=0.75,
        ),
        ColumnDefinition(
            name="OFF",
            width=0.75,
            textprops={
                "ha": "center",
                # Draw the text inside a circular box
                "bbox": {"boxstyle": "circle", "pad": 0.35},
            },
            cmap=normed_cmap(df["OFF"], cmap=matplotlib.cm.PiYG, num_stds=2.5),
            group="Team Rating",
        ),
        ColumnDefinition(
            name="DEF",
            width=0.75,
            textprops={
                "ha": "center",
                "bbox": {"boxstyle": "circle", "pad": 0.35},
            },
            cmap=normed_cmap(df["DEF"], cmap=matplotlib.cm.PiYG_r, num_stds=2.5),
            group="Team Rating",
        ),
    ]
    # Part 2: group stage columns
    + [
        ColumnDefinition(
            name=group_stage_cols[0],
            title=group_stage_cols[0].replace(" ", "\n", 1),
            formatter=decimal_to_percent,
            group="Group Stage Chances",
            # Draw a border on the left side
            border="left",
        )
    ]
    + [
        ColumnDefinition(
            name=col,
            title=col.replace(" ", "\n", 1),
            formatter=decimal_to_percent,
            group="Group Stage Chances",
        )
        for col in group_stage_cols[1:]
    ]
    # Part 3: knockout stage columns
    + [
        ColumnDefinition(
            name=knockout_stage_cols[0],
            title=knockout_stage_cols[0].replace(" ", "\n", 1),
            formatter=decimal_to_percent,
            cmap=cmap,
            group="Knockout Stage Chances",
            border="left",
        )
    ]
    + [
        ColumnDefinition(
            name=col,
            title=col.replace(" ", "\n", 1),
            formatter=decimal_to_percent,
            cmap=cmap,
            group="Knockout Stage Chances",
        )
        for col in knockout_stage_cols[1:]
    ]
)
# Plot
fig, ax = plt.subplots(figsize=(18, 18))

table = Table(
    df,
    column_definitions=col_defs,
    row_dividers=True,
    footer_divider=True,
    ax=ax,
    textprops={"fontsize": 14},
    row_divider_kw={"linewidth": 1, "linestyle": (0, (1, 5))},
    col_label_divider_kw={"linewidth": 1, "linestyle": "-"},
    column_border_kw={"linewidth": 1, "linestyle": "-"},
).autoset_fontcolors(colnames=["OFF", "DEF"])


plt.show()
# Save the figure
fig.savefig("wwc_table.jpg", facecolor=ax.get_facecolor(), dpi=300,bbox_inches='tight')

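The column titles in col_defs are built with col.replace(" ", "\n", 1). The third argument limits the replacement to the first space, so each title is split into exactly two lines and fits a narrow column. A quick plain-Python check using the knockout-stage column names from this example:

```python
knockout_stage_cols = [
    "Make Rd Of 16",
    "Make Quarters",
    "Make Semis",
    "Make Finals",
    "Win World Cup",
]

# Replace only the first space with a line break; later spaces are kept.
titles = [col.replace(" ", "\n", 1) for col in knockout_stage_cols]

for title in titles:
    print(repr(title))
```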

2.5 Bundesliga standings table

step1 Prepare the data

from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from plottable import ColDef, Table
from plottable.plots import image
# Download the league data
# !wget https://projects.fivethirtyeight.com/soccer-api/club/spi_matches.csv
# !wget https://projects.fivethirtyeight.com/soccer-api/club/spi_matches_latest.csv
# Data URLs
FIVETHIRTYEIGHT_URLS = {
    "SPI_MATCHES": "https://projects.fivethirtyeight.com/soccer-api/club/spi_matches.csv",
    "SPI_MATCHES_LATEST": "https://projects.fivethirtyeight.com/soccer-api/club/spi_matches_latest.csv",
}

# Read the data
# df = pd.read_csv(FIVETHIRTYEIGHT_URLS["SPI_MATCHES_LATEST"])
df = pd.read_csv("data/spi_matches_latest.csv")
df.head()
   season  date        league_id  league                team1                 team2              spi1   spi2   prob1   prob2   ...  importance1  importance2  score1  score2  xg1   xg2   nsxg1  nsxg2  adj_score1  adj_score2
0  2019    2019-03-01  1979       Chinese Super League  Shandong Luneng       Guizhou Renhe      48.22  37.83  0.5755  0.1740  ...  45.9         22.1         1.0     0.0     1.39  0.26  2.05   0.54   1.05        0.00
1  2019    2019-03-01  1979       Chinese Super League  Shanghai Greenland    Shanghai SIPG      39.81  60.08  0.2387  0.5203  ...  25.6         63.4         0.0     4.0     0.57  2.76  0.80   1.50   0.00        3.26
2  2019    2019-03-01  1979       Chinese Super League  Guangzhou Evergrande  Tianjin Quanujian  65.59  39.99  0.7832  0.0673  ...  77.1         28.8         3.0     0.0     0.49  0.45  1.05   0.75   3.15        0.00
3  2019    2019-03-01  1979       Chinese Super League  Wuhan Zall            Beijing Guoan      32.25  54.82  0.2276  0.5226  ...  35.8         58.9         0.0     1.0     1.12  0.97  1.51   0.94   0.00        1.05
4  2019    2019-03-01  1979       Chinese Super League  Chongqing Lifan       Guangzhou RF       38.24  40.45  0.4403  0.2932  ...  26.2         21.3         2.0     2.0     2.77  3.17  1.05   2.08   2.10        2.10

5 rows × 23 columns

# Keep only German Bundesliga matches and drop rows with missing values
bl = df.loc[df.league == "German Bundesliga"].dropna()
bl.head()
     season  date        league_id  league             team1                     team2           spi1   spi2   prob1   prob2   ...  importance1  importance2  score1  score2  xg1   xg2   nsxg1  nsxg2  adj_score1  adj_score2
497  2022    2022-08-05  1845       German Bundesliga  Eintracht Frankfurt       Bayern Munich   68.47  91.75  0.1350  0.6796  ...  32.6         71.9         1.0     6.0     0.83  4.50  0.65   2.72   1.05        5.96
514  2022    2022-08-06  1845       German Bundesliga  VfL Bochum                Mainz           60.73  68.88  0.3568  0.3629  ...  33.5         34.5         1.0     2.0     1.00  1.62  0.96   0.86   1.05        2.10
515  2022    2022-08-06  1845       German Bundesliga  Borussia Monchengladbach  TSG Hoffenheim  69.38  66.77  0.4872  0.2742  ...  40.2         33.3         3.0     1.0     1.86  0.10  2.51   0.31   2.36        1.05
516  2022    2022-08-06  1845       German Bundesliga  VfL Wolfsburg             Werder Bremen   68.18  59.82  0.5319  0.2014  ...  30.2         33.3         2.0     2.0     0.81  0.97  1.07   1.25   2.10        2.10
517  2022    2022-08-06  1845       German Bundesliga  1. FC Union Berlin        Hertha Berlin   69.98  59.70  0.5479  0.1860  ...  34.9         33.0         3.0     1.0     1.25  0.40  1.66   0.36   3.15        1.05

5 rows × 23 columns

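step2 Process the data

# Compute the points earned in each match
def add_points(df: pd.DataFrame) -> pd.DataFrame:
    # Nested np.where works like a chained conditional:
    # 3 points if the team scored more goals than its opponent (win),
    # 1 point if score1 == score2 (draw),
    # otherwise 0 points (loss)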
    df["pts_home"] = np.where(
        df["score1"] > df["score2"], 3, np.where(df["score1"] == df["score2"], 1, 0)
    )
    df["pts_away"] = np.where(
        df["score1"] < df["score2"], 3, np.where(df["score1"] == df["score2"], 1, 0)
    )
    
    return df

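# Add the points columns
bl = add_points(bl)
# Total points, goals for, goals against, expected goals for, and expected goals against

# The code below aggregates the home (team1) and away (team2) figures separately, then adds the two results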
perform = (
    bl.groupby("team1")[[
        "pts_home",
        "score1",
        "score2",
        "xg1",
        "xg2",
    ]]
    .sum()
    .set_axis(
        [
            "pts",
            "gf",
            "ga",
            "xgf",
            "xga",
        ],
        axis=1,
    )
    .add(
        bl.groupby("team2")[[
            "pts_away",
            "score2",
            "score1",
            "xg2",
            "xg1",
        ]]
        .sum()
        .set_axis(
            [
                "pts",
                "gf",
                "ga",
                "xgf",
                "xga",
            ],
            axis=1,
        )
    )
)

# If the chained version above fails on your Python or pandas version, use the step-by-step version below
# t1 = bl.groupby("team1")[["pts_home", "score1", "score2", "xg1", "xg2"]].sum()
# t1 = t1.set_axis(["pts", "gf", "ga", "xgf", "xga"], axis=1)
# t2 = bl.groupby("team2")[["pts_away", "score2", "score1", "xg2", "xg1"]].sum()
# t2 = t2.set_axis(["pts", "gf", "ga", "xgf", "xga"], axis=1)
# perform = t1.add(t2)

perform.shape
(18, 5)
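To make the home-plus-away aggregation easier to follow, here is the same pattern on a tiny made-up fixture list with three hypothetical teams. Each match row counts once for the home side (team1) and once for the away side (team2); note that the away aggregation swaps score2 and score1 so that both frames mean "goals for, goals against" before they are added.

```python
import pandas as pd

# Made-up results for three hypothetical teams A, B, C.
matches = pd.DataFrame(
    {
        "team1": ["A", "B", "C"],
        "team2": ["B", "C", "A"],
        "score1": [2, 1, 0],
        "score2": [0, 1, 3],
    }
)

# Home view: goals for = score1, goals against = score2.
home = matches.groupby("team1")[["score1", "score2"]].sum().set_axis(["gf", "ga"], axis=1)
# Away view: the columns are swapped, goals for = score2, goals against = score1.
away = matches.groupby("team2")[["score2", "score1"]].sum().set_axis(["gf", "ga"], axis=1)

# add() aligns the two frames on the team name before summing.
totals = home.add(away)
print(totals)
```

If a team appeared only at home or only away, plain add would produce NaN for it; passing fill_value=0 to add avoids that. In the Bundesliga data every team plays both home and away matches, so the plain version is sufficient there.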
# Assemble the performance data
perform.index.name = "team"

perform["gd"] = perform["gf"] - perform["ga"]

perform = perform[
    [
        "pts",
        "gd",
        "gf",
        "ga",
        "xgf",
        "xga",
    ]
]

perform["games"] = bl.groupby("team1").size().add(bl.groupby("team2").size())
perform.head()
                          pts    gd    gf    ga    xgf    xga  games
team
1. FC Union Berlin         62  13.0  51.0  38.0  35.93  43.06     34
Bayer Leverkusen           50   8.0  57.0  49.0  53.62  48.20     34
Bayern Munich              71  54.0  92.0  38.0  84.93  40.12     34
Borussia Dortmund          71  39.0  83.0  44.0  75.96  42.69     34
Borussia Monchengladbach   43  -3.0  52.0  55.0  53.05  58.88     34
# Count wins, draws, and losses for each team
def get_wins_draws_losses(games: pd.DataFrame) -> pd.DataFrame:
    return (
        games.rename({"pts_home": "pts", "team1": "team"}, axis=1)
        .groupby("team")["pts"]
        .value_counts()
        .add(
            games.rename({"pts_away": "pts", "team2": "team"}, axis=1)
            .groupby("team")["pts"]
            .value_counts(),
            fill_value=0,
        )
        .astype(int)
        .rename("count")
        .reset_index(level=1)
        .pivot(columns="pts", values="count")
        .rename({0: "L", 1: "D", 3: "W"}, axis=1)[["W", "D", "L"]]
    )

wins_draws_losses = get_wins_draws_losses(bl)
wins_draws_losses.head()
pts                        W   D   L
team
1. FC Union Berlin        18   8   8
Bayer Leverkusen          14   8  12
Bayern Munich             21   8   5
Borussia Dortmund         22   5   7
Borussia Monchengladbach  11  10  13
# Merge the points data with the win/draw/loss data
perform = pd.concat([perform, wins_draws_losses], axis=1)
perform.head()
                          pts    gd    gf    ga    xgf    xga  games   W   D   L
team
1. FC Union Berlin         62  13.0  51.0  38.0  35.93  43.06     34  18   8   8
Bayer Leverkusen           50   8.0  57.0  49.0  53.62  48.20     34  14   8  12
Bayern Munich              71  54.0  92.0  38.0  84.93  40.12     34  21   8   5
Borussia Dortmund          71  39.0  83.0  44.0  75.96  42.69     34  22   5   7
Borussia Monchengladbach   43  -3.0  52.0  55.0  53.05  58.88     34  11  10  13
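A quick way to check that the two aggregations agree is the scoring rule used in add_points: 3 points for a win and 1 for a draw, so pts should equal 3 * W + D, and W + D + L should equal the number of games. The sketch below applies this to the five rows shown above; the values are copied from the output, and the helper league_points is hypothetical.

```python
# (W, D, L, pts, games) copied from the merged table above.
rows = {
    "1. FC Union Berlin": (18, 8, 8, 62, 34),
    "Bayer Leverkusen": (14, 8, 12, 50, 34),
    "Bayern Munich": (21, 8, 5, 71, 34),
    "Borussia Dortmund": (22, 5, 7, 71, 34),
    "Borussia Monchengladbach": (11, 10, 13, 43, 34),
}


def league_points(wins, draws):
    """Hypothetical helper: 3 points per win, 1 per draw, 0 per loss."""
    return 3 * wins + draws


for team, (w, d, l, pts, games) in rows.items():
    consistent = league_points(w, d) == pts and w + d + l == games
    print(f"{team}: {'ok' if consistent else 'mismatch'}")
```

On the full DataFrame, the same check could be written as (3 * perform["W"] + perform["D"] == perform["pts"]).all().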

step3 Map the crest images

The crest images are available at: plottable-example_notebooks

# Build a lookup from team name to crest image path
club_logo_path = Path("data/bundesliga_crests_22_23")
club_logo_files = list(club_logo_path.glob("*.png"))
club_logos_paths = {f.stem: f for f in club_logo_files}
perform = perform.reset_index()

# Add the crest column
perform.insert(0, "crest", perform["team"])
perform["crest"] = perform["crest"].replace(club_logos_paths)

# Sort the data
perform = perform.sort_values(by=["pts", "gd", "gf"], ascending=False)

for colname in ["gd", "gf", "ga"]:
    perform[colname] = perform[colname].astype("int32")

perform["goal_difference"] = perform["gf"].astype(str) + ":" + perform["ga"].astype(str)

# Add the rank
perform["rank"] = list(range(1, 19))

perform.head()
    crest                                              team                pts  gd  gf  ga  xgf    xga    games  W   D  L  goal_difference  rank
2   data/bundesliga_crests_22_23/Bayern Munich.png     Bayern Munich       71   54  92  38  84.93  40.12  34     21  8  5  92:38            1
3   data/bundesliga_crests_22_23/Borussia Dortmund...  Borussia Dortmund   71   39  83  44  75.96  42.69  34     22  5  7  83:44            2
10  data/bundesliga_crests_22_23/RB Leipzig.png        RB Leipzig          66   23  64  41  67.01  37.48  34     20  6  8  64:41            3
0   data/bundesliga_crests_22_23/1. FC Union Berli...  1. FC Union Berlin  62   13  51  38  35.93  43.06  34     18  8  8  51:38            4
11  data/bundesliga_crests_22_23/SC Freiburg.png       SC Freiburg         59   7   51  44  53.11  52.25  34     17  8  9  51:44            5

step4 Configure the plot data

# Set the colors
row_colors = {
    "top4": "#2d3636",
    "top6": "#516362",
    "playoffs": "#8d9386",
    "relegation": "#c8ab8d",
    "even": "#627979",
    "odd": "#68817e",
}

bg_color = row_colors["odd"]
text_color = "#e0e8df"
# Choose the columns to plot
table_cols = ["crest", "team", "games", "W", "D", "L", "goal_difference", "gd", "pts"]
perform[table_cols].head()
    crest                                              team                games  W   D  L  goal_difference  gd  pts
2   data/bundesliga_crests_22_23/Bayern Munich.png     Bayern Munich       34     21  8  5  92:38            54  71
3   data/bundesliga_crests_22_23/Borussia Dortmund...  Borussia Dortmund   34     22  5  7  83:44            39  71
10  data/bundesliga_crests_22_23/RB Leipzig.png        RB Leipzig          34     20  6  8  64:41            23  66
0   data/bundesliga_crests_22_23/1. FC Union Berli...  1. FC Union Berlin  34     18  8  8  51:38            13  62
11  data/bundesliga_crests_22_23/SC Freiburg.png       SC Freiburg         34     17  8  9  51:44            7   59
# Define the plot style of each column
table_col_defs = [
    ColDef("rank", width=0.5, title=""),
    ColDef("crest", width=0.35, plot_fn=image, title=""),
    ColDef("team", width=2.5, title="", textprops={"ha": "left"}),
    ColDef("games", width=0.5, title="Games"),
    ColDef("W", width=0.5),
    ColDef("D", width=0.5),
    ColDef("L", width=0.5),
    ColDef("goal_difference", title="Goals"),
    ColDef("gd", width=0.5, title="", formatter="{:+}"),
    ColDef("pts", border="left", title="Points"),
]

step5 Plot

fig, ax = plt.subplots(figsize=(14, 12))

plt.rcParams["text.color"] = text_color
# Set the plot font
# plt.rcParams["font.family"] = "Roboto"

# Set the background color
fig.set_facecolor(bg_color)
ax.set_facecolor(bg_color)

table = Table(
    perform,
    column_definitions=table_col_defs,
    row_dividers=True,
    col_label_divider=False,
    footer_divider=True,
    index_col="rank",
    columns=table_cols,
    even_row_color=row_colors["even"],
    footer_divider_kw={"color": bg_color, "lw": 2},
    row_divider_kw={"color": bg_color, "lw": 2},
    column_border_kw={"color": bg_color, "lw": 2},
    # To set the font, add "fontname": "Roboto" here
    textprops={"fontsize": 16, "ha": "center"},
)


# Set the colors of the different row ranges
for idx in [0, 1, 2, 3]:
    table.rows[idx].set_facecolor(row_colors["top4"])
    
for idx in [4, 5]:
    table.rows[idx].set_facecolor(row_colors["top6"])
    
table.rows[15].set_facecolor(row_colors["playoffs"])

for idx in [16, 17]:
    table.rows[idx].set_facecolor(row_colors["relegation"])
    table.rows[idx].set_fontcolor(row_colors["top4"])


fig.savefig(
    "bohndesliga_table_recreation.png",
    facecolor=fig.get_facecolor(),
    bbox_inches='tight',
    dpi=300,
)

3 References

  • plottable
  • plottable-doc
  • plottable-example_notebooks
