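🎯 Key points
- Evaluation metrics for optimizing loss functions
- Evaluating coastline detection algorithms
- Remote-sensing visual representations and text augmentation
- Breast cancer prediction model algorithms
- Separating scintillation light from Cherenkov light in liquids
- Performance evaluation for multi-label classification tasks
- Binary classification evaluation for directed acyclic graphs, multi-path labeling, and non-mandatory leaf-node prediction
- Evaluating the trustworthiness of feature attributions
- Matthews correlation coefficient compared with other accuracy measures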
Sankey Diagram Confusion Matrix in Python
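A Sankey diagram is a data visualization technique, a kind of flow diagram, that emphasizes flow, movement, or change from one state to another or from one point in time to another. The width of each arrow is proportional to the flow rate of the extensive property being depicted. Sankey diagrams can also visualize energy accounts, material flow accounts at a regional or national level, and cost breakdowns, and they are commonly used in material flow analysis. They emphasize the major transfers or flows within a system, which helps identify the largest contributions to a flow, and they typically show conserved quantities within defined system boundaries.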
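To make the "width proportional to flow" idea concrete for a confusion matrix, here is a minimal sketch in pure Python with no plotting. It treats each matrix cell as one flow from a "True" node to a "Predicted" node and checks that the total is conserved. The helper name `confusion_matrix_to_flows` and the sample counts are assumptions chosen for illustration only.

```python
def confusion_matrix_to_flows(matrix, class_labels):
    """Return one (source, target, value) flow per confusion-matrix cell.

    Hypothetical helper: rows are actual classes, columns are predicted classes.
    """
    flows = []
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            flows.append((f"True {class_labels[i]}",
                          f"Predicted {class_labels[j]}",
                          count))
    return flows


# Illustrative 2x2 confusion matrix (assumed counts).
matrix = [[10, 4],
          [2, 20]]
flows = confusion_matrix_to_flows(matrix, ["Fraud", "Legit"])

# Total leaving each "True" node and total arriving at each "Predicted" node.
outflow, inflow = {}, {}
for source, target, value in flows:
    outflow[source] = outflow.get(source, 0) + value
    inflow[target] = inflow.get(target, 0) + value

# Conservation: everything that leaves the left side arrives on the right side.
assert sum(outflow.values()) == sum(inflow.values())
print(outflow)
print(inflow)
```

In a Sankey diagram each of these flows becomes one link, and its `value` determines how wide the link is drawn.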
Sankey diagrams and confusion matrices in Python
import pandas as pd
import numpy as np
from plotly import graph_objects as go
# Semi-transparent link colours: red for misclassified flows, green for correct ones.
RED = "rgba(245,173,168,0.6)"
GREEN = "rgba(211,255,216,0.6)"
def create_df_from_confusion_matrix(confusion_matrix, class_labels=None):
    """Convert a square confusion matrix into a long-format table of Sankey links."""
    # Fall back to generic class names when no labels are given.
    if class_labels is None or len(class_labels) == 0:
        df = pd.DataFrame(data=confusion_matrix,
                          index=[f"True Class-{i+1}" for i in range(confusion_matrix.shape[0])],
                          columns=[f"Predicted Class-{i+1}" for i in range(confusion_matrix.shape[0])])
    else:
        df = pd.DataFrame(data=confusion_matrix,
                          index=[f"True {i}" for i in class_labels],
                          columns=[f"Predicted {i}" for i in class_labels])

    # One row per (actual, predicted) pair with its instance count.
    df = df.stack().reset_index()
    df.rename(columns={0: 'instances', 'level_0': 'actual', 'level_1': 'predicted'}, inplace=True)

    # Green when the class name matches on both sides (correct), red otherwise.
    df["colour"] = df.apply(lambda x:
                            GREEN if x.actual.split()[1:] == x.predicted.split()[1:]
                            else RED, axis=1)

    # Replace node labels with integer indices for the Sankey source/target arrays.
    node_labels = pd.concat([df.actual, df.predicted]).unique()
    node_labels_indices = {label: index for index, label in enumerate(node_labels)}
    df = df.assign(actual=df.actual.apply(lambda x: node_labels_indices[x]),
                   predicted=df.predicted.apply(lambda x: node_labels_indices[x]))

    def get_link_text(row):
        """Build the hover text for a single link."""
        instance_count = row["instances"]
        source_class = ' '.join(node_labels[row['actual']].split()[1:])
        target_class = ' '.join(node_labels[row['predicted']].split()[1:])
        outcome = "correctly" if row["colour"] == GREEN else "incorrectly"
        return f"{instance_count} {source_class} instances {outcome} classified as {target_class}"

    df["link_text"] = df.apply(get_link_text, axis=1)
    return df, node_labels
Plotting a Sankey diagram from a confusion matrix and class labels
def plot_confusion_matrix_as_sankey(confusion_matrix, class_labels=None):
    """Return a plotly Figure showing the confusion matrix as a Sankey diagram."""
    df, labels = create_df_from_confusion_matrix(confusion_matrix, class_labels)

    fig = go.Figure(data=[go.Sankey(
        # Nodes: one per "True ..." and "Predicted ..." label.
        node=dict(
            pad=20,
            thickness=20,
            line=dict(color="gray", width=1.0),
            label=labels,
            hovertemplate="%{label} has total %{value:d} instances<extra></extra>"
        ),
        # Links: one per confusion-matrix cell, with width given by the instance count.
        link=dict(
            source=df.actual,
            target=df.predicted,
            value=df.instances,
            color=df.colour,
            customdata=df['link_text'],
            hovertemplate="%{customdata}<extra></extra>"
        ))])

    fig.update_layout(title_text="Confusion Matrix Sankey Diagram", font_size=15,
                      width=500, height=400)
    return fig
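# Example: rows are actual classes, columns are predicted classes.
confusion_matrix = np.array([[10, 4],
                             [2, 20]])
fig = plot_confusion_matrix_as_sankey(confusion_matrix, ['Fraud', 'Legit'])
fig.show()  # in a notebook, leaving `fig` as the last expression also renders it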
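The key points mention comparing the Matthews correlation coefficient (MCC) with other accuracy measures. As a rough sketch, the same 2x2 example matrix can be reduced to both plain accuracy and MCC using only the standard library. The function name `accuracy_and_mcc` is hypothetical, and treating 'Fraud' as the positive class is an assumption made for illustration.

```python
import math


def accuracy_and_mcc(tp, fn, fp, tn):
    """Return (accuracy, MCC) for the four cells of a binary confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column sum is zero; returning 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


# Cells of the example matrix [[10, 4], [2, 20]] (rows = actual, columns = predicted),
# assuming 'Fraud' is the positive class.
tp, fn = 10, 4
fp, tn = 2, 20
accuracy, mcc = accuracy_and_mcc(tp, fn, fp, tn)
print(f"accuracy={accuracy:.3f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells in both its numerator and denominator, so it tends to be less flattering than accuracy when the classes are imbalanced.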