Machine Learning / Data Analysis: A Plain-Language Introduction to Decision Trees (with Classification and Regression Examples)

news2024/11/16 11:50:41
  • 🍨 This post is a study-log entry from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
  • 🍖 Original author: K同学啊

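Preface

  • Machine learning underpins both deep learning and data analysis; upcoming posts will cover the common machine learning algorithms
  • Note: machine learning is also widely used in mathematical modeling competitions, so it is worth studying alongside them
  • The mathematics behind decision trees is fairly involved. Reading a textbook is strongly recommended: 《统计学习方法》 (Statistical Learning Methods) and the 《机器学习》 "watermelon book" (Machine Learning)
  • This post only introduces the components of a decision tree and does not cover the theory in detail; a dedicated in-depth post will follow
  • The semester has just started, so updates are a bit slow; thanks for your patience. Bookmarks, likes, and follows are welcome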

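Contents

  • Decision tree model
    • Overview
    • Methods for building a decision tree
  • Classification example
    • Importing and exploring the data
    • Separating features and target
    • Training the model
    • Prediction results
  • Regression example
    • Importing the data
    • Splitting the data
    • Creating the model
    • Model prediction and training
    • Model evaluation
    • Plotting the tree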

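Decision tree model

Overview

Definition (from 《统计学习方法》): a classification decision tree is a tree-structured model that describes how instances are classified. It consists of nodes and directed edges. There are two kinds of nodes: internal nodes, each representing a feature or attribute, and leaf nodes, each representing a class.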

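Decision trees and if-then rules

Anyone who has learned a programming language knows the if-else construct, and a decision tree works the same way: if a sample satisfies a condition it goes down one branch, otherwise it goes down the other. The tests repeat until every sample has been assigned to a class, and the result is a tree. Note one principle: the paths are mutually exclusive and complete, meaning every instance is covered by exactly one path from the root to a leaf.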
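To make the if-then reading concrete, here is a minimal hand-written sketch. The two thresholds (0.80 and 1.75 on petal width) are copied from the top levels of the tree printed in the classification example later in this post; the function name `classify_iris` and the reduction to just two tests are assumptions made for illustration, not the fitted model.

```python
def classify_iris(petal_width: float) -> str:
    """Two-level rule set: every input follows exactly one path to a leaf."""
    if petal_width <= 0.80:
        return "Iris-setosa"
    if petal_width <= 1.75:
        return "Iris-versicolor"
    return "Iris-virginica"


for width in (0.2, 1.2, 2.3):
    print(width, "->", classify_iris(width))
```

Because the three conditions do not overlap and together cover every possible value, the rules are mutually exclusive and complete.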

The decision tree process

It has three stages: feature selection, tree construction, and pruning.
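The three stages can be sketched with scikit-learn as follows. This is an illustration under assumptions: it loads iris from `sklearn.datasets` rather than from the URL used later, it uses cost-complexity pruning (the pruning method scikit-learn provides), and the choice of the middle `ccp_alpha` is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Stages 1 and 2: feature selection and tree growth both happen inside fit();
# `criterion` selects the measure used to choose each split.
full_tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

# Stage 3: pruning. The pruning path lists candidate alpha values;
# a larger alpha removes more of the tree.
path = full_tree.cost_complexity_pruning_path(X, y)
alpha = max(0.0, path.ccp_alphas[len(path.ccp_alphas) // 2])  # arbitrary middle value
pruned_tree = DecisionTreeClassifier(
    criterion="gini", random_state=0, ccp_alpha=alpha
).fit(X, y)

print("leaves before pruning:", full_tree.get_n_leaves())
print("leaves after pruning: ", pruned_tree.get_n_leaves())
```

In practice the alpha would be chosen by cross-validation rather than picked from the middle of the path.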

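Problems decision trees solve

Decision trees handle both classification and regression. If the leaf nodes hold class labels, the tree is a classification tree; if they hold continuous values, it is a regression tree.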

Methods for building a decision tree

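There is a lot of mathematics behind decision trees. This post covers only information gain, the information gain ratio, and the Gini index; for the other concepts, see 《统计学习方法》 and the watermelon book. If you would like electronic copies of the material, send me a private message.

Advice: do read the books for this part. Both 《统计学习方法》 and the watermelon book contain detailed worked examples that make the ideas much easier to follow.

The following definitions are all taken from 《统计学习方法》.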

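Information gain: the information gain g(D, A) of feature A with respect to training set D is defined as the difference between the empirical entropy H(D) of D and the empirical conditional entropy H(D|A) of D given A, that is: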

$$g(D,A)=H(D)-H(D|A)$$
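The definition can be checked numerically with a short sketch. The helper names `entropy` and `information_gain` and the toy weather data are made up for illustration; the feature is assumed to be categorical, and entropy is measured in bits (log base 2).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Empirical entropy H(D) in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A) for a categorical feature A."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # H(D|A): entropy of each subset D_i, weighted by |D_i| / |D|
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rain", "rain"]   # separates the classes perfectly
windy = ["a", "b", "a", "b"]                   # carries no information about the class

print(information_gain(outlook, play))
print(information_gain(windy, play))
```

The first feature removes all uncertainty about the label, so its gain equals H(D); the second leaves the uncertainty unchanged, so its gain is zero.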

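Information gain ratio: the information gain ratio g_R(D, A) of feature A with respect to training set D is defined as the ratio of its information gain g(D, A) to the entropy H_A(D) of D with respect to the values of feature A, that is: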

$$g_{R}(D,A)=\frac{g(D,A)}{H_{A}(D)}$$

where $H_{A}(D)=-\sum_{i=1}^{n}\frac{|D_{i}|}{|D|}\log_{2}\frac{|D_{i}|}{|D|}$, n is the number of distinct values taken by feature A, and D_i is the subset of D on which A takes its i-th value.
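A small sketch shows why the ratio is useful: a feature with many distinct values gets a large H_A(D) in the denominator and is penalized accordingly. The function names and toy data are illustrative assumptions, and returning 0.0 when H_A(D) is zero is a convention chosen here, not part of the definition.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical entropy in bits of any list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """g_R(D, A) = g(D, A) / H_A(D) for a categorical feature A."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)  # H_A(D): entropy of the feature's own values
    return gain / split_info if split_info > 0 else 0.0  # assumed convention for H_A(D) = 0


labels = ["no", "no", "yes", "yes"]
binary_feature = ["x", "x", "y", "y"]   # 2 distinct values
id_like_feature = ["a", "b", "c", "d"]  # 4 distinct values, one per sample

print(gain_ratio(binary_feature, labels))
print(gain_ratio(id_like_feature, labels))
```

Both features have the same information gain (1 bit), but the ID-like feature has H_A(D) = 2 bits, so its gain ratio is half that of the binary feature.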

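Gini index: in a classification problem with K classes, where a sample belongs to class k with probability p_k, the Gini index of the probability distribution is defined as: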

$$Gini(p)=\sum_{k=1}^{K}p_{k}(1-p_{k})=1-\sum_{k=1}^{K}p_{k}^{2}$$
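As a quick numerical check of the formula, the sketch below estimates p_k from class frequencies. The function name `gini` and the sample label lists are assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini index 1 - sum_k p_k^2, with p_k estimated from class frequencies."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


balanced = ["a"] * 50 + ["b"] * 50 + ["c"] * 50   # three equally frequent classes
pure = ["a"] * 10                                 # a single class

print(gini(balanced))  # 1 - 3 * (1/3)^2 = 2/3
print(gini(pure))      # a pure node has Gini index 0
```

A smaller Gini index means a purer node, which is why a tree prefers splits that lower it.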

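Advice: read the books and work through the examples alongside the formulas.

Classification example

Overview: using the sepal and petal measurements of iris flowers, build a tree that predicts the species.

Importing and exploring the data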

import numpy as np 
import pandas as pd 

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data" 
# Column labels: 花萼 = sepal, 花瓣 = petal
columns = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width', 'class']

data = pd.read_csv(url, names=columns)
data
     花萼-length  花萼-width  花瓣-length  花瓣-width           class
0          5.1       3.5        1.4       0.2     Iris-setosa
1          4.9       3.0        1.4       0.2     Iris-setosa
2          4.7       3.2        1.3       0.2     Iris-setosa
3          4.6       3.1        1.5       0.2     Iris-setosa
4          5.0       3.6        1.4       0.2     Iris-setosa
..         ...       ...        ...       ...             ...
145        6.7       3.0        5.2       2.3  Iris-virginica
146        6.3       2.5        5.0       1.9  Iris-virginica
147        6.5       3.0        5.2       2.0  Iris-virginica
148        6.2       3.4        5.4       2.3  Iris-virginica
149        5.9       3.0        5.1       1.8  Iris-virginica

150 rows × 5 columns

# Count the samples in each class
data['class'].value_counts()

Result:

class
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: count, dtype: int64
# Inspect column types and non-null counts
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   花萼-length  150 non-null    float64
 1   花萼-width   150 non-null    float64
 2   花瓣-length  150 non-null    float64
 3   花瓣-width   150 non-null    float64
 4   class      150 non-null    object 
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
# Check for missing values
data.isnull().sum()

Result:

花萼-length    0
花萼-width     0
花瓣-length    0
花瓣-width     0
class        0
dtype: int64
# Summary statistics of the features
data.describe()

Result:

        花萼-length    花萼-width   花瓣-length    花瓣-width
count  150.000000  150.000000  150.000000  150.000000
mean     5.843333    3.054000    3.758667    1.198667
std      0.828066    0.433594    1.764420    0.763161
min      4.300000    2.000000    1.000000    0.100000
25%      5.100000    2.800000    1.600000    0.300000
50%      5.800000    3.000000    4.350000    1.300000
75%      6.400000    3.300000    5.100000    1.800000
max      7.900000    4.400000    6.900000    2.500000
# Check the correlations between the features
name_corr = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width']
corr = data[name_corr].corr()
print(corr)
           花萼-length  花萼-width  花瓣-length  花瓣-width
花萼-length   1.000000 -0.109369   0.871754  0.817954
花萼-width   -0.109369  1.000000  -0.420516 -0.356544
花瓣-length   0.871754 -0.420516   1.000000  0.962757
花瓣-width    0.817954 -0.356544   0.962757  1.000000

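Note: some features are strongly correlated (collinear); for example, petal length and petal width have a correlation of 0.96.

Separating features and target

X = data.iloc[:, [0, 1, 2, 3]].values   # .values converts to a NumPy array
y = data.iloc[:, 4].values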

Training the model

from sklearn import tree

model = tree.DecisionTreeClassifier()
model.fit(X, y)   # fit the model
# Export the tree structure as text
r = tree.export_text(model)

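Prediction results

# Pick a few samples by fixed index (these rows are part of the training data)
x_test = X[[0, 30, 60, 90, 120, 130], :]
y_pred_prob = model.predict_proba(x_test)   # predicted class probabilities
y_pred = model.predict(x_test)     # predicted class labels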
print("\n===模型===")
print(r)
===模型===
|--- feature_3 <= 0.80
|   |--- class: Iris-setosa
|--- feature_3 >  0.80
|   |--- feature_3 <= 1.75
|   |   |--- feature_2 <= 4.95
|   |   |   |--- feature_3 <= 1.65
|   |   |   |   |--- class: Iris-versicolor
|   |   |   |--- feature_3 >  1.65
|   |   |   |   |--- class: Iris-virginica
|   |   |--- feature_2 >  4.95
|   |   |   |--- feature_3 <= 1.55
|   |   |   |   |--- class: Iris-virginica
|   |   |   |--- feature_3 >  1.55
|   |   |   |   |--- feature_2 <= 5.45
|   |   |   |   |   |--- class: Iris-versicolor
|   |   |   |   |--- feature_2 >  5.45
|   |   |   |   |   |--- class: Iris-virginica
|   |--- feature_3 >  1.75
|   |   |--- feature_2 <= 4.85
|   |   |   |--- feature_0 <= 5.95
|   |   |   |   |--- class: Iris-versicolor
|   |   |   |--- feature_0 >  5.95
|   |   |   |   |--- class: Iris-virginica
|   |   |--- feature_2 >  4.85
|   |   |   |--- class: Iris-virginica
print("\n===测试数据===")
print(x_test)
===测试数据===
[[5.1 3.5 1.4 0.2]
 [4.8 3.1 1.6 0.2]
 [5.  2.  3.5 1. ]
 [5.5 2.6 4.4 1.2]
 [6.9 3.2 5.7 2.3]
 [7.4 2.8 6.1 1.9]]
print("\n===预测所属类别概率===")
print(y_pred_prob)
===预测所属类别概率===
[[1. 0. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 1.]]
print("\n===测试所属类别==")
print(y_pred)
===测试所属类别==
['Iris-setosa' 'Iris-setosa' 'Iris-versicolor' 'Iris-versicolor'
 'Iris-virginica' 'Iris-virginica']
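The six samples above were also used for training, so probabilities of exactly 1.0 are expected and say little about how the model handles new data. A minimal sketch of a held-out evaluation follows. It is a variation, not the workflow used in this post: it assumes the copy of iris bundled with scikit-learn (`load_iris`), and the 30% test size and `random_state` values are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows, keeping the class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)   # accuracy on data the tree has seen
test_acc = clf.score(X_test, y_test)      # accuracy on unseen data
print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

A gap between the two numbers is a sign of overfitting, which is what pruning or a depth limit is meant to reduce.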

Regression example

Predict petal width from the other three iris measurements (sepal length, sepal width, petal length).

Importing the data

import pandas as pd
import numpy as np

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# Labels follow the file's column order: sepal length, sepal width, petal length, petal width
# (花萼 = sepal, 花瓣 = petal)
names = ['花萼-length', '花萼-width', '花瓣-length', '花瓣-width', 'class']

data = pd.read_csv(url, names=names)
data
     花萼-length  花萼-width  花瓣-length  花瓣-width           class
0          5.1       3.5        1.4       0.2     Iris-setosa
1          4.9       3.0        1.4       0.2     Iris-setosa
2          4.7       3.2        1.3       0.2     Iris-setosa
3          4.6       3.1        1.5       0.2     Iris-setosa
4          5.0       3.6        1.4       0.2     Iris-setosa
..         ...       ...        ...       ...             ...
145        6.7       3.0        5.2       2.3  Iris-virginica
146        6.3       2.5        5.0       1.9  Iris-virginica
147        6.5       3.0        5.2       2.0  Iris-virginica
148        6.2       3.4        5.4       2.3  Iris-virginica
149        5.9       3.0        5.1       1.8  Iris-virginica

150 rows × 5 columns

Splitting the data

# Features: the first three columns; target: the fourth column
X = data.iloc[:, [0, 1, 2]]
y = data.iloc[:, 3]

Creating the model

from sklearn import tree

model = tree.DecisionTreeRegressor()
model.fit(X, y)   # fit the model

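Model prediction and training

# Pick a few rows by fixed index (these rows are part of the training data)
x_test = X.iloc[[0, 1, 50, 51, 100, 120], :]
y_test = y.iloc[[0, 1, 50, 51, 100, 120]]   # a single column (a Series)

y_pred = model.predict(x_test)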

Model evaluation

# Compare the actual and predicted values
df = pd.DataFrame()
df['原始值'] = y_test    # actual value
df['预测值'] = y_pred    # predicted value

df
     原始值   预测值
0    0.2  0.25
1    0.2  0.20
50   1.4  1.40
51   1.5  1.50
100  2.5  2.50
120  2.3  2.30
from sklearn.metrics import mean_absolute_error
# Mean absolute error (MAE) between the actual and predicted values
mae = mean_absolute_error(y_test, y_pred)
mae

Result:

0.008333333333333331
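The reported error can be reproduced by hand from the six rows of the comparison table above. This sketch simply retypes those values into plain lists (the names `actual` and `predicted` are illustrative) and applies the MAE definition, the mean of the absolute differences.

```python
# Values copied from the actual vs. predicted table above
actual = [0.2, 0.2, 1.4, 1.5, 2.5, 2.3]
predicted = [0.25, 0.20, 1.40, 1.50, 2.50, 2.30]

# MAE = (1/n) * sum(|actual_i - predicted_i|)
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
print(mae)  # roughly 0.00833, i.e. 0.05 / 6
```

Only the first row contributes to the error. A likely explanation, not verified here, is that another training row has the same three feature values but a different target, so the leaf predicts their average. The error is this small because the evaluated rows come from the training data.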
# Print the tree structure
r = tree.export_text(model)
print(r)
# The tree is fairly complex; the plotting code further below renders it as a diagram.
|--- feature_2 <= 2.45
|   |--- feature_1 <= 3.25
|   |   |--- feature_1 <= 2.60
|   |   |   |--- value: [0.30]
|   |   |--- feature_1 >  2.60
|   |   |   |--- feature_0 <= 4.85
|   |   |   |   |--- feature_0 <= 4.35
|   |   |   |   |   |--- value: [0.10]
|   |   |   |   |--- feature_0 >  4.35
|   |   |   |   |   |--- feature_2 <= 1.35
|   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |--- feature_2 >  1.35
|   |   |   |   |   |   |--- feature_1 <= 2.95
|   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |--- feature_1 >  2.95
|   |   |   |   |   |   |   |--- feature_0 <= 4.65
|   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |   |--- feature_0 >  4.65
|   |   |   |   |   |   |   |   |--- feature_1 <= 3.05
|   |   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |   |   |--- feature_1 >  3.05
|   |   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |--- feature_0 >  4.85
|   |   |   |   |--- feature_0 <= 4.95
|   |   |   |   |   |--- feature_1 <= 3.05
|   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |--- feature_1 >  3.05
|   |   |   |   |   |   |--- value: [0.10]
|   |   |   |   |--- feature_0 >  4.95
|   |   |   |   |   |--- value: [0.20]
|   |--- feature_1 >  3.25
|   |   |--- feature_2 <= 1.55
|   |   |   |--- feature_1 <= 4.30
|   |   |   |   |--- feature_1 <= 3.95
|   |   |   |   |   |--- feature_1 <= 3.85
|   |   |   |   |   |   |--- feature_1 <= 3.65
|   |   |   |   |   |   |   |--- feature_0 <= 5.30
|   |   |   |   |   |   |   |   |--- feature_2 <= 1.45
|   |   |   |   |   |   |   |   |   |--- feature_1 <= 3.55
|   |   |   |   |   |   |   |   |   |   |--- feature_2 <= 1.35
|   |   |   |   |   |   |   |   |   |   |   |--- value: [0.30]
|   |   |   |   |   |   |   |   |   |   |--- feature_2 >  1.35
|   |   |   |   |   |   |   |   |   |   |   |--- truncated branch of depth 3
|   |   |   |   |   |   |   |   |   |--- feature_1 >  3.55
|   |   |   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |   |   |--- feature_2 >  1.45
|   |   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |   |--- feature_0 >  5.30
|   |   |   |   |   |   |   |   |--- feature_1 <= 3.45
|   |   |   |   |   |   |   |   |   |--- value: [0.40]
|   |   |   |   |   |   |   |   |--- feature_1 >  3.45
|   |   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |--- feature_1 >  3.65
|   |   |   |   |   |   |   |--- feature_0 <= 5.20
|   |   |   |   |   |   |   |   |--- feature_1 <= 3.75
|   |   |   |   |   |   |   |   |   |--- value: [0.40]
|   |   |   |   |   |   |   |   |--- feature_1 >  3.75
|   |   |   |   |   |   |   |   |   |--- value: [0.30]
|   |   |   |   |   |   |   |--- feature_0 >  5.20
|   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |--- feature_1 >  3.85
|   |   |   |   |   |   |--- value: [0.40]
|   |   |   |   |--- feature_1 >  3.95
|   |   |   |   |   |--- feature_0 <= 5.35
|   |   |   |   |   |   |--- value: [0.10]
|   |   |   |   |   |--- feature_0 >  5.35
|   |   |   |   |   |   |--- value: [0.20]
|   |   |   |--- feature_1 >  4.30
|   |   |   |   |--- value: [0.40]
|   |   |--- feature_2 >  1.55
|   |   |   |--- feature_0 <= 4.90
|   |   |   |   |--- value: [0.20]
|   |   |   |--- feature_0 >  4.90
|   |   |   |   |--- feature_0 <= 5.05
|   |   |   |   |   |--- feature_1 <= 3.45
|   |   |   |   |   |   |--- value: [0.40]
|   |   |   |   |   |--- feature_1 >  3.45
|   |   |   |   |   |   |--- value: [0.60]
|   |   |   |   |--- feature_0 >  5.05
|   |   |   |   |   |--- feature_1 <= 3.35
|   |   |   |   |   |   |--- value: [0.50]
|   |   |   |   |   |--- feature_1 >  3.35
|   |   |   |   |   |   |--- feature_2 <= 1.65
|   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |--- feature_2 >  1.65
|   |   |   |   |   |   |   |--- feature_1 <= 3.60
|   |   |   |   |   |   |   |   |--- value: [0.20]
|   |   |   |   |   |   |   |--- feature_1 >  3.60
|   |   |   |   |   |   |   |   |--- feature_0 <= 5.55
|   |   |   |   |   |   |   |   |   |--- value: [0.40]
|   |   |   |   |   |   |   |   |--- feature_0 >  5.55
|   |   |   |   |   |   |   |   |   |--- value: [0.30]
|--- feature_2 >  2.45
|   |--- feature_2 <= 4.75
|   |   |--- feature_2 <= 4.15
|   |   |   |--- feature_1 <= 2.65
|   |   |   |   |--- feature_2 <= 3.95
|   |   |   |   |   |--- feature_2 <= 3.75
|   |   |   |   |   |   |--- feature_2 <= 3.15
|   |   |   |   |   |   |   |--- value: [1.10]
|   |   |   |   |   |   |--- feature_2 >  3.15
|   |   |   |   |   |   |   |--- value: [1.00]
|   |   |   |   |   |--- feature_2 >  3.75
|   |   |   |   |   |   |--- feature_0 <= 5.55
|   |   |   |   |   |   |   |--- value: [1.10]
|   |   |   |   |   |   |--- feature_0 >  5.55
|   |   |   |   |   |   |   |--- value: [1.10]
|   |   |   |   |--- feature_2 >  3.95
|   |   |   |   |   |--- feature_0 <= 5.90
|   |   |   |   |   |   |--- feature_0 <= 5.65
|   |   |   |   |   |   |   |--- feature_1 <= 2.40
|   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |   |--- feature_1 >  2.40
|   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |--- feature_0 >  5.65
|   |   |   |   |   |   |   |--- value: [1.20]
|   |   |   |   |   |--- feature_0 >  5.90
|   |   |   |   |   |   |--- value: [1.00]
|   |   |   |--- feature_1 >  2.65
|   |   |   |   |--- feature_0 <= 5.75
|   |   |   |   |   |--- feature_0 <= 5.40
|   |   |   |   |   |   |--- value: [1.40]
|   |   |   |   |   |--- feature_0 >  5.40
|   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |--- feature_0 >  5.75
|   |   |   |   |   |--- feature_2 <= 4.05
|   |   |   |   |   |   |--- feature_2 <= 3.95
|   |   |   |   |   |   |   |--- value: [1.20]
|   |   |   |   |   |   |--- feature_2 >  3.95
|   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |--- feature_2 >  4.05
|   |   |   |   |   |   |--- value: [1.00]
|   |   |--- feature_2 >  4.15
|   |   |   |--- feature_2 <= 4.45
|   |   |   |   |--- feature_0 <= 5.80
|   |   |   |   |   |--- feature_1 <= 2.65
|   |   |   |   |   |   |--- value: [1.20]
|   |   |   |   |   |--- feature_1 >  2.65
|   |   |   |   |   |   |--- feature_1 <= 2.95
|   |   |   |   |   |   |   |--- feature_1 <= 2.80
|   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |   |--- feature_1 >  2.80
|   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |--- feature_1 >  2.95
|   |   |   |   |   |   |   |--- value: [1.20]
|   |   |   |   |--- feature_0 >  5.80
|   |   |   |   |   |--- feature_1 <= 2.95
|   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |--- feature_1 >  2.95
|   |   |   |   |   |   |--- feature_0 <= 6.25
|   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |--- feature_0 >  6.25
|   |   |   |   |   |   |   |--- value: [1.40]
|   |   |   |--- feature_2 >  4.45
|   |   |   |   |--- feature_0 <= 5.15
|   |   |   |   |   |--- value: [1.70]
|   |   |   |   |--- feature_0 >  5.15
|   |   |   |   |   |--- feature_1 <= 3.25
|   |   |   |   |   |   |--- feature_1 <= 2.95
|   |   |   |   |   |   |   |--- feature_2 <= 4.65
|   |   |   |   |   |   |   |   |--- feature_0 <= 5.85
|   |   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |   |   |--- feature_0 >  5.85
|   |   |   |   |   |   |   |   |   |--- feature_0 <= 6.55
|   |   |   |   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |   |   |   |--- feature_0 >  6.55
|   |   |   |   |   |   |   |   |   |   |--- value: [1.30]
|   |   |   |   |   |   |   |--- feature_2 >  4.65
|   |   |   |   |   |   |   |   |--- feature_1 <= 2.85
|   |   |   |   |   |   |   |   |   |--- value: [1.20]
|   |   |   |   |   |   |   |   |--- feature_1 >  2.85
|   |   |   |   |   |   |   |   |   |--- value: [1.40]
|   |   |   |   |   |   |--- feature_1 >  2.95
|   |   |   |   |   |   |   |--- feature_2 <= 4.55
|   |   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |   |--- feature_2 >  4.55
|   |   |   |   |   |   |   |   |--- feature_2 <= 4.65
|   |   |   |   |   |   |   |   |   |--- value: [1.40]
|   |   |   |   |   |   |   |   |--- feature_2 >  4.65
|   |   |   |   |   |   |   |   |   |--- feature_1 <= 3.15
|   |   |   |   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |   |   |   |--- feature_1 >  3.15
|   |   |   |   |   |   |   |   |   |   |--- value: [1.40]
|   |   |   |   |   |--- feature_1 >  3.25
|   |   |   |   |   |   |--- value: [1.60]
|   |--- feature_2 >  4.75
|   |   |--- feature_2 <= 5.05
|   |   |   |--- feature_0 <= 6.75
|   |   |   |   |--- feature_0 <= 5.80
|   |   |   |   |   |--- value: [2.00]
|   |   |   |   |--- feature_0 >  5.80
|   |   |   |   |   |--- feature_1 <= 2.35
|   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |--- feature_1 >  2.35
|   |   |   |   |   |   |--- feature_0 <= 6.25
|   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |--- feature_0 >  6.25
|   |   |   |   |   |   |   |--- feature_2 <= 4.95
|   |   |   |   |   |   |   |   |--- feature_1 <= 2.60
|   |   |   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |   |   |--- feature_1 >  2.60
|   |   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |   |--- feature_2 >  4.95
|   |   |   |   |   |   |   |   |--- feature_0 <= 6.50
|   |   |   |   |   |   |   |   |   |--- value: [1.90]
|   |   |   |   |   |   |   |   |--- feature_0 >  6.50
|   |   |   |   |   |   |   |   |   |--- value: [1.70]
|   |   |   |--- feature_0 >  6.75
|   |   |   |   |--- feature_0 <= 6.85
|   |   |   |   |   |--- value: [1.40]
|   |   |   |   |--- feature_0 >  6.85
|   |   |   |   |   |--- value: [1.50]
|   |   |--- feature_2 >  5.05
|   |   |   |--- feature_1 <= 3.05
|   |   |   |   |--- feature_0 <= 6.35
|   |   |   |   |   |--- feature_0 <= 5.85
|   |   |   |   |   |   |--- feature_1 <= 2.75
|   |   |   |   |   |   |   |--- value: [1.90]
|   |   |   |   |   |   |--- feature_1 >  2.75
|   |   |   |   |   |   |   |--- value: [2.40]
|   |   |   |   |   |--- feature_0 >  5.85
|   |   |   |   |   |   |--- feature_1 <= 2.85
|   |   |   |   |   |   |   |--- feature_1 <= 2.65
|   |   |   |   |   |   |   |   |--- value: [1.40]
|   |   |   |   |   |   |   |--- feature_1 >  2.65
|   |   |   |   |   |   |   |   |--- feature_1 <= 2.75
|   |   |   |   |   |   |   |   |   |--- value: [1.60]
|   |   |   |   |   |   |   |   |--- feature_1 >  2.75
|   |   |   |   |   |   |   |   |   |--- value: [1.50]
|   |   |   |   |   |   |--- feature_1 >  2.85
|   |   |   |   |   |   |   |--- feature_1 <= 2.95
|   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |   |--- feature_1 >  2.95
|   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |--- feature_0 >  6.35
|   |   |   |   |   |--- feature_0 <= 7.50
|   |   |   |   |   |   |--- feature_0 <= 7.15
|   |   |   |   |   |   |   |--- feature_1 <= 2.75
|   |   |   |   |   |   |   |   |--- feature_0 <= 6.55
|   |   |   |   |   |   |   |   |   |--- value: [1.90]
|   |   |   |   |   |   |   |   |--- feature_0 >  6.55
|   |   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |   |--- feature_1 >  2.75
|   |   |   |   |   |   |   |   |--- feature_0 <= 6.60
|   |   |   |   |   |   |   |   |   |--- feature_2 <= 5.55
|   |   |   |   |   |   |   |   |   |   |--- feature_2 <= 5.35
|   |   |   |   |   |   |   |   |   |   |   |--- value: [2.00]
|   |   |   |   |   |   |   |   |   |   |--- feature_2 >  5.35
|   |   |   |   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |   |   |   |--- feature_2 >  5.55
|   |   |   |   |   |   |   |   |   |   |--- feature_2 <= 5.70
|   |   |   |   |   |   |   |   |   |   |   |--- value: [2.15]
|   |   |   |   |   |   |   |   |   |   |--- feature_2 >  5.70
|   |   |   |   |   |   |   |   |   |   |   |--- value: [2.20]
|   |   |   |   |   |   |   |   |--- feature_0 >  6.60
|   |   |   |   |   |   |   |   |   |--- feature_0 <= 6.75
|   |   |   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |   |   |   |--- feature_0 >  6.75
|   |   |   |   |   |   |   |   |   |   |--- value: [2.10]
|   |   |   |   |   |   |--- feature_0 >  7.15
|   |   |   |   |   |   |   |--- feature_2 <= 5.95
|   |   |   |   |   |   |   |   |--- value: [1.60]
|   |   |   |   |   |   |   |--- feature_2 >  5.95
|   |   |   |   |   |   |   |   |--- feature_1 <= 2.85
|   |   |   |   |   |   |   |   |   |--- value: [1.90]
|   |   |   |   |   |   |   |   |--- feature_1 >  2.85
|   |   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |--- feature_0 >  7.50
|   |   |   |   |   |   |--- feature_2 <= 6.80
|   |   |   |   |   |   |   |--- feature_2 <= 6.35
|   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |   |--- feature_2 >  6.35
|   |   |   |   |   |   |   |   |--- feature_1 <= 2.90
|   |   |   |   |   |   |   |   |   |--- value: [2.00]
|   |   |   |   |   |   |   |   |--- feature_1 >  2.90
|   |   |   |   |   |   |   |   |   |--- value: [2.10]
|   |   |   |   |   |   |--- feature_2 >  6.80
|   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |--- feature_1 >  3.05
|   |   |   |   |--- feature_1 <= 3.25
|   |   |   |   |   |--- feature_2 <= 5.95
|   |   |   |   |   |   |--- feature_0 <= 6.60
|   |   |   |   |   |   |   |--- feature_2 <= 5.40
|   |   |   |   |   |   |   |   |--- feature_0 <= 6.45
|   |   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |   |   |--- feature_0 >  6.45
|   |   |   |   |   |   |   |   |   |--- value: [2.00]
|   |   |   |   |   |   |   |--- feature_2 >  5.40
|   |   |   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |   |   |--- feature_0 >  6.60
|   |   |   |   |   |   |   |--- feature_2 <= 5.50
|   |   |   |   |   |   |   |   |--- feature_2 <= 5.25
|   |   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |   |   |--- feature_2 >  5.25
|   |   |   |   |   |   |   |   |   |--- value: [2.10]
|   |   |   |   |   |   |   |--- feature_2 >  5.50
|   |   |   |   |   |   |   |   |--- feature_2 <= 5.65
|   |   |   |   |   |   |   |   |   |--- value: [2.40]
|   |   |   |   |   |   |   |   |--- feature_2 >  5.65
|   |   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |--- feature_2 >  5.95
|   |   |   |   |   |   |--- value: [1.80]
|   |   |   |   |--- feature_1 >  3.25
|   |   |   |   |   |--- feature_0 <= 7.45
|   |   |   |   |   |   |--- feature_2 <= 5.85
|   |   |   |   |   |   |   |--- feature_2 <= 5.65
|   |   |   |   |   |   |   |   |--- feature_2 <= 5.50
|   |   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |   |   |--- feature_2 >  5.50
|   |   |   |   |   |   |   |   |   |--- value: [2.40]
|   |   |   |   |   |   |   |--- feature_2 >  5.65
|   |   |   |   |   |   |   |   |--- value: [2.30]
|   |   |   |   |   |   |--- feature_2 >  5.85
|   |   |   |   |   |   |   |--- feature_0 <= 6.75
|   |   |   |   |   |   |   |   |--- value: [2.50]
|   |   |   |   |   |   |   |--- feature_0 >  6.75
|   |   |   |   |   |   |   |   |--- value: [2.50]
|   |   |   |   |   |--- feature_0 >  7.45
|   |   |   |   |   |   |--- feature_0 <= 7.80
|   |   |   |   |   |   |   |--- value: [2.20]
|   |   |   |   |   |   |--- feature_0 >  7.80
|   |   |   |   |   |   |   |--- value: [2.00]
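A fully grown regression tree like the one above has close to one leaf per training sample, which makes it hard to read and prone to overfitting. One common remedy is to cap the depth. The sketch below is a variation on the example rather than the model used in this post: it assumes the copy of iris bundled with scikit-learn, an arbitrary `max_depth=3`, and English feature names chosen for readability.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeRegressor, export_text

iris = load_iris()
X = iris.data[:, :3]   # sepal length, sepal width, petal length
y = iris.data[:, 3]    # petal width (the regression target)

# Limit the depth so the tree has at most 2**3 = 8 leaves
small_tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

report = export_text(
    small_tree,
    feature_names=["sepal_length", "sepal_width", "petal_length"],
)
print(report)
```

Passing `feature_names` replaces the generic `feature_0`, `feature_1`, `feature_2` labels with readable ones.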

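Plotting the tree

from sklearn.tree import export_graphviz
import graphviz

# Font setting so that matplotlib can display Chinese labels; it does not affect Graphviz output
from pylab import mpl
mpl.rcParams["font.sans-serif"] = ["SimHei"]

# Generate DOT source with export_graphviz
# Feature names follow the dataset's column order: sepal length, sepal width, petal length
# class_names is omitted because it applies only to classifiers, and this model is a regressor
dot_data = export_graphviz(model, out_file=None,
                           feature_names=['花萼-length', '花萼-width', '花瓣-length'],
                           filled=True, rounded=True,
                           special_characters=True)

# Render the DOT source with graphviz
graph = graphviz.Source(dot_data)
graph.render("decision_tree")  # saves decision_tree.pdf (PDF is the default format)
graph.view()  # opens the figure in the default viewer

The figure is too long to show here; run the code to render it.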
