Perceptron Algorithm
The Binary Classification Case
Principle
When the sample set $\pmb X$ contains two classes, the perceptron $Y = wX + b$ can split $\pmb X$ into those two classes, provided the classes are linearly separable:
$$
Y = wX + b \begin{cases} > 0, & x \in w_1 \\ < 0, & x \in w_2 \end{cases}
$$
Here $w_1$ and $w_2$ name the two classes, $w$ is the weight vector, and $b$ is the bias.
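To simplify the form $Y = WX + b$, fold the bias into the weights by letting
$$
X' = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \\ 1 \end{bmatrix}, \qquad W = \begin{bmatrix} w_1, w_2, \dots, w_n, b \end{bmatrix}
$$
In this definition $w_1, \dots, w_n$ are the components of the weight vector, not class names. The perceptron then becomes $Y = WX'$, written simply as $Y = WX$ from here on, with $X$ understood to be the augmented vector.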
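The augmentation can be checked numerically. The following is a minimal sketch with arbitrary illustrative values for `w`, `b`, and `x` (assumptions chosen for this check, not taken from the example further down); it confirms that the augmented product gives the same value as $wx + b$.

```python
import numpy as np

# Illustrative values only (assumed for this sketch)
w = np.array([2.0, -1.0, 0.5])   # weight vector
b = -0.25                        # bias
x = np.array([1.0, 0.0, 1.0])    # one sample

# Augment: append a constant 1 to the sample and the bias to the weights
x_aug = np.append(x, 1.0)
W = np.append(w, b)

y_plain = w @ x + b   # original form
y_aug = W @ x_aug     # augmented form
print(y_plain, y_aug)
```

Both printed values agree, so the bias no longer needs separate handling during training.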
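In addition, negate every sample of class $w_2$: for all $X_i \in w_2$, set $X_i = -X_i$. A sample is then classified correctly exactly when $Y = WX_i > 0$. If a sample is misclassified, that is $WX_i \le 0$, update the weights with $W = W + cX_i$, where $c > 0$ is the step size. The update of $W$ is finished if and only if every sample in $\pmb X$ is classified correctly.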
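As a supplementary check (not part of the original derivation), the effect of one update can be worked out directly. Treating each product as a dot product and writing $W' = W + cX_i$ for the updated weights:

$$
W'X_i = (W + cX_i)X_i = WX_i + c\lVert X_i \rVert^2
$$

Every augmented sample ends in $\pm 1$, so $\lVert X_i \rVert^2 \ge 1$, and with $c > 0$ each update raises the score of the misclassified sample by at least $c$. A single update may still leave $W'X_i \le 0$, and it can disturb other samples, which is why the passes over $\pmb X$ repeat until a full pass needs no update.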
Example
import numpy as np

# Eight 3-D samples: x1..x4 belong to class w1, x5..x8 to class w2
x1T = np.array([0, 0, 0])
x2T = np.array([1, 0, 0])
x3T = np.array([1, 0, 1])
x4T = np.array([1, 1, 0])
x5T = np.array([0, 0, 1])
x6T = np.array([0, 1, 1])
x7T = np.array([0, 1, 0])
x8T = np.array([1, 1, 1])

# Stack the samples as columns, one matrix per class
w1 = np.array([x1T, x2T, x3T, x4T]).T
w2 = np.array([x5T, x6T, x7T, x8T]).T

# Augment every sample with a constant 1 so the bias is part of w
b = np.array([[1, 1, 1, 1]])
w1 = np.vstack((w1, b))
w2 = np.vstack((w2, b))

# Negate class w2 so that "correct" means w @ x > 0 for every sample
w2 = -1 * w2

c = 1                              # step size
w = np.array([0, 0, 0, 0])         # initial augmented weights [w_1, w_2, w_3, b]
samples = np.hstack((w1, w2)).T    # one augmented sample per row, x1..x8

epoch = 1
while True:
    flag = True  # stays True only if no sample triggers an update in this epoch
    for i, x in enumerate(samples, start=1):
        res = w @ x
        print("epoch:{},x{}:{},w:{},w@x{}:{}".format(epoch, i, x, w, i, res), end='')
        if res <= 0:
            # Misclassified (or on the boundary): move w toward x
            w = w + c * x
            print(",update:w=w+cx{}:{}".format(i, w))
            flag = False
        else:
            print()
    epoch = epoch + 1
    if flag:
        break
Output:
epoch:1,x1:[0 0 0 1],w:[0 0 0 0],w@x1:0,update:w=w+cx1:[0 0 0 1]
epoch:1,x2:[1 0 0 1],w:[0 0 0 1],w@x2:1
epoch:1,x3:[1 0 1 1],w:[0 0 0 1],w@x3:1
epoch:1,x4:[1 1 0 1],w:[0 0 0 1],w@x4:1
epoch:1,x5:[ 0  0 -1 -1],w:[0 0 0 1],w@x5:-1,update:w=w+cx5:[ 0  0 -1  0]
epoch:1,x6:[ 0 -1 -1 -1],w:[ 0  0 -1  0],w@x6:1
epoch:1,x7:[ 0 -1  0 -1],w:[ 0  0 -1  0],w@x7:0,update:w=w+cx7:[ 0 -1 -1 -1]
epoch:1,x8:[-1 -1 -1 -1],w:[ 0 -1 -1 -1],w@x8:3
epoch:2,x1:[0 0 0 1],w:[ 0 -1 -1 -1],w@x1:-1,update:w=w+cx1:[ 0 -1 -1  0]
epoch:2,x2:[1 0 0 1],w:[ 0 -1 -1  0],w@x2:0,update:w=w+cx2:[ 1 -1 -1  1]
epoch:2,x3:[1 0 1 1],w:[ 1 -1 -1  1],w@x3:1
epoch:2,x4:[1 1 0 1],w:[ 1 -1 -1  1],w@x4:1
epoch:2,x5:[ 0  0 -1 -1],w:[ 1 -1 -1  1],w@x5:0,update:w=w+cx5:[ 1 -1 -2  0]
epoch:2,x6:[ 0 -1 -1 -1],w:[ 1 -1 -2  0],w@x6:3
epoch:2,x7:[ 0 -1  0 -1],w:[ 1 -1 -2  0],w@x7:1
epoch:2,x8:[-1 -1 -1 -1],w:[ 1 -1 -2  0],w@x8:2
epoch:3,x1:[0 0 0 1],w:[ 1 -1 -2  0],w@x1:0,update:w=w+cx1:[ 1 -1 -2  1]
epoch:3,x2:[1 0 0 1],w:[ 1 -1 -2  1],w@x2:2
epoch:3,x3:[1 0 1 1],w:[ 1 -1 -2  1],w@x3:0,update:w=w+cx3:[ 2 -1 -1  2]
epoch:3,x4:[1 1 0 1],w:[ 2 -1 -1  2],w@x4:3
epoch:3,x5:[ 0  0 -1 -1],w:[ 2 -1 -1  2],w@x5:-1,update:w=w+cx5:[ 2 -1 -2  1]
epoch:3,x6:[ 0 -1 -1 -1],w:[ 2 -1 -2  1],w@x6:2
epoch:3,x7:[ 0 -1  0 -1],w:[ 2 -1 -2  1],w@x7:0,update:w=w+cx7:[ 2 -2 -2  0]
epoch:3,x8:[-1 -1 -1 -1],w:[ 2 -2 -2  0],w@x8:2
epoch:4,x1:[0 0 0 1],w:[ 2 -2 -2  0],w@x1:0,update:w=w+cx1:[ 2 -2 -2  1]
epoch:4,x2:[1 0 0 1],w:[ 2 -2 -2  1],w@x2:3
epoch:4,x3:[1 0 1 1],w:[ 2 -2 -2  1],w@x3:1
epoch:4,x4:[1 1 0 1],w:[ 2 -2 -2  1],w@x4:1
epoch:4,x5:[ 0  0 -1 -1],w:[ 2 -2 -2  1],w@x5:1
epoch:4,x6:[ 0 -1 -1 -1],w:[ 2 -2 -2  1],w@x6:3
epoch:4,x7:[ 0 -1  0 -1],w:[ 2 -2 -2  1],w@x7:1
epoch:4,x8:[-1 -1 -1 -1],w:[ 2 -2 -2  1],w@x8:1
epoch:5,x1:[0 0 0 1],w:[ 2 -2 -2  1],w@x1:1
epoch:5,x2:[1 0 0 1],w:[ 2 -2 -2  1],w@x2:3
epoch:5,x3:[1 0 1 1],w:[ 2 -2 -2  1],w@x3:1
epoch:5,x4:[1 1 0 1],w:[ 2 -2 -2  1],w@x4:1
epoch:5,x5:[ 0  0 -1 -1],w:[ 2 -2 -2  1],w@x5:1
epoch:5,x6:[ 0 -1 -1 -1],w:[ 2 -2 -2  1],w@x6:3
epoch:5,x7:[ 0 -1  0 -1],w:[ 2 -2 -2  1],w@x7:1
epoch:5,x8:[-1 -1 -1 -1],w:[ 2 -2 -2  1],w@x8:1
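The final weights in the output above are `[2 -2 -2 1]`. As a sanity check, the sketch below applies them to the original, un-negated samples; the helper name `score` is introduced here only for illustration. Class $w_1$ should score above zero and class $w_2$ below zero.

```python
import numpy as np

# Weights from the last epoch of the output above: [w_1, w_2, w_3, b]
w = np.array([2, -2, -2, 1])

# Original (un-negated) samples, one per row
class_1 = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 1], [1, 1, 0]])
class_2 = np.array([[0, 0, 1], [0, 1, 1], [0, 1, 0], [1, 1, 1]])

def score(samples):
    """Return w·x + b for each row by appending a constant 1 to every sample."""
    ones = np.ones((samples.shape[0], 1), dtype=int)
    return np.hstack((samples, ones)) @ w

scores_1 = score(class_1)
scores_2 = score(class_2)
print("class w1 scores:", scores_1)
print("class w2 scores:", scores_2)
```

The scores come out as 1, 3, 1, 1 for class $w_1$ and -1, -3, -1, -1 for class $w_2$, matching the (negated) values in the epoch 5 lines of the log.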
The Multi-Class Case
Principle
If the sample set $\pmb X$ contains $M$ classes, each class $i$ has its own discriminant function $y_i = w_i x + b_i$, which simplifies as before to $Y_i = W_i X$ with the augmented sample $X$.
If $y_k = \max(y_1, y_2, \dots, y_M)$, the perceptron assigns the sample to class $k$.
If a sample $x_i$ that belongs to class $i$ is classified as class $j$, that is $y_i < y_j$, update the weights of both classes:
$$
\begin{aligned}
w_i &= w_i + cx_i \\
w_j &= w_j - cx_i
\end{aligned}
$$
The update of $W$ is finished if and only if every sample in $\pmb X$ is classified correctly.
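The decision rule and a single correction step can be sketched as follows. This is a minimal illustration under assumed values: the helper names `predict` and `update`, the zero-initialized `W`, and the sample `x_aug` are all hypothetical and chosen only to show the mechanics.

```python
import numpy as np

def predict(W, x_aug):
    """Return the index k of the largest discriminant value W[k] @ x_aug."""
    return int(np.argmax(W @ x_aug))

def update(W, x_aug, i, j, c=1.0):
    """Apply one correction in place: true class i was beaten (or tied) by class j."""
    W[i] += c * x_aug
    W[j] -= c * x_aug

# Illustrative setup (assumed): 3 classes, 2 features plus the constant 1
W = np.zeros((3, 3))
x_aug = np.array([1.0, 1.0, 1.0])   # augmented sample whose true class is 1

before = predict(W, x_aug)   # all scores tie at 0, so argmax picks index 0
update(W, x_aug, i=1, j=0)
after = predict(W, x_aug)
print(before, after)
```

After the correction the true class has the highest score for this sample, because its row of `W` moved toward the sample while the competing row moved away from it.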
Example
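import numpy as np

# Three 2-D samples; sample i is the only member of class i
x1 = np.array([0, 0])
x2 = np.array([1, 1])
x3 = np.array([-1, 1])

# Augment every sample with a constant 1 so each bias is part of W
b = np.array([[1, 1, 1]])
X = np.array([x1, x2, x3]).T
X = np.vstack((X, b)).T    # one augmented sample per row

W = np.zeros((3, 3))       # row k holds the augmented weights of class k
h = X.shape[0]             # number of samples, equal to the number of classes here
c = 1                      # step size
epoch = 1
while True:
    flag = True  # stays True only if no update happens in this epoch
    for i in range(h):
        for j in range(h):
            res1 = X[i] @ W[i]   # score of the true class i
            res2 = X[i] @ W[j]   # score of the competing class j
            if res1 <= res2 and i != j:
                # Class j ties or beats the true class: correct both weight rows
                W[i] = W[i] + c * X[i]
                W[j] = W[j] - c * X[i]
                flag = False
                print("epoch={},X[{}]@W[{}]={} <= X[{}]@W[{}]={}".format(epoch, i, i, res1, i, j, res2))
                print("update W:")
                print(" W[{}]=W[{}]+c*X[{}]={}".format(i, i, i, W[i]))
                print(" W[{}]=W[{}]-c*X[{}]={}".format(j, j, i, W[j]))
            elif i != j:
                print("epoch={},X[{}]@W[{}]={} > X[{}]@W[{}]={}".format(epoch, i, i, res1, i, j, res2))
                print('W does not need to be updated!')
    epoch = epoch + 1
    if flag:
        print('-' * 10, 'W', '-' * 10)
        print(W)
        break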
Output:
epoch=1,X[0]@W[0]=0.0 <= X[0]@W[1]=0.0
update W:
W[0]=W[0]+c*X[0]=[0. 0. 1.]
W[1]=W[1]-c*X[0]=[ 0. 0. -1.]
epoch=1,X[0]@W[0]=1.0 > X[0]@W[2]=0.0
W does not need to be updated!
epoch=1,X[1]@W[1]=-1.0 <= X[1]@W[0]=1.0
update W:
W[1]=W[1]+c*X[1]=[1. 1. 0.]
W[0]=W[0]-c*X[1]=[-1. -1. 0.]
epoch=1,X[1]@W[1]=2.0 > X[1]@W[2]=0.0
W does not need to be updated!
epoch=1,X[2]@W[2]=0.0 <= X[2]@W[0]=0.0
update W:
W[2]=W[2]+c*X[2]=[-1. 1. 1.]
W[0]=W[0]-c*X[2]=[ 0. -2. -1.]
epoch=1,X[2]@W[2]=3.0 > X[2]@W[1]=0.0
W does not need to be updated!
epoch=2,X[0]@W[0]=-1.0 <= X[0]@W[1]=0.0
update W:
W[0]=W[0]+c*X[0]=[ 0. -2. 0.]
W[1]=W[1]-c*X[0]=[ 1. 1. -1.]
epoch=2,X[0]@W[0]=0.0 <= X[0]@W[2]=1.0
update W:
W[0]=W[0]+c*X[0]=[ 0. -2. 1.]
W[2]=W[2]-c*X[0]=[-1. 1. 0.]
epoch=2,X[1]@W[1]=1.0 > X[1]@W[0]=-1.0
W does not need to be updated!
epoch=2,X[1]@W[1]=1.0 > X[1]@W[2]=0.0
W does not need to be updated!
epoch=2,X[2]@W[2]=2.0 > X[2]@W[0]=-1.0
W does not need to be updated!
epoch=2,X[2]@W[2]=2.0 > X[2]@W[1]=-1.0
W does not need to be updated!
epoch=3,X[0]@W[0]=1.0 > X[0]@W[1]=-1.0
W does not need to be updated!
epoch=3,X[0]@W[0]=1.0 > X[0]@W[2]=0.0
W does not need to be updated!
epoch=3,X[1]@W[1]=1.0 > X[1]@W[0]=-1.0
W does not need to be updated!
epoch=3,X[1]@W[1]=1.0 > X[1]@W[2]=0.0
W does not need to be updated!
epoch=3,X[2]@W[2]=2.0 > X[2]@W[0]=-1.0
W does not need to be updated!
epoch=3,X[2]@W[2]=2.0 > X[2]@W[1]=-1.0
W does not need to be updated!
---------- W ----------
[[ 0. -2.  1.]
 [ 1.  1. -1.]
 [-1.  1.  0.]]
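The example above relies on sample `i` being the only member of class `i`. A possible generalization, sketched below under stated assumptions, takes an explicit label per sample so that a class may own several samples. The function name `train`, the `labels` list, and the `max_epochs` guard are additions for illustration, not part of the original code. Run on the same three samples, it follows the same update order and arrives at the same `W`.

```python
import numpy as np

def train(X_aug, labels, n_classes, c=1.0, max_epochs=100):
    """Multi-class perceptron; X_aug holds one augmented sample per row."""
    W = np.zeros((n_classes, X_aug.shape[1]))
    for _ in range(max_epochs):
        updated = False
        for x, i in zip(X_aug, labels):
            for j in range(n_classes):
                # Class j ties or beats the true class i: correct both rows
                if j != i and W[i] @ x <= W[j] @ x:
                    W[i] += c * x
                    W[j] -= c * x
                    updated = True
        if not updated:
            return W   # a full pass without updates: training is finished
    raise RuntimeError("no convergence within max_epochs")

# Same augmented samples as the example above, with explicit class labels
X_aug = np.array([[0, 0, 1], [1, 1, 1], [-1, 1, 1]], dtype=float)
labels = [0, 1, 2]

W = train(X_aug, labels, n_classes=3)
pred = np.argmax(X_aug @ W.T, axis=1)   # predicted class = largest score
print(W)
print(pred)
```

The `max_epochs` guard is there because the loop would otherwise never end on data that is not linearly separable; the limit of 100 is an arbitrary choice.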