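Forward Propagation
The forward propagation algorithm feeds the output of each layer into the next layer as its input and computes that layer's output, repeating until the output layer is reached.
Before introducing forward propagation formally, here is a brief look at the concept of a computational graph. The expression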
$$y = w * x + b$$
can be represented as a directed acyclic graph: w and x feed a multiplication node, and its result together with b feeds an addition node that produces y.
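As a minimal sketch of this graph, each node can be written as a small function, with the edges being the values passed between them. The function names (`mul`, `add`, `forward`) and the sample values are assumptions made for illustration only.

```python
# Each function is one node of the computational graph.
def mul(w, x):
    return w * x


def add(u, b):
    return u + b


def forward(w, x, b):
    u = mul(w, x)  # intermediate node: u = w * x
    y = add(u, b)  # output node: y = u + b
    return y


y = forward(w=2.0, x=3.0, b=1.0)
print(y)  # 7.0
```

Evaluating the nodes in topological order (multiplication first, then addition) is already a tiny forward pass.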
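Consider a three-layer neural network with two inputs and one output, in which each layer consists of a fully connected layer followed by an activation function layer.
The first fully connected layer computes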
$$Y^{(1)}=X^{(1)} W^{(1)}+b^{(1)}$$
$$\left[\begin{array}{llll}y_{1,1}^{(1)} & y_{1,2}^{(1)} & y_{1,3}^{(1)} & y_{1,4}^{(1)}\end{array}\right]=\left[\begin{array}{ll}x_{1,1}^{(1)} & x_{1,2}^{(1)}\end{array}\right]\left[\begin{array}{llll}w_{1,1}^{(1)} & w_{1,2}^{(1)} & w_{1,3}^{(1)} & w_{1,4}^{(1)} \\ w_{2,1}^{(1)} & w_{2,2}^{(1)} & w_{2,3}^{(1)} & w_{2,4}^{(1)}\end{array}\right]+\left[\begin{array}{llll}b_{1,1}^{(1)} & b_{1,2}^{(1)} & b_{1,3}^{(1)} & b_{1,4}^{(1)}\end{array}\right]$$
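Here the values of w and b are learned by the neural network during training.
Substituting concrete values gives the output of the first fully connected layer: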
$$\left[\begin{array}{llll}1.12 & 1.28 & 0.32 & -0.36\end{array}\right]=\left[\begin{array}{ll}0.4 & 0.6\end{array}\right]\left[\begin{array}{cccc}1.1 & -0.3 & -0.1 & -0.6 \\ -0.2 & 0.5 & 1.1 & -0.2\end{array}\right]+\left[\begin{array}{llll}0.8 & 1.1 & -0.3 & 0.0\end{array}\right]$$
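The arithmetic of the first fully connected layer can be checked with a few lines of plain Python. This is only a sketch: the helper name `dense` is an assumption, while the input, weights, and bias are the values from the worked example.

```python
def dense(x, W, b):
    """Fully connected layer for one sample: y_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(x_i * row[j] for x_i, row in zip(x, W)) + b[j]
            for j in range(len(b))]


x1 = [0.4, 0.6]                      # 1 x 2 input
W1 = [[1.1, -0.3, -0.1, -0.6],       # 2 x 4 weight matrix
      [-0.2, 0.5, 1.1, -0.2]]
b1 = [0.8, 1.1, -0.3, 0.0]           # 1 x 4 bias

y1 = dense(x1, W1, b1)
print([round(v, 2) for v in y1])  # [1.12, 1.28, 0.32, -0.36]
```

For instance, the first entry is 0.4 * 1.1 + 0.6 * (-0.2) + 0.8 = 1.12; the rounding only hides floating-point noise.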
Next comes the activation function layer, which introduces nonlinearity into the network. We take the ReLU function as an example.
https://baike.baidu.com/item/ReLU 函数/22689567
$$y=\operatorname{ReLU}(x)=\left\{\begin{array}{ll}0, & x<0 \\ x, & x \geqslant 0\end{array}\right.$$
In other words, ReLU keeps positive values unchanged and forces negative values to 0.
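A short sketch of ReLU applied element-wise to the first-layer output from the worked example; the helper name `relu` is an assumption. The result is the input of the second fully connected layer.

```python
def relu(values):
    """Element-wise ReLU: keep positive values, replace negative values with 0."""
    return [max(0.0, v) for v in values]


y1 = [1.12, 1.28, 0.32, -0.36]  # output of the first fully connected layer
x2 = relu(y1)
print(x2)  # [1.12, 1.28, 0.32, 0.0]
```

Only the last entry changes, because it is the only negative value.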
The second fully connected layer computes
$$Y^{(2)}=X^{(2)} W^{(2)}+b^{(2)}$$
$$\left[\begin{array}{llll}y_{1,1}^{(2)} & y_{1,2}^{(2)} & y_{1,3}^{(2)} & y_{1,4}^{(2)}\end{array}\right]=\left[\begin{array}{llll}x_{1,1}^{(2)} & x_{1,2}^{(2)} & x_{1,3}^{(2)} & x_{1,4}^{(2)}\end{array}\right]\left[\begin{array}{cccc}w_{1,1}^{(2)} & w_{1,2}^{(2)} & w_{1,3}^{(2)} & w_{1,4}^{(2)} \\ w_{2,1}^{(2)} & w_{2,2}^{(2)} & w_{2,3}^{(2)} & w_{2,4}^{(2)} \\ w_{3,1}^{(2)} & w_{3,2}^{(2)} & w_{3,3}^{(2)} & w_{3,4}^{(2)} \\ w_{4,1}^{(2)} & w_{4,2}^{(2)} & w_{4,3}^{(2)} & w_{4,4}^{(2)}\end{array}\right]+\left[\begin{array}{llll}b_{1,1}^{(2)} & b_{1,2}^{(2)} & b_{1,3}^{(2)} & b_{1,4}^{(2)}\end{array}\right]$$
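The values of $X^{(2)}$ are the previous layer's output after its activation function, and from them we obtain the output of the second fully connected layer.
This output then passes through the activation function layer again.
The last layer is the output layer: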
$$\left[\begin{array}{l}y_{1,1}^{(3)}\end{array}\right]=\left[\begin{array}{llll}x_{1,1}^{(3)} & x_{1,2}^{(3)} & x_{1,3}^{(3)} & x_{1,4}^{(3)}\end{array}\right]\left[\begin{array}{l}w_{1,1}^{(3)} \\ w_{2,1}^{(3)} \\ w_{3,1}^{(3)} \\ w_{4,1}^{(3)}\end{array}\right]+\left[b_{1,1}^{(3)}\right]$$
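The final output of the network is the single value $y_{1,1}^{(3)}$.
This completes the forward propagation through the entire network.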
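The whole procedure can be sketched as a loop over layers, where each layer's output becomes the next layer's input. The first-layer values come from the worked example; the text gives no weights for layers 2 and 3, so `W2`, `b2`, `W3`, and `b3` below are hypothetical values chosen purely for illustration, as are the helper names. Following the output-layer equation, no activation is applied to the last layer.

```python
def dense(x, W, b):
    """Fully connected layer for one sample: y_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(x_i * row[j] for x_i, row in zip(x, W)) + b[j]
            for j in range(len(b))]


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


def forward(x, layers):
    """Run x through (W, b, activation) layers in order; activation may be None."""
    for W, b, activation in layers:
        x = dense(x, W, b)
        if activation is not None:
            x = activation(x)
    return x


# Layer 1: values from the worked example (2 inputs -> 4 units).
W1 = [[1.1, -0.3, -0.1, -0.6],
      [-0.2, 0.5, 1.1, -0.2]]
b1 = [0.8, 1.1, -0.3, 0.0]

# Layers 2 and 3: HYPOTHETICAL values, not taken from the text.
W2 = [[0.5, -0.1, 0.2, 0.0],
      [0.3, 0.8, -0.5, 0.1],
      [-0.2, 0.4, 0.6, -0.3],
      [0.7, -0.6, 0.1, 0.9]]
b2 = [0.1, -0.2, 0.0, 0.3]
W3 = [[0.6], [-0.4], [0.9], [0.2]]   # 4 units -> 1 output
b3 = [0.05]

layers = [(W1, b1, relu), (W2, b2, relu), (W3, b3, None)]
y3 = forward([0.4, 0.6], layers)
print(y3)  # a list holding the single network output
```

In a trained network, all of these weights and biases would be learned rather than written by hand; the loop itself stays the same.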