Deep Learning: Visualizing the VGG16 Convolution Kernels (Jupyter Notebook Walkthrough)

In the previous article I fed an image into VGG16 and visualized the feature maps of every layer:
Deep Learning: Visualizing the Feature Maps of Each VGG16 Layer (Jupyter Notebook Walkthrough), https://blog.csdn.net/daduzimama/article/details/140279255

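In this article I visualize the convolution kernels that each layer has learned (the weights are pre-trained on the ImageNet dataset). This gives us another angle from which to look inside the VGG16 black box.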

1. Import Keras through TensorFlow

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input, decode_predictions

2. Create the VGG16 model pre-trained on ImageNet

model_vgg16=VGG16(weights='imagenet',include_top=True)

# Get the total number of layers
total_layer_num=len(model_vgg16.layers)
print(f"total num of layers={total_layer_num}")
model_vgg16.summary()

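The model has 23 layers in total (counting the input layer), and only the conv layers contain convolution filters. The operation of a pooling layer is fixed in advance (2x2 max pooling in this model) rather than learned from the training data.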

3. Get the trained kernels of each layer

W=model_vgg16.get_weights()
W_size=len(W)
print(f"number of weight arrays={W_size}")
for i in range(W_size):
    print("size of W",i,":",W[i].shape)

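Looking at the model itself, only 13 layers are convolutional, yet the log lists 32 weight arrays. Why?

The list has 32 entries because the model has 13 conv layers plus two fully connected layers and one softmax classifier layer, 16 weighted layers in total. The bias of each layer is stored as a separate array, which gives 16x2=32 arrays: even indices hold kernels, odd indices hold biases. This article focuses mainly on the first 13 layers, the convolution kernels.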
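The ordering of the 32 arrays can be reconstructed from the architecture alone. The sketch below does this in plain Python without downloading any weights. It assumes the standard Keras VGG16 layout (channel widths 64/128/256/512/512 and the layer names shown by `model_vgg16.summary()`); `expected_weight_shapes` is a hypothetical helper name, not part of Keras.

```python
# Conv blocks of VGG16: (block number, number of conv layers, output channels).
VGG16_BLOCKS = [(1, 2, 64), (2, 2, 128), (3, 3, 256), (4, 3, 512), (5, 3, 512)]

def expected_weight_shapes():
    """Return [(name, shape), ...] in the order get_weights() is expected to use."""
    shapes = []
    in_ch = 3  # RGB input
    for block, n_conv, out_ch in VGG16_BLOCKS:
        for k in range(1, n_conv + 1):
            name = f"block{block}_conv{k}"
            shapes.append((name + "/kernel", (3, 3, in_ch, out_ch)))
            shapes.append((name + "/bias", (out_ch,)))
            in_ch = out_ch
    # Five 2x2 poolings turn 224x224 into 7x7, so flattening yields 7*7*512 inputs.
    fan_in = 7 * 7 * in_ch
    for name, units in [("fc1", 4096), ("fc2", 4096), ("predictions", 1000)]:
        shapes.append((name + "/kernel", (fan_in, units)))
        shapes.append((name + "/bias", (units,)))
        fan_in = units
    return shapes

shapes = expected_weight_shapes()
for i, (name, shape) in enumerate(shapes):
    print(i, name, shape)
```

Comparing this listing with the shapes printed from `W` is a quick way to confirm which index belongs to which layer before slicing into it.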

4. Accessing a specific kernel

4.1 The 64 filters of size 3x3x3 in block1_conv1

W[0].shape

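Each kernel is 3x3 spatially. Because the input image has 3 channels, each filter is 3x3x3, and the layer's weight array has shape (3, 3, 3, 64): height, width, input channels, number of filters.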

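4.2 Accessing channel 0 of filter 0 among the 64 3x3x3 filters of block1_conv1

# Filter 0 of block1_conv1: shape (3, 3, 3) = (height, width, input channel)
block1_conv1_filter0=W[0][:,:,:,0]
print("size of block1_conv1_filter0 =",block1_conv1_filter0.shape)
# Channel 0 is selected on the last axis; indexing with [0] would instead return row 0 across all three channels
print("filter0_filter0:\n",block1_conv1_filter0[:,:,0])
print("weight sum of filter =",np.sum(block1_conv1_filter0[:,:,0]))

This raises a question. In classical image processing, commonly used kernels such as smoothing and sharpening kernels usually have weights that sum to 1 (edge-detection kernels such as Sobel are a separate case and sum to 0). The reason is that the convolved image should keep the brightness/energy of the original: it must come out neither brighter nor darker than the input, so kernel designers deliberately normalize the weights to sum to 1.

In this example, however, the sum is not 1: the original run of this notebook, which summed the slice block1_conv1_filter0[0], printed 2.47. Why?

All kernels in VGG16 are learned. More precisely, they emerge from backpropagation as it minimizes the loss function, and nothing in that training process constrains the weights of a kernel to sum to 1.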
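The effect of the weight sum on brightness is easy to check on a flat patch, where every pixel has the same value: the filter response is simply that value times the sum of the weights. The sketch below uses a classic 3x3 box filter and a stand-in kernel scaled so its weights sum to 2.47 (the value reported above); the stand-in is not the actual VGG16 kernel, and `filter_response_on_flat_image` is a hypothetical helper name.

```python
import numpy as np

def filter_response_on_flat_image(kernel, value):
    """Response of a kernel on a patch where every pixel equals `value`."""
    patch = np.full(kernel.shape, value, dtype=np.float64)
    return float(np.sum(patch * kernel))

box = np.full((3, 3), 1.0 / 9.0)   # classic smoothing kernel, weights sum to 1
learned_like = box * 2.47          # stand-in kernel whose weights sum to 2.47

flat_value = 100.0
box_out = filter_response_on_flat_image(box, flat_value)
learned_out = filter_response_on_flat_image(learned_like, flat_value)
print(box_out, learned_out)
```

The box filter returns the input level unchanged, while the second kernel amplifies it. For a learned feature detector this is harmless, because the output is a feature response rather than an image meant for viewing.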

4.3 Plotting a specific kernel

plt.figure()
plt.imshow(block1_conv1_filter0[:,:,0])  # channel 0 of filter 0, a 3x3 image
plt.title("block1_conv1_filter0_filter0")

4.4 Accessing the bias terms of block1_conv1

There is one bias term per filter. block1_conv1 has 64 kernels, so it has 64 biases.

print(W[1].shape)
print("bias of block1_conv1:\n",W[1])
plt.figure()
plt.stem(W[1])
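To make the role of the 64 biases concrete, here is a minimal NumPy sketch of what the layer computes at a single output position. The arrays are random stand-ins with the same shapes as `W[0]` and `W[1]`, not the trained weights, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 3, 64))  # stand-in for W[0]: (height, width, in_ch, out_ch)
bias = rng.normal(size=(64,))            # stand-in for W[1]: one bias per filter
patch = rng.normal(size=(3, 3, 3))       # one 3x3 RGB neighbourhood of the input image

# Each filter multiplies the patch element-wise, sums all 27 products, then adds its own bias.
pre_activation = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2])) + bias
activation = np.maximum(pre_activation, 0.0)  # the VGG16 conv layers apply ReLU

print(pre_activation.shape)
```

Sliding the same computation over every position of the 224x224 input yields the 224x224x64 feature map, with one output channel per filter and bias pair.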

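5. Visualizing the kernels of each layer

5.1 Visualizing the 64 kernels of block1_conv1

block1_conv1 has 64 kernels. Its input is 224x224x3, each kernel is 3x3x3, and the output feature map is 224x224x64. Each kernel has three input channels, and the raw weights are not in the 0..1 range that imshow expects for RGB data, so the 64 images below show only input channel 0 of each kernel.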

n=8
lay=0

block1_conv1_filter0=W[lay][:,:,:,0]
print("size of block1_conv1_filter0 =",block1_conv1_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 3) of filter i
    axes[i//n,i%n].imshow(filters)
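For this first layer only, the three input channels line up with R, G and B, so a whole 3x3x3 kernel can also be drawn as one small color image once its weights are rescaled to [0, 1]. The sketch below shows that rescaling on a random stand-in for `W[0]`; `kernel_to_rgb` is a hypothetical helper, and min-max scaling is just one possible choice of normalization.

```python
import numpy as np

def kernel_to_rgb(kernel_hw3):
    """Min-max scale one (h, w, 3) kernel into [0, 1] so it can be drawn as RGB."""
    k = np.asarray(kernel_hw3, dtype=np.float64)
    lo, hi = k.min(), k.max()
    if hi == lo:  # flat kernel: avoid dividing by zero
        return np.zeros_like(k)
    return (k - lo) / (hi - lo)

rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.2, size=(3, 3, 3, 64))  # random stand-in for W[0]
rgb = kernel_to_rgb(W0[:, :, :, 0])
# In the notebook, plt.imshow(rgb) would now show all three input channels at once.
print(rgb.shape, rgb.min(), rgb.max())
```

Note that the rescaling discards the sign and absolute magnitude of the weights, so the picture is only good for comparing patterns, not values.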

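5.2 Visualizing the 64 kernels of block1_conv2

block1_conv2 has 64 kernels. Its input is 224x224x64, each kernel is 3x3x64, and the output feature map is 224x224x64. Because each kernel has 64 channels, imshow cannot draw it as a single image, so the kernels shown below are all channel 0.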

n=8
lay=2

block1_conv2_filter0=W[lay][:,:,:,0]
print("size of block1_conv2_filter0 =",block1_conv2_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 64) of filter i
    axes[i//n,i%n].imshow(filters)

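5.3 Visualizing the first 64 kernels of block2_conv1

block2_conv1 has 128 kernels. Its input is 112x112x64, each kernel is 3x3x64, and the output feature map is 112x112x128. Because each kernel has 64 channels, imshow cannot draw it as a single image, so the kernels shown below are all channel 0.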

n=8
lay=4

block2_conv1_filter0=W[lay][:,:,:,0]
print("size of block2_conv1_filter0 =",block2_conv1_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 64) of filter i
    axes[i//n,i%n].imshow(filters)

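5.4 Visualizing the first 64 kernels of block3_conv3

block3_conv3 has 256 kernels. Its input is 56x56x256, each kernel is 3x3x256, and the output feature map is 56x56x256. Because each kernel has 256 channels, imshow cannot draw it as a single image, so the kernels shown below are all channel 0.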

n=8
lay=12

block3_conv3_filter0=W[lay][:,:,:,0]
print("size of block3_conv3_filter0 =",block3_conv3_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 256) of filter i
    axes[i//n,i%n].imshow(filters)

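5.5 Visualizing the first 64 kernels of block4_conv1

block4_conv1 has 512 kernels. Its input is 28x28x256, each kernel is 3x3x256, and the output feature map is 28x28x512. Because each kernel has 256 channels, imshow cannot draw it as a single image, so the kernels shown below are all channel 0.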

n=8
lay=14

block4_conv1_filter0=W[lay][:,:,:,0]
print("size of block4_conv1_filter0 =",block4_conv1_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 256) of filter i
    axes[i//n,i%n].imshow(filters)

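5.6 Visualizing the first 64 kernels of block5_conv3

block5_conv3 has 512 kernels. Its input is 14x14x512, each kernel is 3x3x512, and the output feature map is 14x14x512. Because each kernel has 512 channels, imshow cannot draw it as a single image, so the kernels shown below are all channel 0.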

n=8
lay=24

block5_conv3_filter0=W[lay][:,:,:,0]
print("size of block5_conv3_filter0 =",block5_conv3_filter0.shape)

fig,axes=plt.subplots(n,n,figsize=(15,15))
for i in range(n*n):
    filters=W[lay][:,:,0,i]  # input channel 0 (of 512) of filter i
    axes[i//n,i%n].imshow(filters)
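The six cells above differ only in the value of `lay`, so the tiling can be factored into one function. The sketch below is one possible way to do that: it packs one channel of the first n*n filters into a single 2-D array that a single `imshow` call can draw. `filter_mosaic` is a hypothetical helper, and the demo uses a random array with the shape of `W[24]` rather than the trained weights.

```python
import numpy as np

def filter_mosaic(kernel, channel=0, n=8, pad=1):
    """Tile channel `channel` of the first n*n filters into one 2-D image."""
    kh, kw, _, n_filters = kernel.shape
    assert n * n <= n_filters, "layer has fewer than n*n filters"
    tile_h, tile_w = kh + pad, kw + pad
    # NaN marks the gaps between tiles; imshow leaves NaN cells blank.
    mosaic = np.full((n * tile_h - pad, n * tile_w - pad), np.nan)
    for i in range(n * n):
        r, c = divmod(i, n)
        mosaic[r * tile_h:r * tile_h + kh,
               c * tile_w:c * tile_w + kw] = kernel[:, :, channel, i]
    return mosaic

rng = np.random.default_rng(0)
W24 = rng.normal(size=(3, 3, 512, 512))  # stand-in with the shape of W[24] (block5_conv3)
mosaic = filter_mosaic(W24, channel=0, n=8)
print(mosaic.shape)
```

In the notebook this would be used as `plt.imshow(filter_mosaic(W[lay]))`. One difference from the subplot grid is that all 64 tiles then share a single color scale, which makes magnitudes comparable across filters but can wash out filters with small weights.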

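6. Visualizing the fully connected layer

The first fully connected layer, fc1, has 4096 neurons, each connected to 25088 inputs (the flattened 7x7x512 output of the last pooling layer). In other words, if the layer is modeled as f=xW+b, then x has shape 1x25088 and W has shape 25088x4096.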

fc1=W[26]
fc1.shape
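The shape bookkeeping behind f=xW+b can be verified with a scaled-down NumPy sketch. To keep memory use small it assumes 8 channels and 16 neurons instead of the real 512 and 4096; all arrays are random stand-ins and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
pooled = rng.normal(size=(1, 7, 7, 8))  # stand-in for the last pooling output (real: 1x7x7x512)
n_inputs = 7 * 7 * 8                    # real network: 7 * 7 * 512 = 25088
n_neurons = 16                          # real network: 4096

x = pooled.reshape(1, n_inputs)         # what the flatten layer does
W_fc = rng.normal(size=(n_inputs, n_neurons))
b_fc = rng.normal(size=(n_neurons,))

f = x @ W_fc + b_fc                     # one row of outputs, one column per neuron
neuron_5 = x[0] @ W_fc[:, 5] + b_fc[5]  # column 5 holds the weights of the neuron with index 5
print(f.shape)
```

This is why slicing a column of the weight matrix, as in `fc1[:,500]`, gives exactly the weights feeding one neuron.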

fc1 holds 4096 columns of 25088 weights each. The column used to compute the neuron with index 500 is:

plt.figure()
plt.stem(fc1[:,500])

fc1 has 4096 groups of weights, one group per neuron, and there are as many bias terms as there are neurons.

fc1_bias=W[27]
print(fc1_bias.shape)
plt.figure()
plt.stem(fc1_bias)

7. Visualizing the softmax layer

sfmx=W[-2]
sfmx.shape

plt.figure()
plt.stem(sfmx[:,500])
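The column `sfmx[:,500]` holds the weights that produce the score of the class with index 500; the softmax then turns the 1000 scores into probabilities. A minimal NumPy sketch of that final step follows, again with small random stand-ins (32 inputs and 10 classes instead of 4096 and 1000); `softmax` here is a hand-written helper, not the Keras function.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
fc2_out = rng.normal(size=(1, 32))  # stand-in for the 4096 activations of fc2
W_pred = rng.normal(size=(32, 10))  # stand-in for W[-2] (real: 4096 x 1000)
b_pred = rng.normal(size=(10,))     # stand-in for W[-1] (real: 1000)

probs = softmax(fc2_out @ W_pred + b_pred)
print(probs.shape, probs.sum())
```

Subtracting the row maximum before exponentiating does not change the result but avoids overflow for large scores.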

(End of article)

Author: 松下J27


