- 🍨 This post is a study log from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
- 🍖 Original author: K同学啊
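Table of Contents
- I. Preliminary Work
- II. Model Reproduction
- 1. Set Up the GPU
- 2. Import the Data
- 3. Load the Data
- 4. Configure the Dataset
- 5. Visualize the Data
- 6. Build the DenseNet121 Network
- 7. Compile
- 8. Train the Model
- 9. Evaluate the Model
- III. Summary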
Environment:
Language: Python 3.8.0
Editor: Jupyter Notebook
Deep learning framework: TensorFlow 2.17.0
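I. Preliminary Work
DenseNet (Densely Connected Convolutional Network) is a deep learning architecture proposed in 2017 by Gao Huang et al. of Cornell University. Its design was inspired by ResNet (Residual Network) and by ResNet's predecessor, Highway Networks.
Deep convolutional networks commonly suffer from vanishing or exploding gradients, especially as the number of layers grows. ResNet introduced residual (skip) connections to address these problems, which allows much deeper networks to be trained. However, ResNet implements its skip connections by element-wise addition, so earlier features are merged into a single sum and each layer directly sees only the output of the preceding block.
The core idea (and main innovation) of DenseNet is dense connectivity: every layer is directly connected to all preceding layers. In other words, the input of a layer contains not only the output of the layer immediately before it but also the outputs of all earlier layers.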
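To make dense connectivity concrete, the sketch below counts channels inside a single DenseBlock. It assumes that each layer emits `growth_rate` new feature maps that are concatenated onto its input. The function names and the example numbers (64 input channels, growth rate 32, 6 layers) are illustrative choices, not something defined by the original post at this point.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Return the input width seen by each layer and the block's output width."""
    per_layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        per_layer_inputs.append(channels)  # this layer sees every earlier feature map
        channels += growth_rate            # its own output is concatenated on
    return per_layer_inputs, channels


def direct_connections(num_layers):
    """Direct connections in a block of num_layers layers (block input counted as a source)."""
    return num_layers * (num_layers + 1) // 2


inputs, out_channels = dense_block_channels(in_channels=64, growth_rate=32, num_layers=6)
print(inputs)                 # [64, 96, 128, 160, 192, 224]
print(out_channels)           # 256
print(direct_connections(6))  # 21
```

The input width grows linearly with depth, which is why the growth rate can stay small compared with the channel counts used in a typical ResNet stage.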
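The network is built from three core structures: DenseLayer, DenseBlock, and Transition. The full model is assembled by chaining these three structures together with a few additional layers.
As shown in the figure above, there are 5 DenseBlocks (the DenseNet121 implemented later in this post stacks 4 of them, with 6, 12, 24, and 16 DenseLayers). Each DenseBlock is made up of multiple DenseLayers, and each DenseLayer consists of BN + ReLU + 1x1 Conv + BN + ReLU + 3x3 Conv. The Transition module consists of BN + ReLU + 1x1 Conv + 2x2 AvgPool.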
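The 1x1 convolution in each DenseLayer acts as a bottleneck: it compresses the concatenated input before the more expensive 3x3 convolution. The rough sketch below compares convolution weight counts with and without that bottleneck. It assumes bias-free convolutions, a growth rate of 32, and a bottleneck width of 4 × growth rate; BN parameters are ignored, and the helper names are hypothetical.

```python
def conv_weights(in_ch, out_ch, kernel):
    """Weights in a bias-free 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel


def dense_layer_weights(in_ch, growth_rate=32, bn_size=4):
    """1x1 bottleneck conv followed by a 3x3 conv, as in a DenseLayer."""
    bottleneck = bn_size * growth_rate
    return conv_weights(in_ch, bottleneck, 1) + conv_weights(bottleneck, growth_rate, 3)


def plain_layer_weights(in_ch, growth_rate=32):
    """A single 3x3 conv applied directly to the concatenated input."""
    return conv_weights(in_ch, growth_rate, 3)


for in_ch in (64, 512):
    print(in_ch, dense_layer_weights(in_ch), plain_layer_weights(in_ch))
# 64 45056 18432
# 512 102400 147456
```

Under these assumptions the bottleneck costs more for narrow inputs and only pays off once the concatenated input is wide enough, which is the situation in the deeper layers of a DenseBlock.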
II. Model Reproduction
1. Set Up the GPU
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices(gpus[0], 'GPU')  # use only the first GPU
gpus
2. Import the Data
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
import os, PIL, pathlib
import numpy as np
from tensorflow import keras
from keras import layers, models
data_dir = './bird_photos'
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*')))
image_count
3. Load the Data
batch_size = 16
img_height = 224
img_width = 224
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
We can print the dataset labels through class_names. The labels correspond to the directory names in alphabetical order.
class_names = train_ds.class_names
print(class_names)
Output:
['Bananaquit', 'Black Skimmer', 'Black Throated Bushtiti', 'Cockatoo']
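The 80/20 split and the batch size together determine how many steps each epoch takes. The sketch below works through that arithmetic. It assumes the validation subset takes the floor of `validation_split × total` images and the rest go to training. The total of 565 images is an assumed figure (the post does not show the value of `image_count`), chosen because it is consistent with the 29 steps per epoch that appear in the training log later in this post.

```python
import math


def split_sizes(total_images, validation_split):
    """Return (n_train, n_val), assuming the validation count is floored."""
    n_val = int(validation_split * total_images)
    return total_images - n_val, n_val


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover num_samples, keeping the last partial batch."""
    return math.ceil(num_samples / batch_size)


n_train, n_val = split_sizes(565, 0.2)   # 565 is a hypothetical total
print(n_train, n_val)                    # 452 113
print(steps_per_epoch(n_train, 16))      # 29
print(steps_per_epoch(n_val, 16))        # 8
```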
4. Configure the Dataset
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
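The `shuffle(1000)` call above keeps a buffer of up to 1000 elements and draws from it at random. Because `train_ds` is already batched, those elements are batches rather than single images. The sketch below is a simplified pure-Python model of a shuffle buffer, meant only to illustrate the idea; it is not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical name.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a randomized order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)        # fill the buffer first
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]              # emit a random buffered element
        buffer[idx] = item             # and replace it with the new one
    rng.shuffle(buffer)                # drain what is left in random order
    yield from buffer


shuffled = list(buffered_shuffle(range(10), buffer_size=4, seed=0))
in_order = list(buffered_shuffle(range(5), buffer_size=1))
print(shuffled)
print(in_order)  # [0, 1, 2, 3, 4]
```

A buffer of size 1 cannot reorder anything, while a buffer at least as large as the dataset gives a full shuffle; anything in between only mixes elements locally.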
5. Visualize the Data
plt.figure(figsize=(10, 4))
for images, labels in train_ds.take(1):
    for i in range(8):
        ax = plt.subplot(2, 4, i+1)
        plt.imshow(images[i].numpy().astype('uint8'))
        plt.title(class_names[labels[i]])
        plt.axis('off')
6. Build the DenseNet121 Network
from keras import layers
def _DenseLayer(x, growth_rate, bn_size, drop_rate):
    # Bottleneck: BN + ReLU + 1x1 Conv
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(bn_size * growth_rate, kernel_size=1, use_bias=False)(y)
    # BN + ReLU + 3x3 Conv, producing growth_rate new feature maps
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(growth_rate, kernel_size=3, padding='same', use_bias=False)(y)
    if drop_rate > 0:
        y = layers.Dropout(drop_rate)(y)
    # Dense connection: concatenate the input with the new features along the channel axis
    return layers.concatenate([x, y])
def _DenseBlock(x, num_layers, growth_rate, bn_size, drop_rate):
    for i in range(num_layers):
        x = _DenseLayer(x, growth_rate, bn_size, drop_rate)
    return x
def _Transition(x, reduction):
    # BN + ReLU + 1x1 Conv (channel compression) + 2x2 AvgPool (spatial downsampling)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(int(x.shape[-1] * reduction), kernel_size=1, use_bias=False)(x)
    x = layers.AveragePooling2D(pool_size=2, strides=2)(x)
    return x
def DenseNet121(input_shape=(224, 224, 3), growth_rate=32, num_init_features=64, drop_rate=0.0, num_classes=1000):
    img_input = layers.Input(shape=input_shape)
    # Stem: 7x7 Conv + BN + ReLU + 3x3 MaxPool
    x = layers.Conv2D(num_init_features, kernel_size=7, strides=2, padding='same')(img_input)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.MaxPooling2D((3, 3), strides=2, padding='same')(x)
    # Four DenseBlocks (6, 12, 24, 16 layers) separated by three Transitions
    x = _DenseBlock(x, num_layers=6, growth_rate=growth_rate, bn_size=4, drop_rate=drop_rate)
    x = _Transition(x, reduction=0.5)
    x = _DenseBlock(x, num_layers=12, growth_rate=growth_rate, bn_size=4, drop_rate=drop_rate)
    x = _Transition(x, reduction=0.5)
    x = _DenseBlock(x, num_layers=24, growth_rate=growth_rate, bn_size=4, drop_rate=drop_rate)
    x = _Transition(x, reduction=0.5)
    x = _DenseBlock(x, num_layers=16, growth_rate=growth_rate, bn_size=4, drop_rate=drop_rate)
    # Classification head
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(num_classes, activation='softmax')(x)
    model = keras.Model(inputs=img_input, outputs=x)
    return model
# The dataset has only 4 classes, so num_classes=len(class_names) would also work here.
model = DenseNet121(input_shape=(224, 224, 3), num_classes=1000)
model.summary()
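As a sanity check on the architecture above, the following sketch traces the feature-map size and channel count through the network without building it, and counts the weighted layers that give DenseNet121 its name. It mirrors the configuration used in the code above (blocks of 6, 12, 24, and 16 layers, growth rate 32, 64 initial features, reduction 0.5); the function names are hypothetical helpers.

```python
def trace_densenet(block_config=(6, 12, 24, 16), growth_rate=32,
                   num_init_features=64, reduction=0.5, input_size=224):
    """Return (stage, spatial_size, channels) after each stage of the network."""
    size = input_size // 2   # 7x7 conv with stride 2 (halving is exact for 224)
    size = size // 2         # 3x3 max pool with stride 2
    channels = num_init_features
    trace = [("stem", size, channels)]
    for i, num_layers in enumerate(block_config):
        channels += num_layers * growth_rate          # each DenseLayer adds growth_rate maps
        trace.append((f"block{i + 1}", size, channels))
        if i != len(block_config) - 1:                # no Transition after the last block
            channels = int(channels * reduction)      # 1x1 conv compresses channels
            size //= 2                                # 2x2 average pool halves the size
            trace.append((f"transition{i + 1}", size, channels))
    return trace


def weighted_depth(block_config=(6, 12, 24, 16)):
    """Stem conv + 2 convs per DenseLayer + 1 conv per Transition + final Dense layer."""
    return 1 + 2 * sum(block_config) + (len(block_config) - 1) + 1


for stage in trace_densenet():
    print(stage)
print(weighted_depth())  # 121
```

With a 224x224 input, the trace ends at a 7x7 feature map with 1024 channels, which is what the global average pooling layer receives.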
7. Compile
opt = tf.keras.optimizers.Adam(learning_rate=1e-4)
model.compile(optimizer=opt,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
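The sparse variant of categorical cross-entropy is used because the datasets loaded above yield integer class indices rather than one-hot vectors. For a single sample, the loss is the negative log of the probability the model assigns to the true class. The sketch below is a minimal pure-Python version for illustration only (it ignores the numerical clipping a real framework applies).

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: -log of the probability given to the true class index."""
    return -math.log(probs[label])


probs = [0.7, 0.1, 0.1, 0.1]                            # softmax output over 4 classes
loss_correct = sparse_categorical_crossentropy(0, probs)
loss_wrong = sparse_categorical_crossentropy(1, probs)
print(round(loss_correct, 4))  # 0.3567
print(round(loss_wrong, 4))    # 2.3026

# A model that spreads probability uniformly over 1000 outputs starts near ln(1000).
uniform_loss = sparse_categorical_crossentropy(0, [1 / 1000] * 1000)
print(round(uniform_loss, 4))  # 6.9078
```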
8. Train the Model
epochs = 20
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
Epoch 1/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 111s 2s/step - accuracy: 0.3597 - loss: 5.3345 - val_accuracy: 0.3628 - val_loss: 7.0990
Epoch 2/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 15s 164ms/step - accuracy: 0.7643 - loss: 1.6243 - val_accuracy: 0.4513 - val_loss: 5.3195
Epoch 3/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 150ms/step - accuracy: 0.8274 - loss: 0.8932 - val_accuracy: 0.2655 - val_loss: 9.8976
Epoch 4/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 152ms/step - accuracy: 0.8788 - loss: 0.4807 - val_accuracy: 0.4071 - val_loss: 4.5555
Epoch 5/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 159ms/step - accuracy: 0.9197 - loss: 0.2717 - val_accuracy: 0.4690 - val_loss: 2.9667
Epoch 6/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 163ms/step - accuracy: 0.9485 - loss: 0.2375 - val_accuracy: 0.7168 - val_loss: 1.2835
Epoch 7/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 150ms/step - accuracy: 0.9724 - loss: 0.1271 - val_accuracy: 0.7168 - val_loss: 1.0972
Epoch 8/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 153ms/step - accuracy: 0.9792 - loss: 0.1039 - val_accuracy: 0.7080 - val_loss: 1.1467
Epoch 9/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 148ms/step - accuracy: 0.9972 - loss: 0.0524 - val_accuracy: 0.8938 - val_loss: 0.4605
Epoch 10/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 161ms/step - accuracy: 1.0000 - loss: 0.0273 - val_accuracy: 0.8584 - val_loss: 0.4196
Epoch 11/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 148ms/step - accuracy: 1.0000 - loss: 0.0158 - val_accuracy: 0.9027 - val_loss: 0.3243
Epoch 12/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 146ms/step - accuracy: 1.0000 - loss: 0.0092 - val_accuracy: 0.9027 - val_loss: 0.3552
Epoch 13/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 5s 151ms/step - accuracy: 1.0000 - loss: 0.0106 - val_accuracy: 0.9204 - val_loss: 0.3046
Epoch 14/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 146ms/step - accuracy: 1.0000 - loss: 0.0067 - val_accuracy: 0.9027 - val_loss: 0.3239
Epoch 15/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 146ms/step - accuracy: 1.0000 - loss: 0.0066 - val_accuracy: 0.9027 - val_loss: 0.3064
Epoch 16/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 6s 162ms/step - accuracy: 1.0000 - loss: 0.0053 - val_accuracy: 0.9027 - val_loss: 0.2987
Epoch 17/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 146ms/step - accuracy: 1.0000 - loss: 0.0045 - val_accuracy: 0.9027 - val_loss: 0.3208
Epoch 18/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 148ms/step - accuracy: 1.0000 - loss: 0.0046 - val_accuracy: 0.9027 - val_loss: 0.3119
Epoch 19/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 152ms/step - accuracy: 1.0000 - loss: 0.0062 - val_accuracy: 0.9027 - val_loss: 0.3093
Epoch 20/20
29/29 ━━━━━━━━━━━━━━━━━━━━ 4s 148ms/step - accuracy: 1.0000 - loss: 0.0050 - val_accuracy: 0.9027 - val_loss: 0.3102
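In the log above, training accuracy reaches 1.0 from epoch 10 onward while validation accuracy levels off around 0.90, so the last epoch is not necessarily the best one. The sketch below copies the validation metrics from the log and picks the best epoch by each criterion; `best_epoch` is a hypothetical helper. In practice, a callback such as `keras.callbacks.ModelCheckpoint` could be used to keep the best weights during training.

```python
# Validation metrics copied from the 20-epoch training log
val_accuracy = [0.3628, 0.4513, 0.2655, 0.4071, 0.4690, 0.7168, 0.7168, 0.7080,
                0.8938, 0.8584, 0.9027, 0.9027, 0.9204, 0.9027, 0.9027, 0.9027,
                0.9027, 0.9027, 0.9027, 0.9027]
val_loss = [7.0990, 5.3195, 9.8976, 4.5555, 2.9667, 1.2835, 1.0972, 1.1467,
            0.4605, 0.4196, 0.3243, 0.3552, 0.3046, 0.3239, 0.3064, 0.2987,
            0.3208, 0.3119, 0.3093, 0.3102]


def best_epoch(values, mode):
    """Return (1-based epoch, value) of the first best entry; mode is 'max' or 'min'."""
    pick = max if mode == "max" else min
    best = pick(values)
    return values.index(best) + 1, best


print(best_epoch(val_accuracy, "max"))  # (13, 0.9204)
print(best_epoch(val_loss, "min"))      # (16, 0.2987)
```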
9. Evaluate the Model
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
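III. Summary
ResNet introduced the residual module. Each module contains an identity mapping, meaning the input x is added to the output: F(x) + x, where F(x) is the result of the convolutions. This lets the network learn a residual function instead of the full mapping directly, which speeds up convergence.
The core idea of DenseNet, by contrast, is dense connectivity: every layer is connected to the outputs of all preceding layers. Each layer receives the feature maps of all earlier layers as input and passes its own output on to all later layers. This structure encourages feature reuse and reduces the number of network parameters.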
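The difference between the two connection styles can be shown on a toy feature vector. In the sketch below, `f` is a hypothetical stand-in for a layer's transformation (it simply doubles each value), and plain Python lists play the role of the channel axis.

```python
def f(x):
    """Stand-in for a layer's transformation F(x); here it just doubles each value."""
    return [2 * v for v in x]


def residual_step(x):
    """ResNet style: F(x) + x, element-wise, so the width stays the same."""
    return [a + b for a, b in zip(f(x), x)]


def dense_step(x):
    """DenseNet style: concatenate x and F(x), so the width grows."""
    return x + f(x)


x = [1, 2, 3]
print(residual_step(x))  # [3, 6, 9]
print(dense_step(x))     # [1, 2, 3, 2, 4, 6]
```

After the residual step the original values are no longer individually recoverable from the sum, whereas the dense step keeps them intact for every later layer to reuse.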