- 🍨 This post is a learning-record blog from the 🔗365-Day Deep Learning Training Camp
- 🍖 Original author: K同学啊
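Preface
- This session again focuses on learning the API and how to build a neural network. This time VGG16 is used to classify cats and dogs, and the result is quite good, reaching 99.8%+ accuracy.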
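Table of contents
- 1. Introduction to tqdm, train_on_batch, and prediction
  - tqdm
  - train_on_batch
    - `train_on_batch` method signature
  - Notes on prediction
- 2. Cat and dog recognition implementation
  - 1. Data processing
    - 1. Importing libraries
    - 2. Checking the number of images and the classes
    - 3. Loading the data
    - 4. Displaying sample images
    - 5. Normalizing the data
    - 6. Setting up memory acceleration
  - 2. Model construction
  - 3. Model training
    - 1. Hyperparameter settings
    - 2. Model training
  - 4. Displaying results
  - 5. Prediction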
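1. Introduction to tqdm, train_on_batch, and prediction
tqdm
tqdm is a wrapper API that renders a progress bar. Here it is used to show training progress, as follows:
tqdm(total=train_total, desc=f'Epoch {epoch + 1} / {epochs}', mininterval=1, ncols=100)
- total: the expected number of iterations
- desc: the text shown in front of the bar
- mininterval: the minimum interval between progress-bar refreshes, in seconds
- ncols: the width of the whole progress-bar line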
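To make these parameters concrete, here is a minimal sketch of a manually updated progress bar. The helper name `run_epoch`, the batch count, and the `fake_loss` values are illustrative assumptions and are not part of the training code in this post.

```python
import io
from tqdm import tqdm

def run_epoch(num_batches, epoch, epochs, stream=None):
    """Drive a manually updated progress bar over `num_batches` dummy steps."""
    with tqdm(total=num_batches, desc=f'Epoch {epoch + 1} / {epochs}',
              mininterval=1, ncols=100, file=stream) as pbar:
        for step in range(num_batches):
            fake_loss = 1.0 / (step + 1)  # placeholder for a real batch loss
            pbar.set_postfix({"loss": "%.4f" % fake_loss})  # text shown after the bar
            pbar.update(1)  # advance the bar by one batch
        return pbar.n  # number of steps the bar has counted

steps_done = run_epoch(num_batches=85, epoch=0, epochs=10)
```

Because `total` is given up front, the bar can show a percentage and an estimated remaining time. `update(1)` has to be called once per batch, since the bar is not wrapping an iterable here.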
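train_on_batch
I learned PyTorch first. Training a model in PyTorch is quite flexible: you compute the loss and write the training loop yourself. When I started learning TensorFlow, I found a function called model.fit(), which is very well encapsulated, but coming from PyTorch I still found PyTorch easier to use. ==This case uses train_on_batch==, an API that, like PyTorch, offers more flexibility. I prefer this approach, although I still like PyTorch more 🤠🤠🤠🤠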
The train_on_batch method has several important parameters, explained in detail below.
`train_on_batch` method signature
train_on_batch(self, x, y=None, sample_weight=None, class_weight=None, reset_metrics=True, return_dict=False)
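- x: input data.
  - Type: a NumPy array, list, dict, or any other type, depending on the model's input layer.
  - Description: the model's input data, usually one batch.
- y: target data (labels).
  - Type: NumPy array, list, dict, etc.
  - Description: the targets the model should predict. It can be omitted if the model is unsupervised.
- sample_weight: per-sample weights.
  - Type: NumPy array.
  - Description: an array of weights, one for each sample in x, used to weight samples differently when computing the loss. Its length should match the first dimension of y.
- class_weight: per-class weights.
  - Type: dict.
  - Description: a dict whose keys are class indices (integers) and whose values are the corresponding weights, used to weight classes differently when computing the loss. This is especially useful for imbalanced datasets.
- reset_metrics: whether to reset the model's metrics after each call.
  - Type: bool.
  - Description: if True, the model's metrics (such as accuracy) are reset after every call to train_on_batch. If False, the metrics accumulate across calls.
- return_dict: whether to return the results as a dict.
  - Type: bool.
  - Description: if True, a dict containing the loss and all metrics is returned. If False, a list is returned whose first element is the loss and whose remaining elements are the metric values.
Key point: 👀👀👀👀👀👀 in practice, x and y are the ones to focus on.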
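A minimal sketch of calling the method on a single batch, assuming a toy two-class model and random data (neither is taken from this case). It also calls `test_on_batch`, the evaluation-only counterpart in Keras, to show that only `train_on_batch` performs a weight update.

```python
import numpy as np
import tensorflow as tf

# Toy data: 8 samples, 4 features, 2 classes (all made up for illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4)).astype("float32")
y = rng.integers(0, 2, size=(8,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# One gradient step on this batch; return_dict=True gives named results.
train_result = model.train_on_batch(x, y, return_dict=True)

# test_on_batch only evaluates, so the weights should be identical afterwards.
weights_before = [w.copy() for w in model.get_weights()]
eval_result = model.test_on_batch(x, y, return_dict=True)
weights_after = model.get_weights()
weights_unchanged = all(np.array_equal(a, b)
                        for a, b in zip(weights_before, weights_after))
```

With the default `return_dict=False`, the same calls return a list, which is why the training loop later in this post indexes the result with `[0]` for the loss and `[1]` for the accuracy.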
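Notes on prediction
Taking this case as an example:
for images, labels in val_ds.take(1):
    for i in range(10):
        plt.subplot(1, 10, i + 1)
        plt.imshow(images[i].numpy())
        # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
        img_array = tf.expand_dims(images[i], 0)
        # Predict
        predictions = model.predict(img_array)
        plt.title(image_classnames[np.argmax(predictions)])
        plt.axis("off")
Note: a dimension has to be added because the model input expects a leading batch dimension by default, so a single image must be turned into a batch of one.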
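The shape change can be checked without a model. The sketch below uses NumPy's `np.expand_dims`, which adds an axis the same way `tf.expand_dims` does. The all-zero image and the `predictions` array are made-up stand-ins for a real image and a real model output.

```python
import numpy as np

image = np.zeros((224, 224, 3), dtype="float32")  # a single H x W x C image
batch = np.expand_dims(image, 0)                   # add a leading batch axis

# A made-up softmax output for a batch of one image over two classes.
predictions = np.array([[0.1, 0.9]])
class_names = ['cat', 'dog']
predicted = class_names[int(np.argmax(predictions))]

print(image.shape, "->", batch.shape)
print("predicted class:", predicted)
```

`np.argmax` returns the index of the largest probability, and that index is then looked up in the list of class names.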
2. Cat and dog recognition implementation
1. Data processing
1. Importing libraries
import tensorflow as tf
from tensorflow.keras import layers, models, datasets
import numpy as np

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    gpu0 = gpus[0]
    tf.config.experimental.set_memory_growth(gpu0, True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0], "GPU")  # use only the first GPU
gpus
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2. Checking the number of images and the classes
import os, pathlib

# Count the images
data_dir = "./data/"
data_dir = pathlib.Path(data_dir)
images_path = data_dir.glob('*/*')
images_path_list = [str(path) for path in images_path]
images_num = len(images_path_list)
# Sort the sub-folder names so the order matches the label indices Keras assigns alphabetically
image_classnames = sorted(os.listdir(data_dir))
print("images have number: ", images_num)
print("images have classes: ", image_classnames)
images have number: 3400
images have classes: ['cat', 'dog']
3. Loading the data
# training set : validation set = 8 : 2
batch_size = 32
image_width = 224
image_height = 224
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    './data/',
    subset='training',
    validation_split=0.2,
    seed=42,
    batch_size=batch_size,
    shuffle=True,
    image_size=(image_height, image_width)  # image_size expects (height, width)
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    './data/',
    validation_split=0.2,
    subset='validation',
    seed=42,
    batch_size=batch_size,
    shuffle=True,
    image_size=(image_height, image_width)  # image_size expects (height, width)
)
Found 3400 files belonging to 2 classes.
Using 2720 files for training.
Found 3400 files belonging to 2 classes.
Using 680 files for validation.
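The counts reported above, and the 85 and 22 steps that appear in the training progress bars later, follow from simple arithmetic. The sketch below reproduces the numbers only. It assumes that the last, partially filled batch is kept, and it does not show how Keras performs the split internally.

```python
import math

total_images = 3400
validation_split = 0.2
batch_size = 32

val_count = int(total_images * validation_split)      # images held out for validation
train_count = total_images - val_count                # images left for training
train_batches = math.ceil(train_count / batch_size)   # batches per training epoch
val_batches = math.ceil(val_count / batch_size)       # the last batch holds only 8 images

print(train_count, val_count, train_batches, val_batches)
```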
Inspecting the data format
# Show the format of one batch
batch_datas, data_labels = next(iter(train_ds))
print("[N, H, W, C]: ", batch_datas.shape)
print("data_classes: ", data_labels)
[N, H, W, C]: (32, 224, 224, 3)
data_classes: tf.Tensor([0 1 0 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 1 1 0 1 1 1 1 0 0 1 0], shape=(32,), dtype=int32)
4. Displaying sample images
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 10))
for i in range(20):
    plt.subplot(5, 10, i + 1)
    plt.imshow(batch_datas[i].numpy().astype("uint8"))
    plt.title(image_classnames[data_labels[i]])
    plt.axis('off')
plt.show()
5. Normalizing the data
Data storage format: image data + labels
# Rescale pixel values to [0, 1]
normalization_layer = layers.experimental.preprocessing.Rescaling(1.0 / 255)
# Apply the rescaling to the training and validation sets
train_ds = train_ds.map(lambda x, y : (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y : (normalization_layer(x), y))
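The rescaling itself is just a multiplication by 1/255. As a minimal sketch, the NumPy snippet below applies it to three made-up pixel values and does not use the Keras layer.

```python
import numpy as np

pixels = np.array([0, 51, 255], dtype="float32")  # example 8-bit pixel values
scaled = pixels * (1.0 / 255)                      # same arithmetic as Rescaling(1.0 / 255)

print(scaled)
```

After this step every value lies in [0, 1], which is also why the prediction code can pass the images to `plt.imshow` without casting them to `uint8` first.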
6. Setting up memory acceleration
AUTOTUNE = tf.data.experimental.AUTOTUNE
# Cache in memory, shuffle the training set, and prefetch to speed up the input pipeline
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
2. Model construction
There are only two classes, so the fully connected part of VGG16 would not need to be this large. Here the model from the previous case is simply reused.
def VGG16(class_num, input_shape):
    inputs = layers.Input(input_shape)
    # 1st block
    x = layers.Conv2D(64, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(inputs)
    x = layers.Conv2D(64, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 2nd block
    x = layers.Conv2D(128, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(128, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 3rd block
    x = layers.Conv2D(256, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(256, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(256, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 4th block
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    # 5th block
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.Conv2D(512, kernel_size=(3, 3), activation='relu', strides=(1, 1), padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    # Fully connected layers (this is the part to modify)
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dense(4096, activation='relu')(x)
    # The final layer uses a softmax activation
    out_shape = layers.Dense(class_num, activation='softmax')(x)
    # Build the model
    model = models.Model(inputs=inputs, outputs=out_shape)
    return model

model = VGG16(len(image_classnames), (image_width, image_height, 3))
model.summary()
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0
conv2d (Conv2D) (None, 224, 224, 64) 1792
conv2d_1 (Conv2D) (None, 224, 224, 64) 36928
max_pooling2d (MaxPooling2D (None, 112, 112, 64) 0
)
conv2d_2 (Conv2D) (None, 112, 112, 128) 73856
conv2d_3 (Conv2D) (None, 112, 112, 128) 147584
max_pooling2d_1 (MaxPooling (None, 56, 56, 128) 0
2D)
conv2d_4 (Conv2D) (None, 56, 56, 256) 295168
conv2d_5 (Conv2D) (None, 56, 56, 256) 590080
conv2d_6 (Conv2D) (None, 56, 56, 256) 590080
max_pooling2d_2 (MaxPooling (None, 28, 28, 256) 0
2D)
conv2d_7 (Conv2D) (None, 28, 28, 512) 1180160
conv2d_8 (Conv2D) (None, 28, 28, 512) 2359808
conv2d_9 (Conv2D) (None, 28, 28, 512) 2359808
max_pooling2d_3 (MaxPooling (None, 14, 14, 512) 0
2D)
conv2d_10 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_11 (Conv2D) (None, 14, 14, 512) 2359808
conv2d_12 (Conv2D) (None, 14, 14, 512) 2359808
max_pooling2d_4 (MaxPooling (None, 7, 7, 512) 0
2D)
flatten (Flatten) (None, 25088) 0
dense (Dense) (None, 4096) 102764544
dense_1 (Dense) (None, 4096) 16781312
dense_2 (Dense) (None, 2) 8194
=================================================================
Total params: 134,268,738
Trainable params: 134,268,738
Non-trainable params: 0
_________________________________________________________________
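The parameter counts in the summary can be reproduced by hand. A 3x3 convolution has (3 * 3 * c_in + 1) * c_out parameters and a dense layer has (n_in + 1) * n_out, where the +1 is the bias. The sketch below, with helper names chosen here for illustration, adds them up for the layers listed above.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out

# (input channels, output channels) of the 13 convolutions, block by block.
conv_channels = [(3, 64), (64, 64),
                 (64, 128), (128, 128),
                 (128, 256), (256, 256), (256, 256),
                 (256, 512), (512, 512), (512, 512),
                 (512, 512), (512, 512), (512, 512)]
# Flatten yields 7 * 7 * 512 = 25088 features before the first dense layer.
dense_sizes = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 2)]

total = (sum(conv_params(i, o) for i, o in conv_channels)
         + sum(dense_params(i, o) for i, o in dense_sizes))
print(f"{total:,}")
```

Most of the parameters sit in the first dense layer (102,764,544 of the total), which is the part that could shrink for a two-class problem.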
3. Model training
1. Hyperparameter settings
model.compile(
    optimizer = "adam",
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)
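The loss chosen here takes integer labels (0 or 1, as in the label tensor printed earlier) together with the softmax probabilities, and averages -log(p[true class]) over the batch. The sketch below is a hand-rolled version for illustration only, with made-up probabilities. It leaves out numerical safeguards such as the probability clipping a framework implementation applies.

```python
import math

def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(probability assigned to the true class) over the batch."""
    losses = [-math.log(p[label]) for label, p in zip(labels, probs)]
    return sum(losses) / len(losses)

labels = [0, 1]                    # integer labels, e.g. 0 = cat, 1 = dog
probs = [[0.9, 0.1], [0.2, 0.8]]   # made-up softmax outputs for two images
loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))
```

The "sparse" variant is used because the dataset yields integer class indices. One-hot labels would call for `categorical_crossentropy` instead.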
2. Model training
from tqdm import tqdm
import tensorflow.keras.backend as K

learn_rate = 1e-4
epochs = 10
history_train_loss = []
history_train_accuracy = []
history_val_loss = []
histoty_val_accuracy = []

for epoch in range(epochs):
    train_total = len(train_ds)
    val_total = len(val_ds)

    with tqdm(total=train_total, desc=f'Epoch {epoch + 1} / {epochs}', mininterval=1, ncols=100) as pbar:
        learn_rate = learn_rate * 0.92  # decay the learning rate once per epoch
        K.set_value(model.optimizer.lr, learn_rate)

        # Running sums of the per-batch loss and accuracy
        train_loss, train_accuracy = 0, 0
        batch_num = 0

        for image, label in train_ds:
            history = model.train_on_batch(image, label)  # core: train on one batch

            train_loss += history[0]
            train_accuracy += history[1]
            batch_num += 1

            # The progress bar shows the running sums, not the averages
            pbar.set_postfix({"loss": "%.4f"%train_loss,
                              "accuracy":"%.4f"%train_accuracy,
                              "lr": K.get_value(model.optimizer.lr)})
            pbar.update(1)

        # Record the average loss and accuracy of this epoch
        history_train_loss.append(train_loss / batch_num)
        history_train_accuracy.append(train_accuracy / batch_num)

    print("开始验证!!")  # "validation starts"

    with tqdm(total=val_total, desc=f'Epoch {epoch + 1} / {epochs}', mininterval=0.3, ncols=100) as pbar:
        val_loss, val_accuracy = 0, 0
        batch_num = 0

        for image, label in val_ds:
            # Evaluate only: test_on_batch computes the loss and metrics without
            # updating the weights, whereas train_on_batch here would train on validation data
            history = model.test_on_batch(image, label)

            val_loss += history[0]
            val_accuracy += history[1]
            batch_num += 1  # count the validation batches

            pbar.set_postfix({"loss": "%.4f"%val_loss,
                              "accuracy":"%.4f"%val_accuracy})
            pbar.update(1)

        # Record the average loss and accuracy
        history_val_loss.append(val_loss / batch_num)
        histoty_val_accuracy.append(val_accuracy / batch_num)

    print('结束验证!')  # "validation finished"
    print("平均验证loss为:%.4f"%(val_loss / batch_num))  # average validation loss
    print("平均验证准确率为:%.4f"%(val_accuracy / batch_num))  # average validation accuracy
Epoch 1 / 10: 0%| | 0/85 [00:00<?, ?it/s]2024-11-15 17:02:53.513057: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8101
2024-11-15 17:02:56.690008: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
Epoch 1 / 10: 100%|██████| 85/85 [00:21<00:00, 3.98it/s, loss=51.2915, accuracy=55.0000, lr=9.2e-5]
开始验证!!
Epoch 1 / 10: 100%|██████████████████| 22/22 [00:05<00:00, 4.39it/s, loss=5.7842, accuracy=19.8125]
结束验证!
平均验证loss为:0.2629
平均验证准确率为:0.9006
Epoch 2 / 10: 100%|█████| 85/85 [00:13<00:00, 6.32it/s, loss=13.8654, accuracy=80.2500, lr=8.46e-5]
开始验证!!
Epoch 2 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.17it/s, loss=1.9731, accuracy=21.3125]
结束验证!
平均验证loss为:0.0897
平均验证准确率为:0.9688
Epoch 3 / 10: 100%|██████| 85/85 [00:13<00:00, 6.29it/s, loss=6.9991, accuracy=82.6562, lr=7.79e-5]
开始验证!!
Epoch 3 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.54it/s, loss=1.1199, accuracy=21.5938]
结束验证!
平均验证loss为:0.0509
平均验证准确率为:0.9815
Epoch 4 / 10: 100%|██████| 85/85 [00:13<00:00, 6.35it/s, loss=4.5513, accuracy=83.4688, lr=7.16e-5]
开始验证!!
Epoch 4 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.66it/s, loss=0.7666, accuracy=21.7812]
结束验证!
平均验证loss为:0.0348
平均验证准确率为:0.9901
Epoch 5 / 10: 100%|██████| 85/85 [00:13<00:00, 6.49it/s, loss=4.4772, accuracy=83.7188, lr=6.59e-5]
开始验证!!
Epoch 5 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.61it/s, loss=0.5379, accuracy=21.8125]
结束验证!
平均验证loss为:0.0245
平均验证准确率为:0.9915
Epoch 6 / 10: 100%|██████| 85/85 [00:13<00:00, 6.39it/s, loss=1.5206, accuracy=84.4375, lr=6.06e-5]
开始验证!!
Epoch 6 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.73it/s, loss=0.2960, accuracy=21.9062]
结束验证!
平均验证loss为:0.0135
平均验证准确率为:0.9957
Epoch 7 / 10: 100%|██████| 85/85 [00:13<00:00, 6.33it/s, loss=1.9587, accuracy=84.4062, lr=5.58e-5]
开始验证!!
Epoch 7 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.31it/s, loss=0.2803, accuracy=21.9375]
结束验证!
平均验证loss为:0.0127
平均验证准确率为:0.9972
Epoch 8 / 10: 100%|██████| 85/85 [00:13<00:00, 6.43it/s, loss=0.6910, accuracy=84.7812, lr=5.13e-5]
开始验证!!
Epoch 8 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.47it/s, loss=0.1591, accuracy=21.9375]
结束验证!
平均验证loss为:0.0072
平均验证准确率为:0.9972
Epoch 9 / 10: 100%|██████| 85/85 [00:13<00:00, 6.51it/s, loss=0.6709, accuracy=84.7812, lr=4.72e-5]
开始验证!!
Epoch 9 / 10: 100%|██████████████████| 22/22 [00:03<00:00, 6.64it/s, loss=0.1658, accuracy=21.9688]
结束验证!
平均验证loss为:0.0075
平均验证准确率为:0.9986
Epoch 10 / 10: 100%|█████| 85/85 [00:13<00:00, 6.43it/s, loss=0.5763, accuracy=84.7500, lr=4.34e-5]
开始验证!!
Epoch 10 / 10: 100%|█████████████████| 22/22 [00:03<00:00, 6.44it/s, loss=0.1906, accuracy=21.8750]
结束验证!
平均验证loss为:0.0087
平均验证准确率为:0.9943
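Two things in the log above are easy to misread. The `lr` column follows the 0.92-per-epoch decay, and the `loss` and `accuracy` values in the progress bars are sums over batches, not averages. The sketch below reproduces both from numbers copied out of the log. The variable names are chosen here for illustration.

```python
initial_lr = 1e-4
decay = 0.92
epochs = 10

# Learning rate in each epoch: decayed once before the epoch's first batch.
lr_per_epoch = [initial_lr * decay ** (epoch + 1) for epoch in range(epochs)]

# Progress-bar sums from the first validation pass, copied from the log above.
val_loss_sum, val_accuracy_sum, val_batches = 5.7842, 19.8125, 22
avg_val_loss = val_loss_sum / val_batches
avg_val_accuracy = val_accuracy_sum / val_batches

print(["%.3g" % lr for lr in lr_per_epoch])
print("%.4f %.4f" % (avg_val_loss, avg_val_accuracy))
```

Dividing the sums by the 22 validation batches gives the 0.2629 and 0.9006 printed after the first epoch. In the same way, a training bar reading of accuracy=84.75 over 85 batches corresponds to an average accuracy of about 0.997.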
4. Displaying results
epochs_range = range(epochs)
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, history_train_accuracy, label='Training Accuracy')
plt.plot(epochs_range, histoty_val_accuracy, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, history_train_loss, label='Training Loss')
plt.plot(epochs_range, history_val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
5. Prediction
# Predict on a few validation images
plt.figure(figsize=(18 ,3))
for images, labels in val_ds.take(1):
    for i in range(10):
        plt.subplot(1, 10, i + 1)
        plt.imshow(images[i].numpy())
        # Add a batch dimension
        img_array = tf.expand_dims(images[i], 0)
        # Predict
        predictions = model.predict(img_array)
        plt.title(image_classnames[np.argmax(predictions)])
        plt.axis("off")
plt.show()
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 27ms/step
1/1 [==============================] - 0s 34ms/step
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 32ms/step
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 33ms/step
1/1 [==============================] - 0s 29ms/step
1/1 [==============================] - 0s 29ms/step