Deep Learning with Python, Chapter 8: Deep Learning for Computer Vision

Deep Learning with Python, Chapter 8: Deep Learning for Computer Vision - Convolutional Neural Network Examples

news2025/4/22 6:24:40

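Overview

Chapter 8 takes a close look at deep learning for computer vision, focusing on convolutional neural networks (convnets). It covers how convolution and pooling layers work, data augmentation, and feature extraction and fine-tuning with pretrained models. After this chapter, readers will be able to apply deep learning to image classification problems, especially on small datasets.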

Main content

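Convolutional neural networks (convnets)
- Convolution: learns local patterns, and the patterns it learns are translation-invariant, so a pattern learned in one part of an image can be recognized anywhere else.
- Pooling: downsamples the feature maps to reduce their size while keeping the most important features.
- Convnet structure: a stack of convolution layers and pooling layers, followed by fully connected layers.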
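
To make the two operations above concrete, here is a minimal pure-Python sketch. The helper names `conv2d_valid` and `max_pool2d` are hypothetical and written only for illustration; they are not Keras APIs, and real layers also handle channels, biases, and batches.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Like deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A 2x2 kernel that responds to a dark-to-bright step from left to right.
edge_kernel = [[-1, 1], [-1, 1]]

response = conv2d_valid(image, edge_kernel)  # 4x4 feature map
pooled = max_pool2d(response)                # 2x2 map after pooling
```

The kernel responds only where the edge is, in every row, which is the local-pattern behaviour described above. Pooling then halves each spatial dimension while keeping the strongest response in each window.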
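Data augmentation
- Data augmentation: generates more training data by applying random transformations to existing samples, which reduces overfitting.
- Data augmentation layers in Keras: for example RandomFlip, RandomRotation, and RandomZoom.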
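
As a rough illustration of what a random transformation does, the sketch below mirrors a tiny image at random. `random_horizontal_flip` is a hypothetical helper written for this note; it only imitates the idea behind Keras's RandomFlip and is not its implementation.

```python
import random


def random_horizontal_flip(image, rng, p=0.5):
    """Return a left-to-right mirrored copy of `image` with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


rng = random.Random(0)  # seeded so the run is repeatable
image = [[1, 2, 3],
         [4, 5, 6]]
flipped = [row[::-1] for row in image]

# Each epoch the model would see a freshly transformed copy of the same image.
variants = [random_horizontal_flip(image, rng) for _ in range(8)]
```

Every variant contains exactly the same pixels as the original, so the label stays valid while the model never sees the identical input every time.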
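Using pretrained models
- Feature extraction: use the convolutional base of a pretrained model to extract features, then train a new classifier on top of them.
- Fine-tuning: unfreeze the top few layers of the pretrained model and train them jointly with the newly added classifier.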
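Training a convnet on a small dataset
- Data preparation: load and preprocess the image data with Keras's image_dataset_from_directory function.
- Model building: build a model out of convolution and pooling layers.
- Dealing with overfitting: reduce overfitting with data augmentation and a Dropout layer.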
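
To show what a Dropout layer does to activations, here is a minimal pure-Python sketch of inverted dropout. The `dropout` function is a hypothetical stand-in, not the Keras layer itself. It follows the documented behaviour that kept values are scaled up by 1 / (1 - rate) during training and that nothing is dropped at inference.

```python
import random


def dropout(values, rate, rng, training=True):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(values)  # inference: pass values through unchanged
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10

dropped = dropout(activations, 0.5, rng)                    # training mode
unchanged = dropout(activations, 0.5, rng, training=False)  # inference mode
```

With rate 0.5, each surviving activation becomes 2.0, so the expected sum of the layer's output stays the same between training and inference.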

Key code and algorithms

1.1 Convnet example

from tensorflow import keras
from tensorflow.keras import layers

# Small convnet for 28x28 grayscale images with 10 output classes.
inputs = keras.Input(shape=(28, 28, 1))
# Alternate convolution and max pooling; the filter count grows as the maps shrink.
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
# Flatten the final feature map and classify with a 10-way softmax.
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
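
The shapes and parameter counts of the model above can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults used in the listing: convolutions without padding and with stride 1, and 2x2 pooling with stride 2. The helper names are hypothetical.

```python
def conv_out(size, kernel_size):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel_size + 1


def pool_out(size, pool_size):
    """Spatial size after non-overlapping pooling."""
    return size // pool_size


def conv_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


size, channels = 28, 1
total_params = 0
shapes = []
# Same layer order as the model above: (layer kind, filters or pool size).
for kind, arg in [("conv", 32), ("pool", 2), ("conv", 64), ("pool", 2), ("conv", 128)]:
    if kind == "conv":
        total_params += conv_params(3, channels, arg)
        size, channels = conv_out(size, 3), arg
    else:
        size = pool_out(size, arg)
    shapes.append((size, size, channels))

flat = size * size * channels    # length of the flattened vector
total_params += flat * 10 + 10   # final Dense(10) layer

print(shapes)        # [(26, 26, 32), (13, 13, 32), (11, 11, 64), (5, 5, 64), (3, 3, 128)]
print(flat)          # 1152
print(total_params)  # 104202
```

Calling model.summary() on the Keras model should report the same shapes and parameter total.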

1.2 Data augmentation

# Assumes the keras and layers imports from section 1.1.
# Random transformations applied to each training image.
data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.2),
    ]
)

inputs = keras.Input(shape=(180, 180, 3))
# Augmentation layers are active only during training, like Dropout.
x = data_augmentation(inputs)
# Rescale pixel values from [0, 255] to [0, 1].
x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)
# A single sigmoid unit for binary classification.
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

1.3 Feature extraction with a pretrained model

import numpy as np

# VGG16 convolutional base pretrained on ImageNet, without its classifier on top.
conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False,
    input_shape=(180, 180, 3)
)

def get_features_and_labels(dataset):
    """Run every batch through the conv base and collect features and labels."""
    all_features = []
    all_labels = []
    for images, labels in dataset:
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        features = conv_base.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels)

# train_dataset, validation_dataset and test_dataset are assumed to be batched
# (images, labels) datasets, e.g. built with image_dataset_from_directory.
train_features, train_labels = get_features_and_labels(train_dataset)
val_features, val_labels = get_features_and_labels(validation_dataset)
test_features, test_labels = get_features_and_labels(test_dataset)

# New classifier trained on the extracted features of shape (5, 5, 512).
inputs = keras.Input(shape=(5, 5, 512))
x = layers.Flatten()(inputs)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
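
The (5, 5, 512) input shape of the classifier follows from the VGG16 architecture. The sketch below assumes the standard layout, where the convolutions use "same" padding and so keep the spatial size, and each of the five blocks ends in a 2x2 max pooling that halves it (rounding down). `vgg16_feature_map_trace` is a hypothetical helper name.

```python
def vgg16_feature_map_trace(input_size, num_pools=5):
    """Spatial size of the feature map after each of the pooling stages."""
    trace = [input_size]
    for _ in range(num_pools):
        trace.append(trace[-1] // 2)
    return trace


trace_180 = vgg16_feature_map_trace(180)
flat_features = trace_180[-1] ** 2 * 512  # 512 channels in the last VGG16 block

print(trace_180)      # [180, 90, 45, 22, 11, 5]
print(flat_features)  # 12800
```

So each 180x180 image becomes a 5x5x512 feature map, and the Flatten layer hands a 12,800-element vector to the Dense(256) layer.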

1.4 Fine-tuning a pretrained model

conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False
)
# Unfreeze the base, then freeze everything except its last four layers.
conv_base.trainable = True
for layer in conv_base.layers[:-4]:
    layer.trainable = False

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.2),
    ]
)

inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)
# Apply the same input preprocessing VGG16 was trained with.
x = keras.applications.vgg16.preprocess_input(x)
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

# A very low learning rate limits how much the pretrained representations change.
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              metrics=["accuracy"])
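
To see which layers the slice `conv_base.layers[:-4]` leaves trainable, the sketch below rebuilds the VGG16 layer list in plain Python instead of loading the model. The block and layer names follow the Keras VGG16 naming scheme (block1_conv1 ... block5_pool). The name of the input layer varies between Keras versions, so "input" here is a placeholder assumption.

```python
# Number of convolution layers in each of the five VGG16 blocks.
conv_counts = [2, 2, 3, 3, 3]

layer_names = ["input"]  # placeholder name for the input layer
for block, n_convs in enumerate(conv_counts, start=1):
    layer_names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    layer_names.append(f"block{block}_pool")

# Mirror the fine-tuning code: everything trainable, then freeze all but the last 4.
trainable = {name: True for name in layer_names}
for name in layer_names[:-4]:
    trainable[name] = False

unfrozen = [name for name in layer_names if trainable[name]]
print(len(layer_names))  # 19
print(unfrozen)          # ['block5_conv1', 'block5_conv2', 'block5_conv3', 'block5_pool']
```

The pooling layer has no weights, so in practice only the three convolution layers of the last block are fine-tuned.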

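Notable quotes

Takeaway: Convnets are the best model type for computer vision tasks.
Original text: Convnets are the best type of machine learning models for computer vision tasks.
Explanation: This sentence stresses the importance of convnets in computer vision.
Takeaway: Data augmentation is a powerful tool for reducing overfitting.
Original text: Data augmentation is a powerful way to fight overfitting when you’re working with image data.
Explanation: This sentence sums up the key role data augmentation plays when working with image data.
Takeaway: Feature extraction makes it easy to reuse an existing convnet on a new dataset.
Original text: It’s easy to reuse an existing convnet on a new dataset via feature extraction.
Explanation: This sentence introduces feature extraction, which is especially useful on small datasets.
Takeaway: Fine-tuning can improve performance further.
Original text: As a complement to feature extraction, you can use fine-tuning, which adapts to a new problem some of the representations previously learned by an existing model.
Explanation: This sentence explains how fine-tuning improves the model: it adapts some of the previously learned representations to the new problem.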
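Takeaway: The amount of training data makes a huge difference.
Original text: There is a huge difference between being able to train on 20,000 samples compared to 2,000 samples!
Explanation: This sentence stresses how much dataset size matters, which is why the chapter relies on data augmentation and pretrained models when only a small dataset is available.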