政安晨: [Key Points of Machine Learning Practice with Keras] (Part 4)

Author's homepage: 政安晨

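Part of the column: Machine Learning in Practice with TensorFlow and Keras

I hope this blog is useful to you. If you spot any shortcomings, please point them out in the comments!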

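Introduction

Keras is an open-source library for building and training deep learning models. The Sequential model is the simplest model type in Keras: a linear stack of layers connected one after another, where each layer may contain many units (neurons).

To build a Sequential model, instantiate the layers and pass them as a list to the Sequential class.

The Sequential model also provides methods for configuring layers and for compiling, training, and evaluating the model. It is a simple and intuitive way to build a neural network.

To set up the experiment environment, see the introductory article in this column.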

Imports

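# Import Keras directly so that `keras`, `layers`, and `ops` all come from
# the same package (`keras.ops` is only available in Keras 3).
import keras
from keras import layers
from keras import ops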

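When to use a Sequential model

A Sequential model is appropriate for a plain stack of layers in which every layer has exactly one input tensor and one output tensor.

Take the following Sequential model as an example: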

# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = ops.ones((3, 3))
y = model(x)

It is equivalent to calling the layers one after another:

# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")

# Call layers on a test input
x = ops.ones((3, 3))
y = layer3(layer2(layer1(x)))

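A Sequential model is not appropriate when:

× The model has multiple inputs or multiple outputs

× Any layer has multiple inputs or multiple outputs

× You need layer sharing

× You want a non-linear topology (for example residual connections or a multi-branch model)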
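To make the last case concrete, here is a minimal sketch of a residual connection written with the Functional API instead. The layer sizes are arbitrary and chosen only for illustration; the point is that the Add layer receives two input tensors, which a Sequential model cannot express.

```python
import keras
from keras import layers

# Illustrative sizes only; nothing here comes from the article itself.
inputs = keras.Input(shape=(16,))
x = layers.Dense(16, activation="relu")(inputs)

# The Add layer takes two tensors: the original input and the Dense output.
# A layer with two inputs does not fit into a linear stack.
residual = layers.Add()([inputs, x])

outputs = layers.Dense(4)(residual)
model = keras.Model(inputs=inputs, outputs=outputs)

print(model.output_shape)
```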

Creating a Sequential model

You can create a Sequential model by passing a list of layers to the Sequential constructor.

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)

Its layers are accessible through the layers attribute:

model.layers

You can also build a Sequential model incrementally with the add() method:

model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))

Note that there is also a corresponding pop() method that removes the last layer: a Sequential model behaves very much like a list of layers.

model.pop()
print(len(model.layers))  # 2

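Also note that the Sequential constructor accepts a name argument, just like any layer or model in Keras.

This is useful for annotating TensorBoard graphs with semantically meaningful names.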

model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))

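Specifying the input shape in advance

In general, every layer in Keras needs to know the shape of its input in order to create its weights. So when you create a layer like this, it initially has no weights: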

layer = layers.Dense(3)
layer.weights  # Empty

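Because the shape of the weights depends on the shape of the input, the layer creates its weights the first time it is called on an input: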

# Call layer on a test input
x = ops.ones((1, 4))
y = layer(x)
layer.weights  # Now it has weights, of shape (4, 3) and (3,)

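Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it is not yet "built": it has no weights (and calling model.weights results in an error stating just that).

The weights are created the first time the model sees input data: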

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!

# At this point, you can't do this:
# model.weights

# You also can't do this:
# model.summary()

# Call the model on a test input
x = ops.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights))  # 6
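Calling the model on data is not the only way to trigger weight creation. As a complementary sketch (an alternative not used in the text above), `build()` can be given the input shape directly, with `None` standing in for the batch dimension:

```python
import keras
from keras import layers

model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)
built_before = model.built  # No input shape is known yet.

# Tell the model its input shape without passing any data:
# batches of any size, 4 features per sample.
model.build((None, 4))

# Three Dense layers, each with a kernel and a bias.
print(built_before, model.built, len(model.weights))
```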

Once a model is built, you can call its summary() method to display its contents:

model.summary()

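However, when building a Sequential model incrementally, it can be very useful to display the summary of the model so far, including the current output shape.

In that case, start the model by passing it an Input object, so that it knows its input shape from the very beginning: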

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

model.summary()

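Note that the Input object does not appear in model.layers, because it is not a layer.

A model built with a predefined input shape like this always has weights (even before seeing any data) and always has a defined output shape.

In general, if you know the input shape of a Sequential model, it is best to specify it explicitly up front.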
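The following sketch checks the claims in this section on the small model defined earlier: the Input object is absent from `model.layers`, and the weights and output shape exist before any data has been seen.

```python
import keras
from keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))

# Only the Dense layer is listed; the Input object is not a layer.
layer_types = [type(layer).__name__ for layer in model.layers]

# The Dense kernel and bias already exist, although no data was passed.
num_weights = len(model.weights)

# The output shape is defined as well; None is the batch dimension.
output_shape = model.output_shape

print(layer_types, num_weights, output_shape)
```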

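A common debugging workflow: add() + summary()

When building a new Sequential architecture, it helps to stack layers incrementally with add() and print the model summary frequently.

For example, this lets you monitor how a stack of Conv2D and MaxPooling2D layers downsamples the image feature maps: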

model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3)))  # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))

# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()

# The answer was: (40, 40, 32), so we can keep downsampling...

model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))

# And now?
model.summary()

# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())

# Finally, we add a classification layer.
model.add(layers.Dense(10))
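The shapes reported by summary() can also be derived by hand. The sketch below is plain Python with no Keras dependency; it assumes the defaults used above (padding="valid", and a pooling stride equal to the pool size), and `valid_output_size` is a helper name introduced here for illustration.

```python
def valid_output_size(size, kernel, stride=1):
    """Spatial output size of a conv/pool layer with 'valid' padding."""
    return (size - kernel) // stride + 1


size = 250
size = valid_output_size(size, kernel=5, stride=2)  # Conv2D(32, 5, strides=2)
size = valid_output_size(size, kernel=3)            # Conv2D(32, 3)
size = valid_output_size(size, kernel=3, stride=3)  # MaxPooling2D(3)
after_first_block = size

size = valid_output_size(size, kernel=3)            # Conv2D(32, 3)
size = valid_output_size(size, kernel=3)            # Conv2D(32, 3)
size = valid_output_size(size, kernel=3, stride=3)  # MaxPooling2D(3)
size = valid_output_size(size, kernel=3)            # Conv2D(32, 3)
size = valid_output_size(size, kernel=3)            # Conv2D(32, 3)
size = valid_output_size(size, kernel=2, stride=2)  # MaxPooling2D(2)
final_size = size

print(after_first_block, final_size)  # 40 4
```

The two values match the (40, 40, 32) feature maps and the final 4x4 feature maps mentioned in the comments above.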

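This workflow is genuinely handy in practice.

What to do once you have a model

Once your model architecture is ready, you will want to:

Train your model, evaluate it, and run inference. See the training and evaluation guide.
Save your model to disk and restore it. See the serialization and saving guide.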
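As a rough end-to-end sketch of both steps, the example below trains a tiny Sequential model on random data, evaluates it, runs inference, and round-trips it through a `.keras` file. The data, layer sizes, optimizer, loss, and file name are all placeholders chosen for illustration, not recommendations.

```python
import os
import tempfile

import numpy as np
import keras
from keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(8,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1),
    ]
)
model.compile(optimizer="adam", loss="mse")

# Random placeholder data: 32 samples with 8 features each.
x = np.random.rand(32, 8).astype("float32")
y = np.random.rand(32, 1).astype("float32")

history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)  # train
loss = model.evaluate(x, y, verbose=0)                        # evaluate
preds = model.predict(x, verbose=0)                           # inference

# Save to disk in the native .keras format and restore.
path = os.path.join(tempfile.mkdtemp(), "model.keras")
model.save(path)
restored = keras.models.load_model(path)
restored_preds = restored.predict(x, verbose=0)
```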

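Feature extraction with a Sequential model

Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, such as quickly creating a model that extracts the outputs of all intermediate layers of a Sequential model: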

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

Here is a similar example that extracts features from only one layer:

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = ops.ones((1, 250, 250, 3))
features = feature_extractor(x)

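Transfer learning with a Sequential model

Transfer learning consists of freezing the bottom layers of a model and training only the top layers. If you are not familiar with it, make sure to read the transfer learning guide.

Here are two common transfer learning blueprints that involve Sequential models.

First, suppose you have a Sequential model and want to freeze all layers except the last one. In that case, simply iterate over model.layers and set layer.trainable = False on every layer except the last.

Like this: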

model = keras.Sequential([
    keras.Input(shape=(784,)),  # shape must be a tuple, hence the trailing comma
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])

# Presumably you would want to first load pre-trained weights.
model.load_weights(...)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False

# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)
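The effect of freezing can be verified without any pre-trained weights. This sketch builds the same architecture with freshly initialized weights (the loading step is skipped on purpose) and counts the trainable and frozen weight tensors.

```python
import keras
from keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(784,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(10),
    ]
)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False

# Only the kernel and bias of the last Dense layer remain trainable;
# the three frozen layers contribute a kernel and a bias each.
num_trainable = len(model.trainable_weights)
num_frozen = len(model.non_trainable_weights)
print(num_trainable, num_frozen)  # 2 6
```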

Another common blueprint is to use a Sequential model to stack a pre-trained model and some freshly initialized classification layers. Like this:

# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')

# Freeze the base model
base_model.trainable = False

# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])

# Compile & train
model.compile(...)
model.fit(...)
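The same stacking pattern can be tried without downloading ImageNet weights. In the sketch below, a small untrained convolutional model named `stand_in_base` (a hypothetical placeholder, not Xception) plays the role of the pre-trained base; the input size and the 10 output classes are likewise illustrative.

```python
import keras
from keras import layers

# Hypothetical stand-in for a pre-trained convolutional base.
base_model = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
    ],
    name="stand_in_base",
)

# Freeze the base model; this applies to every layer inside it.
base_model.trainable = False

# Stack a trainable classifier on top of the frozen base.
model = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        base_model,
        layers.Dense(10),
    ]
)

# Only the new Dense layer's kernel and bias would be updated by fit().
num_trainable = len(model.trainable_weights)
print(num_trainable, model.output_shape)
```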

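If you do transfer learning, you will probably find yourself using these two patterns frequently.

That is about all you need to know about Sequential models!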