Keras Core Layers

August 28, 2024 | 11 min read

Dense

keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

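The Dense layer is a regular, densely connected neural network layer. It implements the operation output = activation(dot(input, kernel) + bias), where activation is the element-wise activation function passed as the activation argument, kernel is a weights matrix created by the layer, and bias is a bias vector created by the layer (only applicable if use_bias is True).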

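Note that if the input to the layer has a rank greater than two, the dot product with kernel is taken along the last axis of the input, so the leading axes are preserved.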
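To make the formula concrete, here is a minimal sketch in plain Python (no Keras or backend required) of output = activation(dot(input, kernel) + bias) for a 2D input. The helper names dense_forward and relu and the toy numbers are illustrative assumptions, not part of the Keras API.

```python
def dense_forward(inputs, kernel, bias=None, activation=None):
    """Compute activation(dot(inputs, kernel) + bias) for a 2D input.

    inputs: list of rows, shape (batch_size, input_dim)
    kernel: list of rows, shape (input_dim, units)
    bias:   list of length units, or None (mirrors use_bias=False)
    """
    units = len(kernel[0])
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(units):
            value = sum(x * kernel[i][j] for i, x in enumerate(row))
            if bias is not None:
                value += bias[j]
            if activation is not None:
                value = activation(value)  # applied element-wise
            out_row.append(value)
        outputs.append(out_row)
    return outputs


def relu(v):
    return max(0.0, v)


x = [[1.0, 2.0], [3.0, 4.0]]                   # shape (2, 2)
kernel = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]  # shape (2, 3)
bias = [0.5, 0.5, 0.5]

y = dense_forward(x, kernel, bias, activation=relu)
print(y)  # [[1.5, 2.5, 0.0], [3.5, 4.5, 0.0]]
```

Each sample keeps its own row, and the number of columns equals the number of units, which matches the (batch_size, units) output shape.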

Example

from keras.models import Sequential
from keras.layers import Dense

# First layer in the sequential model:
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# The model takes input arrays of shape (*, 16) and outputs arrays of shape (*, 32).
# After the first layer, you don't need to specify the size of the input:
model.add(Dense(32))

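Arguments

units: A positive integer, the dimensionality of the output space.
activation: The activation function to use, applied element-wise. It defaults to None, in which case no activation is applied (that is, the "linear" activation a(x) = x).
use_bias: A Boolean indicating whether the layer uses a bias vector.
kernel_initializer: The initializer for the kernel weights matrix.
bias_initializer: The initializer for the bias vector. The default, 'zeros', sets the bias vector to all zeros.
kernel_regularizer: The regularizer function applied to the kernel weights matrix.
bias_regularizer: The regularizer function applied to the bias vector.
activity_regularizer: The regularizer function applied to the output of the layer (its "activation").
kernel_constraint: The constraint function applied to the kernel weights matrix.
bias_constraint: The constraint function applied to the bias vector.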

Input shape

The layer accepts an nD tensor of shape (batch_size, …, input_dim). The most common case is a 2D input of shape (batch_size, input_dim).

Output shape

It outputs an nD tensor of shape (batch_size, …, units). For example, for a 2D input of shape (batch_size, input_dim), the output has shape (batch_size, units).

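Activation

This layer applies an activation function to an output.

Arguments

activation: The name of the activation function to use, or alternatively a Theano or TensorFlow operation.

Input shape

Arbitrary. When using this layer as the first layer in a model, pass the input_shape argument, a tuple of integers that does not include the samples axis.

Output shape

The output shape is the same as the input shape.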
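As a rough illustration of why the output shape equals the input shape, the sketch below applies a function to every element of a nested list. It is plain Python rather than Keras code, and apply_activation, relu, and sigmoid are hypothetical helper names chosen for this example.

```python
import math


def apply_activation(values, fn):
    """Apply fn to every element of a (possibly nested) list, keeping its shape."""
    if isinstance(values, list):
        return [apply_activation(v, fn) for v in values]
    return fn(values)


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


x = [[-1.0, 0.0, 2.0], [3.0, -4.0, 0.5]]
print(apply_activation(x, relu))     # negatives become 0.0, shape stays (2, 3)
print(apply_activation(x, sigmoid))  # every value is squashed into (0, 1)
```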

Dropout

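The Dropout layer applies dropout to its input: at each update during training, it randomly sets a fraction rate of the input units to 0, which helps prevent overfitting.

Arguments

rate: A float between 0 and 1, the fraction of the input units to drop.
noise_shape: A 1D integer tensor representing the shape of the binary dropout mask that is multiplied with the input. For example, if the input has shape (batch_size, timesteps, features) and you want the dropout mask to be the same for all timesteps, use noise_shape=(batch_size, 1, features).
seed: A Python integer to use as the random seed.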
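The effect of noise_shape is easier to see in code. The following is a simplified, plain-Python sketch of training-time dropout on a 3D input, not the Keras implementation. The function name dropout_3d is hypothetical, and the rescaling of kept units by 1 / (1 - rate) is an assumption based on the usual "inverted dropout" formulation.

```python
import random


def dropout_3d(x, rate, noise_shape=None, seed=None):
    """Training-time dropout on a nested list of shape (batch, timesteps, features).

    Each noise_shape entry must be 1 or the full axis size. An entry of 1 is
    broadcast, so the same keep/drop decision is shared along that axis.
    """
    rng = random.Random(seed)
    shape = (len(x), len(x[0]), len(x[0][0]))
    noise_shape = noise_shape or shape
    # Draw one keep (1.0) / drop (0.0) decision per mask entry.
    mask = [[[1.0 if rng.random() >= rate else 0.0
              for _ in range(noise_shape[2])]
             for _ in range(noise_shape[1])]
            for _ in range(noise_shape[0])]
    scale = 1.0 / (1.0 - rate)  # rescale kept units (assumed inverted dropout)
    return [[[x[b][t][f] * scale
              * mask[b % noise_shape[0]][t % noise_shape[1]][f % noise_shape[2]]
              for f in range(shape[2])]
             for t in range(shape[1])]
            for b in range(shape[0])]


x = [[[1.0] * 4 for _ in range(3)] for _ in range(2)]  # shape (2, 3, 4)
y = dropout_3d(x, rate=0.5, noise_shape=(2, 1, 4), seed=0)
# With noise_shape=(batch_size, 1, features), every timestep of a sample
# shares one mask, so all rows within a sample are identical.
print(y[0])
```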

Flatten

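The Flatten layer flattens the input without affecting the batch size.

Arguments

data_format: A string, either channels_last (default) or channels_first. It specifies the ordering of the dimensions in the input, and its purpose is to preserve weight ordering when switching a model from one data format to another. channels_last corresponds to inputs of shape (batch, …, channels), while channels_first corresponds to inputs of shape (batch, channels, …). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it is "channels_last".

Example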

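from keras.models import Sequential
from keras.layers import Conv2D, Flatten

# The shapes below assume image_data_format is 'channels_first'.
model = Sequential()
model.add(Conv2D(64, (3, 3),
                 input_shape=(3, 32, 32), padding='same'))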
# Now: model.output_shape == (None, 64, 32, 32)

model.add(Flatten())
# Now: model.output_shape == (None, 65536)
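The 65536 in the example is simply 64 * 32 * 32: every axis except the batch axis is multiplied together. A small plain-Python sketch of that arithmetic follows; flatten_output_shape is a hypothetical helper, not a Keras function.

```python
import math


def flatten_output_shape(input_shape):
    """Return the shape after flattening everything except the batch axis."""
    batch, *rest = input_shape
    return (batch, math.prod(rest))


print(flatten_output_shape((None, 64, 32, 32)))  # (None, 65536)
```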

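Input

Input() is used to instantiate a Keras tensor, which is a tensor object from the underlying backend (Theano, TensorFlow, or CNTK) augmented with certain attributes that allow a Keras model to be built just from its inputs and outputs.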

For example, if m, n, and o are Keras tensors, it becomes possible to write model = Model(inputs=[m, n], outputs=o).

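The added Keras attributes are _keras_shape, an integer shape tuple propagated via Keras-side shape inference, and _keras_history, the last layer applied to the tensor. The entire layer graph can be retrieved recursively from that layer.

Arguments

shape: A shape tuple of integers, not including the batch size. For example, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors.
batch_shape: A shape tuple of integers, including the batch size. For example, batch_shape=(10, 32) indicates that the expected input will be batches of ten 32-dimensional vectors, while batch_shape=(None, 32) indicates batches of an arbitrary number of 32-dimensional vectors.
name: An optional name string for the layer. It must be unique within a model; if it is not provided, it is generated automatically.
dtype: The data type expected for the input, as a string (float32, float64, int32, …).
sparse: A Boolean specifying whether the placeholder to be created is sparse.
tensor: An optional existing tensor to wrap into the Input layer. If set, the layer will not create a placeholder tensor.

Returns

A tensor.

Example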

from keras.layers import Input, Dense
from keras.models import Model

# Logistic regression in Keras
x = Input(shape=(32,))
y = Dense(16, activation='softmax')(x)
model = Model(x, y)

Reshape

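The Reshape layer reshapes an output to a certain shape.

Arguments

target_shape: The target shape, a tuple of integers that does not include the batch axis.

Input shape

Arbitrary, although all dimensions of the input shape must be fixed. When using this layer as the first layer in a model, pass the input_shape argument, a tuple of integers that does not include the batch axis.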

Output shape

(batch_size,) + target_shape

Example

from keras.models import Sequential
from keras.layers import Reshape

# First layer in a Sequential model
model = Sequential()
model.add(Reshape((3, 4), input_shape=(12,)))
# Now: model.output_shape == (None, 3, 4)
# Note: Here `None` represents the batch dimension

# An intermediate layer in a Sequential model
model.add(Reshape((6, 2)))
# Now: model.output_shape == (None, 6, 2)

# It also supports shape inference using `-1` as a dimension
model.add(Reshape((-1, 2, 2)))
# Now: model.output_shape == (None, 3, 2, 2)
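The -1 inference in the last step is plain arithmetic: the total number of elements must stay the same, so the unknown dimension is the total divided by the product of the known ones. A minimal sketch in plain Python, where infer_target_shape is a hypothetical helper and both shapes exclude the batch axis:

```python
import math


def infer_target_shape(input_shape, target_shape):
    """Resolve a single -1 in target_shape; both shapes exclude the batch axis."""
    total = math.prod(input_shape)
    if target_shape.count(-1) > 1:
        raise ValueError("at most one dimension can be -1")
    known = math.prod(d for d in target_shape if d != -1)
    if -1 in target_shape:
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the missing dimension")
        return tuple(total // known if d == -1 else d for d in target_shape)
    if known != total:
        raise ValueError("total size of the new shape must be unchanged")
    return tuple(target_shape)


print(infer_target_shape((6, 2), (-1, 2, 2)))  # (3, 2, 2)
```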

Permute

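The Permute layer permutes the dimensions of the input according to a given pattern. It is useful, for example, for connecting RNNs and convnets together.

Example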

from keras.models import Sequential
from keras.layers import Permute

model = Sequential()
model.add(Permute((2, 1), input_shape=(10, 64)))
# now: model.output_shape == (None, 64, 10)
# note: `None` is the batch dimension

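Arguments

dims: A tuple of integers, the permutation pattern, which does not include the samples dimension. Indexing starts at 1. For instance, (2, 1) permutes the first and second dimensions of the input.

Input shape

Arbitrary. When using this layer as the first layer in a model, pass the input_shape keyword argument, a tuple of integers that does not include the samples axis.

Output shape

The same as the input shape, but with the dimensions reordered according to the specified pattern.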
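Because dims is 1-indexed and skips the samples axis, it can be read directly as indices into the full shape tuple. The sketch below shows that bookkeeping in plain Python; permute_output_shape is a hypothetical helper used only for illustration.

```python
def permute_output_shape(input_shape, dims):
    """Reorder the non-batch axes of input_shape.

    input_shape includes the batch axis at position 0; dims is 1-indexed
    and must not mention the batch axis.
    """
    if sorted(dims) != list(range(1, len(input_shape))):
        raise ValueError("dims must be a permutation of 1..number of non-batch axes")
    return (input_shape[0],) + tuple(input_shape[d] for d in dims)


print(permute_output_shape((None, 10, 64), (2, 1)))  # (None, 64, 10)
```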

RepeatVector

The RepeatVector layer repeats the input n times.

Example

from keras.models import Sequential
from keras.layers import Dense, RepeatVector

model = Sequential()
model.add(Dense(32, input_dim=32))
# now: model.output_shape == (None, 32)
# note: `None` is the batch dimension

model.add(RepeatVector(3))
# now: model.output_shape == (None, 3, 32)

Arguments

n: An integer, the repetition factor.

Input shape

A 2D tensor of shape (num_samples, features).

Output shape

A 3D tensor of shape (num_samples, n, features).
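The shape change from (num_samples, features) to (num_samples, n, features) can be sketched with nested lists. This is plain Python for illustration only, and repeat_vector is a hypothetical helper name.

```python
def repeat_vector(batch, n):
    """Turn a (num_samples, features) nested list into (num_samples, n, features)."""
    return [[list(sample) for _ in range(n)] for sample in batch]


x = [[1, 2], [3, 4]]     # shape (2, 2)
y = repeat_vector(x, 3)  # shape (2, 3, 2)
print(y[0])  # [[1, 2], [1, 2], [1, 2]]
```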

Lambda

The Lambda layer wraps an arbitrary expression as a Layer object.

Example

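from keras import backend as K
from keras.layers import Dense, Lambda

# Add an x -> x^2 layer (`model` is assumed to be an existing Sequential model)
model.add(Lambda(lambda x: x ** 2))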

# Now add a layer that will return the concatenation of the positive part of the input and the opposite of the negative part
def antirectifier(x):
    x -= K.mean(x, axis=1, keepdims=True)
    x = K.l2_normalize(x, axis=1)
    pos = K.relu(x)
    neg = K.relu(-x)
    return K.concatenate([pos, neg], axis=1)

def antirectifier_output_shape(input_shape):
    shape = list(input_shape)
    assert len(shape) == 2  # only valid for 2D tensors
    shape[-1] *= 2
    return tuple(shape)

model.add(Lambda(antirectifier,
                 output_shape=antirectifier_output_shape))

# Now add a layer that will return the hadamard product and its sum from two input tensors.
def hadamard_product_sum(tensors):
    out1 = tensors[0] * tensors[1]
    out2 = K.sum(out1, axis=-1)
    return [out1, out2]

def hadamard_product_sum_output_shape(input_shapes):
    shape1 = list(input_shapes[0])
    shape2 = list(input_shapes[1])
    assert shape1 == shape2  # else hadamard product isn't possible
    return [tuple(shape1), tuple(shape2[:-1])]

# `input_1` and `input_2` are assumed to be existing Keras tensors
x1 = Dense(32)(input_1)
x2 = Dense(32)(input_2)
layer = Lambda(hadamard_product_sum, hadamard_product_sum_output_shape)
x_hadamard, x_sum = layer([x1, x2])

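Arguments

function: The function to be evaluated. It takes the input tensor or list of tensors as its first argument.
output_shape: The expected output shape of the function, which is only relevant when using Theano. It can be a tuple or a function. If it is a tuple, it specifies only the dimensions from the first non-sample dimension onward; the sample dimension is assumed to be either the same as the input, output_shape = (input_shape[0],) + output_shape, or None when the input's sample dimension is None, output_shape = (None,) + output_shape. If it is a function, it specifies the entire shape as a function of the input shape: output_shape = f(input_shape).
mask: Either None (indicating no masking) or a tensor indicating the input mask for Embedding.
arguments: An optional dictionary of keyword arguments to be passed to the function.

Input shape

Arbitrary. When using this layer as the first layer in a model, pass the input_shape argument, a tuple of integers that does not include the samples axis.

Output shape

It is either specified by the output_shape argument or inferred automatically when using TensorFlow or CNTK.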
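The two forms of output_shape (tuple versus function) follow different rules, which the sketch below spells out in plain Python. resolve_output_shape is a hypothetical helper written for this explanation; it mirrors the rules described above rather than Keras internals.

```python
def resolve_output_shape(input_shape, output_shape):
    """Return the full output shape, including the sample (batch) axis."""
    if callable(output_shape):
        # A function describes the entire shape, sample axis included.
        return tuple(output_shape(input_shape))
    # A tuple starts after the sample axis; the sample dimension is copied
    # from the input (it may be None).
    return (input_shape[0],) + tuple(output_shape)


def antirectifier_output_shape(input_shape):
    shape = list(input_shape)
    shape[-1] *= 2
    return tuple(shape)


print(resolve_output_shape((None, 32), (64,)))                     # (None, 64)
print(resolve_output_shape((16, 32), antirectifier_output_shape))  # (16, 64)
```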

ActivityRegularization

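The ActivityRegularization layer applies an update to the cost function based on the input activity.

Arguments

l1: The L1 regularization factor, a positive float.
l2: The L2 regularization factor, a positive float.

Input shape

Arbitrary. When using this layer as the first layer in a model, pass the input_shape argument, a tuple of integers that does not include the samples axis.

Output shape

The output shape is the same as the input shape.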
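As a rough sketch of what "updating the cost function" means, the snippet below computes a combined L1/L2 penalty on a list of activations. It assumes the penalty takes the common form l1 * sum(|a|) + l2 * sum(a^2); activity_penalty is a hypothetical helper, and the exact scaling used by a given Keras version may differ.

```python
def activity_penalty(activations, l1=0.0, l2=0.0):
    """Return l1 * sum(|a|) + l2 * sum(a^2) over a flat list of activations."""
    l1_term = l1 * sum(abs(a) for a in activations)
    l2_term = l2 * sum(a * a for a in activations)
    return l1_term + l2_term


acts = [1.0, -2.0, 3.0]
print(activity_penalty(acts, l1=0.5))  # 0.5 * 6.0 = 3.0
print(activity_penalty(acts, l2=0.5))  # 0.5 * 14.0 = 7.0
```

The layer passes its input through unchanged; only the extra term added to the loss depends on l1 and l2.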

Masking

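The Masking layer masks a sequence by using a mask value to skip timesteps. For each timestep of a sample, if all features are equal to mask_value, that sample timestep is masked (skipped) in all downstream layers, as long as they support masking.

If a downstream layer does not support masking yet receives such an input mask, an exception is raised.

Example

Let x be a NumPy data array of shape (samples, timesteps, features) that will be fed to an LSTM layer. Suppose you want to mask sample #0 at timestep #3 and sample #2 at timestep #5 because you lack features for these sample timesteps. You can do the following:

Set x[0, 3, :] = 0. and x[2, 5, :] = 0.
Insert a Masking layer with mask_value=0. before the LSTM layer: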

from keras.models import Sequential
from keras.layers import LSTM, Masking

model = Sequential()
model.add(Masking(mask_value=0., input_shape=(timesteps, features)))
model.add(LSTM(32))

Arguments

mask_value: Either None or the mask value to skip.
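The rule "a timestep is skipped only if all of its features equal mask_value" can be checked with a small plain-Python sketch. compute_mask here is a hypothetical standalone function written for illustration, and the data is a toy nested list rather than a NumPy array.

```python
def compute_mask(x, mask_value=0.0):
    """Return a (samples, timesteps) mask: True = keep, False = skip."""
    return [[any(v != mask_value for v in step) for step in sample]
            for sample in x]


samples, timesteps, features = 3, 6, 2
x = [[[1.0] * features for _ in range(timesteps)] for _ in range(samples)]
x[0][3] = [0.0] * features  # sample #0, timestep #3 is missing
x[2][5] = [0.0] * features  # sample #2, timestep #5 is missing
x[1][1] = [0.0, 1.0]        # only one feature is 0, so this step is kept

mask = compute_mask(x)
print(mask[0])  # [True, True, True, False, True, True]
```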

SpatialDropout1D

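This is the spatial 1D version of Dropout. It performs the same function as Dropout, but it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is the case in convolution layers), regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D helps promote independence between feature maps and should be used instead.

Arguments

rate: A float between 0 and 1, the fraction of the input units to drop.

Input shape

A 3D tensor of shape (samples, timesteps, channels).

Output shape

The output shape is the same as the input shape.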
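Dropping "entire 1D feature maps" means that one keep/drop decision is made per channel and shared by every timestep. Below is a simplified plain-Python sketch of that idea, not the Keras implementation. spatial_dropout_1d is a hypothetical name, and the 1 / (1 - rate) rescaling is an assumption based on the usual inverted-dropout formulation.

```python
import random


def spatial_dropout_1d(x, rate, seed=None):
    """Drop whole channels of a (samples, timesteps, channels) nested list."""
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - rate)
    out = []
    for sample in x:
        channels = len(sample[0])
        # One keep/drop decision per channel, shared by every timestep.
        keep = [rng.random() >= rate for _ in range(channels)]
        out.append([[v * scale if k else 0.0 for v, k in zip(step, keep)]
                    for step in sample])
    return out


x = [[[1.0, 2.0, 3.0, 4.0] for _ in range(5)] for _ in range(2)]  # (2, 5, 4)
y = spatial_dropout_1d(x, rate=0.5, seed=1)
# Within a sample, a channel is either zero at every timestep or kept at every timestep.
print(y[0][0], y[0][4])
```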

SpatialDropout2D

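This is the spatial 2D version of Dropout. It performs the same function as Dropout, but it drops entire 2D feature maps instead of individual elements. If adjacent pixels within feature maps are strongly correlated (as is the case in convolution layers), regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout2D helps promote independence between feature maps and should be used instead.

Arguments

rate: A float between 0 and 1, the fraction of the input units to drop.
data_format: Either 'channels_first' or 'channels_last'. In channels_first mode, the channels dimension (the depth) is at index 1; in channels_last mode, it is at index 3. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it is "channels_last".

Input shape

A 4D tensor of shape (samples, channels, rows, cols) if data_format='channels_first', or of shape (samples, rows, cols, channels) if data_format='channels_last'.

Output shape

The output shape is the same as the input shape.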

SpatialDropout3D

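This is the spatial 3D version of Dropout. It performs the same function as Dropout, but it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is the case in convolution layers), regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout3D helps promote independence between feature maps and should be used instead.

Arguments

rate: A float between 0 and 1, the fraction of the input units to drop.
data_format: Either 'channels_first' or 'channels_last'. In channels_first mode, the channels dimension is at index 1; in channels_last mode, it is at index 4. It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it is "channels_last".

Input shape

A 5D tensor of shape (samples, channels, dim1, dim2, dim3) if data_format='channels_first', or of shape (samples, dim1, dim2, dim3, channels) if data_format='channels_last'.

Output shape

The output shape is the same as the input shape.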
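For the 2D and 3D variants, the role of data_format is to say which axis holds the channels. One way to picture it is as a dropout mask whose shape keeps the samples and channels axes and is 1 everywhere else. The sketch below computes such a mask shape in plain Python. spatial_noise_shape is a hypothetical helper, and treating the layer as ordinary dropout with this noise shape is an assumption made for illustration.

```python
def spatial_noise_shape(input_shape, data_format="channels_last"):
    """Mask shape that keeps the samples and channels axes and broadcasts the rest."""
    if data_format == "channels_first":
        channel_axis = 1
    elif data_format == "channels_last":
        channel_axis = len(input_shape) - 1
    else:
        raise ValueError("data_format must be 'channels_first' or 'channels_last'")
    return tuple(dim if axis in (0, channel_axis) else 1
                 for axis, dim in enumerate(input_shape))


print(spatial_noise_shape((8, 32, 32, 16)))                     # (8, 1, 1, 16)
print(spatial_noise_shape((8, 16, 4, 4, 4), "channels_first"))  # (8, 16, 1, 1, 1)
```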
