Transposed Convolution in Machine Learning

June 19, 2025 | 4 min read

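A transposed convolution is also called an upsampling convolution or, loosely, a deconvolution, although it does not compute the mathematical inverse of a convolution. It is mainly used in computer vision for image generation, segmentation, and super-resolution. Where a standard strided convolution shrinks the spatial dimensions of its input and so downsamples it, a transposed convolution increases the spatial resolution so that finer detail can be reconstructed. This makes it a core building block of neural network architectures such as autoencoders, generative adversarial networks (GANs), and U-Net models.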

How Transposed Convolution Works in Machine Learning

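A standard convolution slides a filter (also called a kernel) over overlapping regions of the input and computes a weighted sum at each position, which typically yields a smaller feature map. A transposed convolution goes the other way: it turns a small input into a larger output. One way to picture this is that "gaps" (zeros) are inserted between the input elements. The gaps give the filter a wider area to slide over, which produces a larger, upsampled output.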
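The "gaps" picture can be checked against the more direct description, in which every input element adds a scaled copy of the kernel to the output. The following is a minimal pure-Python sketch in one dimension, assuming no padding; the function names and the sample values are made up for illustration and are not part of any library.

```python
def transposed_conv1d(x, w, stride):
    """Scatter view: each input element adds a scaled copy of the kernel."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def zero_insert_then_conv(x, w, stride):
    """Gap view: insert zeros, pad, then slide the flipped kernel."""
    # Insert (stride - 1) zeros between neighbouring input elements.
    dilated = []
    for i, xi in enumerate(x):
        dilated.append(xi)
        if i < len(x) - 1:
            dilated.extend([0.0] * (stride - 1))
    # Pad both ends with len(w) - 1 zeros so the kernel can overhang the edges.
    pad = [0.0] * (len(w) - 1)
    padded = pad + dilated + pad
    flipped = w[::-1]
    n_out = len(padded) - len(w) + 1
    return [
        sum(padded[j + k] * flipped[k] for k in range(len(w)))
        for j in range(n_out)
    ]


x = [1.0, 2.0, 3.0]          # illustrative input
w = [1.0, 10.0, 100.0]       # illustrative kernel
print(transposed_conv1d(x, w, stride=2))
print(zero_insert_then_conv(x, w, stride=2))
```

Both calls return the same seven values, so the two descriptions are the same operation seen from different sides.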

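Technically, a convolution can be written as multiplication by a matrix, and a transposed convolution multiplies by the transpose of that matrix, which is where the name comes from. It is not the inverse operation: it does not "undo" the original convolution or recover the original values. It only restores the spatial shape, spreading each input value over a larger region to form an expanded representation.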
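To make the matrix view concrete, here is a small pure-Python sketch in one dimension, assuming stride 1 and no padding. The helper names (`conv_matrix`, `matvec`, `transpose`) and the sample values are illustrative only.

```python
def conv_matrix(w, n_in):
    """Matrix C such that C @ x is the stride-1, unpadded convolution of x with w."""
    n_out = n_in - len(w) + 1
    return [
        [w[c - r] if 0 <= c - r < len(w) else 0.0 for c in range(n_in)]
        for r in range(n_out)
    ]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


w = [1.0, 2.0, 3.0]
C = conv_matrix(w, n_in=5)        # 3 x 5 matrix: maps 5 values to 3
x = [1.0, 0.0, 2.0, 0.0, 1.0]

y = matvec(C, x)                  # ordinary convolution: length 3
up = matvec(transpose(C), y)      # transposed convolution: length 5 again

print(y)    # [7.0, 4.0, 5.0]
print(up)   # [7.0, 18.0, 34.0, 22.0, 15.0]
```

The result `up` has the same length as `x` but different values, which illustrates that the transpose restores the shape without inverting the convolution.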

Now let's apply a transposed convolution in code to generate a larger output from a small input.

 
# Import the necessary libraries
import torch
from torch import nn

# Input
Input = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
# Kernel
Kernel = torch.tensor([[4.0, 1.0], [2.0, 3.0]])

# Reshape both to 4 dimensions: (batch, channels, height, width) for the
# input and (in_channels, out_channels, height, width) for the kernel
Input = Input.reshape(1, 1, 2, 2)
Kernel = Kernel.reshape(1, 1, 2, 2)

# Transposed convolution layer
Transpose = nn.ConvTranspose2d(in_channels=1,
                               out_channels=1,
                               kernel_size=2,
                               stride=2,
                               padding=0,
                               bias=False)

# Kernel initialisation
Transpose.weight.data = Kernel
# Output value
print(Transpose(Input))

Output


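Here the transposed convolution, with the given kernel and a stride of 2, expands the 2x2 input into a 4x4 output. Because the stride equals the kernel size, each input value is multiplied by the kernel and written to its own non-overlapping 2x2 block of the output. This shows how a transposed convolution can offset the shrinking effect of a regular convolution, which makes it useful for upsampling in generative models and image segmentation.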
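The 4x4 result can also be worked out by hand. The sketch below is a plain-Python version of the same computation for a single channel with no padding; `transposed_conv2d` is a name chosen here for illustration, not a PyTorch function.

```python
def transposed_conv2d(x, w, stride):
    """Single-channel, unpadded transposed convolution on nested lists."""
    h_in, w_in = len(x), len(x[0])
    kh, kw = len(w), len(w[0])
    h_out = (h_in - 1) * stride + kh
    w_out = (w_in - 1) * stride + kw
    out = [[0.0] * w_out for _ in range(h_out)]
    for i in range(h_in):
        for j in range(w_in):
            # Add x[i][j] * kernel at the block starting at (i*stride, j*stride).
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    return out


inp = [[0.0, 1.0], [2.0, 3.0]]
kernel = [[4.0, 1.0], [2.0, 3.0]]
result = transposed_conv2d(inp, kernel, stride=2)
for row in result:
    print(row)
# [0.0, 0.0, 4.0, 1.0]
# [0.0, 0.0, 2.0, 3.0]
# [8.0, 2.0, 12.0, 3.0]
# [4.0, 6.0, 6.0, 9.0]
```

With a stride smaller than the kernel size the blocks would overlap, and the overlapping contributions would be summed.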

 
import torch
import torch.nn as nn

# Define a random input image of shape (batch, channels, height, width)
image_input = torch.randn(1, 1, 4, 4)
print('Input Shape:', image_input.shape)

# Define the kernel size
kernel_size = (3, 3)

# Define the stride
stride = (2, 2)

# Define the padding
padding = (1, 1)

# Define the transposed convolution layer
conv_transposed = nn.ConvTranspose2d(in_channels=1,
                                     out_channels=1,
                                     kernel_size=kernel_size,
                                     stride=stride,
                                     padding=padding)

# Perform the transposed convolution
output = conv_transposed(image_input)

# Display the output and its shape
print("output \n", output)
print("\n output Shape", output.shape)

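Output

This code applies a transposed convolution to a 4x4 input and produces a 7x7 output. The printed values vary between runs because both the input and the layer's weights are randomly initialised. This kind of spatial upsampling is what many tasks, including image generation and segmentation, require.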
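The output size follows the formula given in the PyTorch documentation for `ConvTranspose2d`, applied to each spatial dimension separately. The helper below is a small sketch of that formula; the function name is made up for illustration.

```python
def conv_transpose_out_size(n_in, kernel_size, stride=1, padding=0,
                            output_padding=0, dilation=1):
    """Output length of a transposed convolution along one dimension."""
    return ((n_in - 1) * stride
            - 2 * padding
            + dilation * (kernel_size - 1)
            + output_padding
            + 1)


# 4x4 input, 3x3 kernel, stride 2, padding 1
print(conv_transpose_out_size(4, kernel_size=3, stride=2, padding=1))  # 7

# 2x2 input, 2x2 kernel, stride 2, no padding
print(conv_transpose_out_size(2, kernel_size=2, stride=2))             # 4
```

Setting `output_padding=1` in the first case would give 8, which is how an exact doubling of a 4x4 input is usually obtained with a 3x3 kernel.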

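Next, we use PyTorch to apply a transposed convolution to an image. The image is read with PIL and converted to a tensor, and `unsqueeze(0)` adds the batch dimension that most neural network layers expect, giving a 4D tensor. A custom 4D kernel of fixed values is then defined to act as the filter. A `ConvTranspose2d` layer is created with 3 input channels (matching an RGB image), a 2x2 kernel, a stride of 2 to upsample the image, and padding of 1. The kernel weights are then set by hand to the predefined values. Because the kernel has shape (3, 3, 2, 2), laid out as (in_channels, out_channels, height, width), the output has 3 channels. Passing the input image through the layer produces an upsampled image whose channels are mixed by the kernel, which shows how the operation expands the spatial dimensions. Finally, the output is converted back to a PIL image for visualisation.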

 
# Import the necessary modules
from PIL import Image
import torch
from torch import nn
from torchvision import transforms

# Read the input image and make sure it has 3 (RGB) channels
img = Image.open('/kaggle/input/ganesh/Ganesh.jpg').convert('RGB')

# Convert the input image to a torch tensor of shape (3, H, W)
img = transforms.ToTensor()(img)
print("Input image size:", img.size())

# Unsqueeze the image to make it a 4D tensor of shape (1, 3, H, W)
img = img.unsqueeze(0)
print('Unsqueezed image size:', img.shape)

# Kernel of shape (in_channels=3, out_channels=3, 2, 2)
Kernel = torch.tensor([
    [[[1.0,  0.1],[ 0.1, 0.2]],[[ 0.1, 0.2],[ 0.2,  0.3]],[[ 0, 0.1],[0.2, 0.3]]],
    [[[1.0,  0.1],[ 0.1, 0.2]],[[ 0.1, 0.2],[ 0.2,  0.3]],[[ 0, 0.1],[0.2, 0.3]]],
    [[[1.0,  0.1],[ 0.1, 0.2]],[[ 0.1, 0.2],[ 0.2,  0.3]],[[ 0, 0.1],[0.2, 0.3]]],
])

# Kernel shape
print('Kernel size:', Kernel.shape)


# Transposed convolution layer; out_channels is 3 to match the kernel shape
Transpose = nn.ConvTranspose2d(in_channels=3,
                               out_channels=3,
                               kernel_size=2,
                               stride=2,
                               padding=1,
                               bias=False)

# Initialize the kernel
Transpose.weight.data = Kernel

# Output value
img_second = Transpose(img)

# Squeeze the image to make it 3D
img_second = img_second.squeeze(0)
print("Output image size:", img_second.size())

# Detach from the autograd graph and clamp to [0, 1], the range ToPILImage
# expects for float tensors, then convert to a PIL image
img_second = transforms.ToPILImage()(img_second.detach().clamp(0, 1))

# Display the image after the transposed convolution (in a notebook)
img_second

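Output

The script prints the tensor sizes and displays the upsampled image. For an input of height H and width W, the output is (2H − 2) x (2W − 2), because (H − 1) × 2 − 2 × 1 + (2 − 1) + 1 = 2H − 2.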
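The weight layout that the image example depends on can be checked directly. The sketch below uses a random 5x5 tensor as a stand-in for a real image and, purely for illustration, 2 output channels, so that the two channel dimensions of the weight are easy to tell apart.

```python
import torch
from torch import nn

# 3 input channels, 2 output channels (illustrative values)
layer = nn.ConvTranspose2d(in_channels=3,
                           out_channels=2,
                           kernel_size=2,
                           stride=2,
                           padding=1,
                           bias=False)

# ConvTranspose2d stores its weight as (in_channels, out_channels, kH, kW),
# the reverse of Conv2d's (out_channels, in_channels, kH, kW).
print(layer.weight.shape)   # torch.Size([3, 2, 2, 2])

# Stand-in for a 5x5 RGB image with a batch dimension
rgb = torch.rand(1, 3, 5, 5)
out = layer(rgb)

# (5 - 1) * 2 - 2 * 1 + (2 - 1) + 1 = 8 along each spatial dimension
print(out.shape)            # torch.Size([1, 2, 8, 8])
```

A custom kernel assigned to such a layer therefore needs the input channel count in its first dimension and the output channel count in its second.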
