Style Transfer in TensorFlow

17 Mar 2025 | 6 min read

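Neural style transfer (NST) refers to a class of software algorithms that manipulate digital images or videos so that they adopt the appearance or visual style of another image. When we implement the algorithm, we define two distances: one for content (Dc) and one for style (Ds).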

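In this topic, we implement an artificial system based on a deep neural network that creates images of high perceptual quality. The system uses neural representations to separate and recombine the content and style of images: it takes a content image and a style image as input and returns the content image as if it had been painted in the artistic style of the style image.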

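Neural style transfer is an optimization technique that takes two images, a content image and a style reference image, and blends them. The output image looks like the content image rendered in the style of the reference image: it is optimized to match the content statistics of the content image and the style statistics of the style reference image. These statistics are extracted from the images using a convolutional network.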

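How the neural style transfer algorithm works

When we implement the algorithm, we define two distances: one for style (Ds) and one for content (Dc). Dc measures how different the content is between two images, and Ds measures how different the style is between two images. We take a third image as input and transform it so that both its content distance to the content image and its style distance to the style image are minimized.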
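To make the optimization concrete, here is a toy sketch in NumPy. It assumes plain pixel-space squared distances as stand-ins for Dc and Ds (the real algorithm measures both on convolutional features), and the names `toy_style_transfer`, `alpha`, `beta`, and `lr` are illustrative rather than part of the tutorial.

```python
import numpy as np


def toy_style_transfer(content, style, alpha=1.0, beta=1.0, lr=0.1, steps=200, seed=0):
    """Gradient descent on alpha * Dc + beta * Ds with toy pixel-space distances."""
    rng = np.random.default_rng(seed)
    # The third image starts as noise and is the only thing being updated.
    generated = rng.normal(size=content.shape)
    for _ in range(steps):
        # Dc = 0.5 * ||g - c||^2 and Ds = 0.5 * ||g - s||^2, so the gradient is linear in g.
        grad = alpha * (generated - content) + beta * (generated - style)
        generated -= lr * grad
    return generated


content = np.zeros((4, 4))
style = np.ones((4, 4))
result = toy_style_transfer(content, style)
```

With equal weights this toy objective settles halfway between the two images; raising `alpha` pulls the result toward the content image, raising `beta` toward the style image.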

Required libraries

import os
from collections import OrderedDict

# The listings in this tutorial use the TensorFlow 1.x graph API (tf.placeholder, tf.get_variable)
import tensorflow as tf
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

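The VGG-19 model

The VGG-19 model is similar to the VGG-16 model. The VGG models were introduced by Simonyan and Zisserman. VGG-19 was trained on more than a million images from the ImageNet database. It is a deep neural network with 19 weight layers and can classify images into 1000 object categories.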
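The layer counts behind the two names can be checked with a few lines of Python. This is only a bookkeeping sketch: the dictionary and the helper name `weight_layer_count` are made up for illustration, while the per-block convolution counts are those of the published VGG configurations.

```python
# Number of 3x3 convolutional layers in each of the five blocks.
# Both networks end with three fully connected layers.
VGG_CONV_BLOCKS = {
    "VGG-16": [2, 2, 3, 3, 3],
    "VGG-19": [2, 2, 4, 4, 4],
}
N_FC_LAYERS = 3


def weight_layer_count(name):
    """Count the layers that carry weights (convolutional + fully connected)."""
    return sum(VGG_CONV_BLOCKS[name]) + N_FC_LAYERS


for name in VGG_CONV_BLOCKS:
    print(name, weight_layer_count(name))
```

Style transfer only uses the convolutional and pooling part of the network, so the three fully connected layers are never loaded.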

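High-level architecture

Neural style transfer uses a pre-trained convolutional neural network. A loss function is then defined that blends the two images seamlessly to create visually appealing art. NST defines the following inputs: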

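Content image (c) - the image we want to transfer a style to
Style image (s) - the image we want to transfer the style from
Input image (g) - the generated image that will contain the final result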

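The architecture of the model, and how the losses are computed, is shown below. We do not need to understand every detail of the figure yet, because each component is covered in the next few sections. The aim here is a high-level understanding of the workflow that takes place during style transfer.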

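Download and load the pre-trained VGG-16

We will borrow the VGG-16 weights from this web page. We need to download the vgg16_weights.npz file and place it in a folder named vgg in the home directory of our project. We only need the convolutional and pooling layers. Specifically, we will load the first seven convolutional layers to use as the NST network. We can do this with the load_weights(...) function given in the notebook.

Note: Feel free to try more layers, but be aware of the memory limits of your CPU and GPU.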

# This function takes in a file path to the file containing weights
# and an integer that denotes how many layers to be loaded.
vgg_layers=load_weights(os.path.join('vgg','vgg16_weights.npz'),7)
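The body of load_weights(...) is not shown in this text. The following is a minimal sketch of what such a loader could look like, written under two assumptions that should be checked against the real file: that the .npz keys follow a 'conv1_1_W' / 'conv1_1_b' naming pattern, and that the result maps each layer name to a dict with 'weights' and 'bias' entries (the structure the later listings read from vgg_layers). The name `load_weights_sketch` and the tiny stand-in file are hypothetical.

```python
import os
import tempfile
from collections import OrderedDict

import numpy as np


def load_weights_sketch(weights_file, end_layer):
    """Load the first `end_layer` conv layers from an .npz file (assumed key layout)."""
    layers = OrderedDict()
    with np.load(weights_file) as data:
        # Assumption: conv layers are stored as '<name>_W' and '<name>_b'.
        conv_names = sorted(k[:-2] for k in data.files
                            if k.startswith("conv") and k.endswith("_W"))
        for name in conv_names[:end_layer]:
            layers[name] = {"weights": data[name + "_W"], "bias": data[name + "_b"]}
    return layers


# Tiny stand-in file so the sketch runs without the real download.
tmp_dir = tempfile.mkdtemp()
fake_path = os.path.join(tmp_dir, "fake_vgg.npz")
fake = {}
for name in ["conv1_1", "conv1_2", "conv2_1"]:
    fake[name + "_W"] = np.zeros((3, 3, 3, 4), dtype=np.float32)
    fake[name + "_b"] = np.zeros(4, dtype=np.float32)
fake["fc6_W"] = np.zeros((4, 4), dtype=np.float32)
fake["fc6_b"] = np.zeros(4, dtype=np.float32)
np.savez(fake_path, **fake)

demo_layers = load_weights_sketch(fake_path, 2)
print(list(demo_layers.keys()))
```

Fully connected entries are skipped on purpose, since only the convolutional layers are needed for style transfer.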

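Defining functions to build the style transfer network

We define several functions that will later help us fully define the computational graph of the CNN for a given input.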

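Creating TensorFlow variables

We load the numpy arrays into TensorFlow variables. We create the following variables:

Content image (tf.placeholder)
Style image (tf.placeholder)
Generated image (tf.Variable and trainable=True)
Pre-trained weights and biases (tf.Variable and trainable=False)

Make sure the generated image is trainable while the pre-trained weights and biases stay frozen. Below are two functions that define the inputs and the neural network weights.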

def define_inputs(input_shape):
    """
    This function defines the inputs (placeholders) and image to be generated (variable)
    """
    content = tf.placeholder(name='content', shape=input_shape, dtype=tf.float32)
    style = tf.placeholder(name='style', shape=input_shape, dtype=tf.float32)
    # The generated image is the only trainable variable in the graph
    generated = tf.get_variable(name='generated', initializer=tf.random_normal_initializer(),
                                shape=input_shape, dtype=tf.float32, trainable=True)
    return {'content': content, 'style': style, 'generated': generated}


def define_tf_weights():
    """
    This function defines the tensorflow variables for VGG weights and biases
    """
    for k, w_dict in vgg_layers.items():
        w, b = w_dict['weights'], w_dict['bias']
        with tf.variable_scope(k):
            # Pre-trained parameters stay frozen (trainable=False)
            tf.get_variable(name='weights', initializer=tf.constant(w, dtype=tf.float32), trainable=False)
            tf.get_variable(name='bias', initializer=tf.constant(b, dtype=tf.float32), trainable=False)

Computing the VGG net output

# Computing the VGG net output
def build_vggnet(inp, layer_ids, pool_inds, on_cpu=False):
    """This function computes the output of full VGG net"""
    outputs = OrderedDict()

    out = inp

    for lid in layer_ids:
        # Reuse the frozen weights created by define_tf_weights()
        with tf.variable_scope(lid, reuse=tf.AUTO_REUSE):
            print('Computing outputs for the layer {}'.format(lid))
            w, b = tf.get_variable('weights'), tf.get_variable('bias')
            out = tf.nn.conv2d(filter=w, input=out, strides=[1, 1, 1, 1], padding='SAME')
            out = tf.nn.relu(tf.nn.bias_add(value=out, bias=b))
            outputs[lid] = out

        # Average pooling follows the layers listed in pool_inds
        if lid in pool_inds:
            with tf.name_scope(lid.replace('conv', 'pool')):
                out = tf.nn.avg_pool(out, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
                outputs[lid.replace('conv', 'pool')] = out

    return outputs
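To see what the ReLU and pooling steps of this loop do to a feature map, here is a small NumPy sketch. It leaves out the convolution itself and assumes an NHWC array with even height and width, in which case 2x2 average pooling with stride 2 gives the same result as the 'SAME'-padded pooling in the listing. The helper names `relu` and `avg_pool_2x2` are illustrative.

```python
import numpy as np


def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0.0)


def avg_pool_2x2(x):
    """2x2 average pooling with stride 2 on an NHWC array with even H and W."""
    n, h, w, c = x.shape
    # Split H and W into (blocks, 2) and average inside each 2x2 block.
    return x.reshape(n, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))


feature_map = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1) - 8.0
activated = relu(feature_map)
pooled = avg_pool_2x2(activated)
print(activated.shape, "->", pooled.shape)
```

Each pooling step halves the spatial resolution, which is why deeper layers describe larger regions of the image with fewer positions.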

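Loss functions

In this section, we define two loss functions: the style loss function and the content loss function. The content loss function ensures that the high-layer activations of the generated image and the content image are similar.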

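Content cost function

The content cost function ensures that the content present in the content image is captured in the generated image. CNNs have been found to capture information about content in the higher layers, while the lower layers focus more on individual pixel values.

Let A^l_{ij}(I) be the activation of the l-th layer, i-th feature map, and j-th position obtained using image I. Then the content loss is defined as

L_content = (1/2) \sum_{i,j} (A^l_{ij}(g) - A^l_{ij}(c))^2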

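Intuition behind the content loss

If we visualize what a neural network has learned, there is evidence that different feature maps in the higher layers are activated in the presence of different objects. So if two images have the same content, they have similar activations in the higher layers.

We define the content cost as follows.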

def define_content_loss(inputs, layer_ids, pool_inds, c_weight):
    c_outputs = build_vggnet(inputs["content"], layer_ids, pool_inds)
    g_outputs = build_vggnet(inputs["generated"], layer_ids, pool_inds)
    # Only the last (highest) layer output is compared
    content_loss = c_weight * tf.reduce_mean(0.5 * (list(c_outputs.values())[-1] - list(g_outputs.values())[-1]) ** 2)
    return content_loss
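The same computation can be tried outside a TensorFlow graph. The sketch below is a NumPy stand-in, not the tutorial's function: `content_loss_np` is a hypothetical name, and the random array merely plays the role of a high-layer activation.

```python
import numpy as np


def content_loss_np(content_act, generated_act, c_weight=1.0):
    """c_weight * mean(0.5 * (content - generated)^2) over one layer's activations."""
    diff = content_act - generated_act
    return c_weight * np.mean(0.5 * diff ** 2)


rng = np.random.default_rng(0)
content_act = rng.normal(size=(1, 8, 8, 16))  # stand-in for a high-layer activation
same = content_loss_np(content_act, content_act)
shifted = content_loss_np(content_act, content_act + 2.0)
print(same, shifted)
```

Identical activations give a loss of zero, and the loss grows with the squared difference, so minimizing it pushes the generated image toward the content image's high-level activations.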

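Style loss function

Defining the style loss function requires more work. To extract style information from the VGG network, we use all the layers of the CNN. Style information is measured as the amount of correlation present between the feature maps in a layer. Mathematically, the style loss is defined as

L_style = \sum_l w^l L^l_style, where L^l_style = (1/M^l) \sum_{i,j} (G^l_{ij}(s) - G^l_{ij}(g))^2 and G^l_{ij}(I) = \sum_k A^l_{ik}(I) A^l_{jk}(I)

Here G^l(I) is the style (Gram) matrix of layer l for image I, w^l is the weight given to layer l, and M^l is a normalization constant that depends on the size of layer l.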

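Intuition behind the style loss

Although the system of equations above may look involved, the idea is simple. The main goal is to compute a style matrix for the generated image and for the style image.

The style loss is then defined as the mean squared difference between the two style matrices.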
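As a framework-free illustration of the style matrix, the NumPy sketch below computes Gram matrices and their mean squared difference. The names `gram_matrix` and `style_loss_np` are illustrative, layers are weighted uniformly as an assumption, and random arrays stand in for real VGG activations.

```python
import numpy as np


def gram_matrix(layer_out):
    """Flatten an (N, H, W, C) activation to (N*H*W, C) and return the C x C Gram matrix."""
    n_channels = layer_out.shape[-1]
    flat = layer_out.reshape(-1, n_channels)
    return flat.T @ flat


def style_loss_np(style_acts, generated_acts, s_weight=1.0):
    """Uniform average over layers of the mean squared Gram-matrix difference."""
    per_layer = [np.mean((gram_matrix(s) - gram_matrix(g)) ** 2)
                 for s, g in zip(style_acts, generated_acts)]
    return s_weight * np.mean(per_layer)


rng = np.random.default_rng(0)
act = rng.normal(size=(1, 4, 4, 3))
# Shuffling spatial positions changes the layout but not the channel correlations.
shuffled = act.reshape(-1, 3)[rng.permutation(16)].reshape(1, 4, 4, 3)
print(gram_matrix(act).shape)
print(style_loss_np([act], [shuffled]))
```

Because the Gram matrix sums over all positions, rearranging where features occur leaves it essentially unchanged. That is why it captures style (which features occur together) rather than content (where they occur).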

def define_style_matrix(layer_out):
    """
    This function computes the style matrix, which essentially computes
    how correlated the activations of a given filter are to all the other filters.
    Therefore, if there are C channels, the matrix will be of size C x C
    """
    n_channels = layer_out.get_shape().as_list()[-1]
    unwrapped_out = tf.reshape(layer_out, [-1, n_channels])
    style_matrix = tf.matmul(unwrapped_out, unwrapped_out, transpose_a=True)
    return style_matrix


def define_style_loss(inputs, layer_ids, pool_inds, s_weight, layer_weights=None):
    """
    This function computes the style loss using the style matrix computed for
    the style image and the generated image
    """
    s_outputs = build_vggnet(inputs["style"], layer_ids, pool_inds)
    g_outputs = build_vggnet(inputs["generated"], layer_ids, pool_inds)

    s_grams = [define_style_matrix(v) for v in list(s_outputs.values())]
    g_grams = [define_style_matrix(v) for v in list(g_outputs.values())]

    if layer_weights is None:
        style_loss = s_weight * \
            tf.reduce_sum([(1.0 / len(layer_ids)) * tf.reduce_mean((s - g) ** 2) for s, g in zip(s_grams, g_grams)])
    else:
        # The original listing is cut off after "s_weight * \". The weighted sum
        # below is an assumed completion that applies one weight per layer output.
        style_loss = s_weight * \
            tf.reduce_sum([w * tf.reduce_mean((s - g) ** 2) for w, s, g in zip(layer_weights, s_grams, g_grams)])

    return style_loss
