YouTube Video Summarization Using Python

January 4, 2025 | 4 min read

Introduction

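I don't know about you, but whenever I have free time, I often spend hours on YouTube watching all sorts of videos. Titles such as "7 Secrets to Success", "The 10 Most Useful Machine Learning Tools", or even "The 5 Most Beautiful Places in London" are regularly among them.

To stretch the running time and attract more viewers, creators often don't tell you what you want to know right after you open the video. Instead, they launch into an endless monologue.

Sometimes, though, as if by magic, you find a kind soul in the comments who has summarized the video and given you a list of key points, so you don't have to waste thirty minutes (or fifteen minutes at 2x speed) staring at it!

So one day I had this idea: "Since I'm good at machine learning, can't I have these videos summarized automatically?"

In this article, I'll walk through my attempt to build a small Python program that is functional but slightly flawed.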

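Downloading the Audio from YouTube

We first have to figure out how to download a YouTube video. In fact, we only need the audio, not the whole video, so we'll download only the audio stream.

We install the pytube library with pip and fetch the audio from YouTube as follows.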

 
!pip install pytube -q
from pytube import YouTube
# Specify the YouTube video URL
VIDEO_URL = 'https://www.youtube.com/watch?v=h-JVjs9AAmQ' # Example video
# Download only the audio stream as an mp4 file
yt = YouTube(VIDEO_URL)
audio_stream = yt.streams.filter(only_audio=True, file_extension='mp4').first()
audio_stream.download(filename='ytaudio.mp4')

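Explanation

This script uses the pytube library to download the audio of a given YouTube video in MP4 format. It first imports the YouTube class from pytube and specifies the video's URL. The call yt.streams.filter(only_audio=True, file_extension='mp4').first() narrows the available streams to audio-only MP4 streams and takes the first match, and download(filename='ytaudio.mp4') saves that stream as ytaudio.mp4.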
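One fragile spot in this step: first() returns None when no stream matches the filter, and the download call then fails with an AttributeError. Below is a minimal sketch of a guarded version. The helper name download_audio is hypothetical (it is not part of pytube), and it takes an already constructed YouTube object as an assumption, so the selection logic stays separate from the network call.

```python
def download_audio(yt, filename="ytaudio.mp4"):
    """Download the first audio-only MP4 stream of a pytube YouTube object.

    `download_audio` is a hypothetical helper for illustration, not a pytube API.
    """
    # Same filter as in the article: audio-only streams in an MP4 container
    stream = yt.streams.filter(only_audio=True, file_extension="mp4").first()
    if stream is None:
        # first() yields None when the filter matched nothing
        raise RuntimeError("No audio-only MP4 stream found for this video")
    # download() returns the path of the saved file
    return stream.download(filename=filename)
```

In the notebook it would be called as download_audio(YouTube(VIDEO_URL)), after importing YouTube from pytube.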

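Converting MP4 to WAV and Checking the Audio

Did the audio file download correctly? Let's verify it directly from the notebook by converting the file and checking its sample rate.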

 
# Convert the downloaded audio file from mp4 to wav format using ffmpeg
!ffmpeg -i ytaudio.mp4 -acodec pcm_s16le -ar 16000 ytaudio.wav
# Check the audio sample rate using librosa
import librosa
input_file = 'ytaudio.wav'
print(librosa.get_samplerate(input_file))   

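Explanation

This script uses ffmpeg to convert the MP4 audio file to WAV format, with the pcm_s16le audio codec (16-bit PCM) and a 16 kHz sample rate. It then uses the librosa library to read the sample rate of the converted WAV file. The conversion ensures the audio is compatible with programs that require WAV input at a specific sample rate.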
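If librosa is not at hand, the same check can be sketched with Python's standard wave module, which also reports the channel count, sample width, and duration. The function name describe_wav is an illustrative assumption, not something from the original script.

```python
import wave

def describe_wav(path):
    """Return (sample_rate_hz, channels, bytes_per_sample, duration_seconds).

    `describe_wav` is a hypothetical helper built on the standard library.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()        # samples per second
        channels = wav.getnchannels()    # 1 = mono, 2 = stereo
        width = wav.getsampwidth()       # bytes per sample
        duration = wav.getnframes() / rate
    return rate, channels, width, duration
```

For a file produced by the ffmpeg command above, describe_wav('ytaudio.wav') should report a rate of 16000 and a sample width of 2 bytes, since pcm_s16le stores 16-bit samples.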

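Audio to Text

Next, the recording has to be converted to text, ideally with a low word error rate. This helps because text can be fed straight into NLP algorithms for summarization.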
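For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the recognized text, divided by the number of reference words. The sketch below is a plain dynamic-programming version for illustration; the function name word_error_rate is an assumption, and it is not how any particular library computes the score.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

As an example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" gives one substitution and one deletion over six reference words, a rate of 2/6.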

More information about the model we'll use for speech-to-text conversion can be found on its Hugging Face model page.

 
!pip install huggingsound -q
import librosa
import soundfile as sf
import torch
from huggingsound import SpeechRecognitionModel
# Set the device to GPU if available, otherwise CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# Initialize the speech recognition model
model = SpeechRecognitionModel("jonatasgrosman/wav2vec2-large-xlsr-53-english", device=device)
# Stream over 30-second chunks rather than load the full file
# (30 frames of 16000 samples each, at 16 kHz)
stream = librosa.stream(input_file, block_length=30, frame_length=16000, hop_length=16000)
# Save each chunk as a separate wav file and remember its path.
# The stream is a generator: it has no len() and can only be iterated once.
audio_paths = []
for i, speech in enumerate(stream):
    path = f'{i}.wav'
    sf.write(path, speech, 16000)
    audio_paths.append(path)
# Transcribe each chunk
transcriptions = model.transcribe(audio_paths)
# Combine the transcriptions into a single text
full_transcript = ' '.join(item['transcription'] for item in transcriptions)

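Explanation

Using a pretrained Wav2Vec2 model, this program performs speech-to-text transcription with the huggingsound library. After selecting the compute device (GPU if available, otherwise CPU), it initializes the speech recognition model. Using librosa, the audio file is processed in 30-second segments, and each segment is saved as a separate WAV file. Each WAV file is then transcribed, and all the transcriptions are joined into a single string to form the complete transcript of the audio.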
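The chunking arithmetic can be sketched without any audio library. Assuming the 16 kHz sample rate and 30-second chunks used above, the hypothetical helper chunk_bounds lists the sample ranges that each chunk would cover; the last range is simply shorter when the recording does not divide evenly.

```python
def chunk_bounds(total_samples, sample_rate=16000, chunk_seconds=30):
    """Return (start, end) sample indices for consecutive fixed-length chunks.

    `chunk_bounds` is a hypothetical helper that only illustrates the arithmetic.
    """
    chunk_len = sample_rate * chunk_seconds  # 480000 samples per 30 s at 16 kHz
    return [(start, min(start + chunk_len, total_samples))
            for start in range(0, total_samples, chunk_len)]
```

A 65-second recording (1,040,000 samples) therefore yields two full 30-second chunks and a final 5-second chunk.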

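Text Summarization

All that remains is to summarize the text we extracted from the video.

Simply select the Summarization filter on Hugging Face to pick, from the hundreds of models available, the one that best suits your case.

For this project I'll use the google/pegasus-xsum model. Details are available on the model's page; I'll discuss the theory behind these summarization methods in a future post.

These pretrained Hugging Face models are very easy to use; just look at how I run summarization in a few lines of code.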

 
from transformers import pipeline
# Initialize the summarization model
summarizer = pipeline("summarization", model="google/pegasus-xsum")
# Summarize the text in chunks of 1000 characters.
# Stepping through the text with range() avoids passing an empty
# final chunk when the length is an exact multiple of 1000.
chunk_size = 1000
summarized_text = []
for start in range(0, len(full_transcript), chunk_size):
    chunk = full_transcript[start:start + chunk_size]
    summary_chunk = summarizer(chunk, min_length=5, max_length=20)
    summarized_text.append(summary_chunk[0]['summary_text'])
# Combine the summarized chunks
final_summary = ' '.join(summarized_text)
print(final_summary)

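Explanation

This program splits the long transcript into 1000-character chunks and summarizes them with the Google Pegasus XSum model. It condenses each chunk into a concise summary in turn, then joins those summaries into a final, condensed version of the original text.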
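Slicing by character count can cut a word in half at each chunk boundary, which may be one source of the flaws mentioned in the introduction. A possible refinement, sketched here under that assumption, is to split on whitespace so that every chunk ends on a whole word. The function name split_on_words is hypothetical.

```python
def split_on_words(text, max_chars=1000):
    """Split text into chunks of at most max_chars, breaking only between words.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk already has words
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each element of split_on_words(full_transcript) could then be passed to the summarizer in place of the fixed-width slices.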
