How to Extract YouTube Data Using Python (Web Scraping with BeautifulSoup)
March 17, 2025 | 7 min read

Web scraping is a method of extracting data from websites. It lets us collect specific data and store it in a database or spreadsheet for later analysis or retrieval.

BeautifulSoup is a third-party Python library that is widely used for scraping data from websites. If you are not familiar with BeautifulSoup, our Python web scraping tutorial covers the library in detail.

In this tutorial, we will discuss how to extract data from YouTube and draw useful insights from it. We will use the BeautifulSoup library together with an HTML parser.

YouTube is the world's largest video-sharing platform, with billions of users, and many creators earn millions on it. Collecting information about popular channels is therefore a worthwhile exercise: we can track popular channels, their subscriber counts, video views, likes and, where available, dislikes.

Note - YouTube changes its page source from time to time, so this approach may stop working in some cases. You can also use the YouTube API to extract data.

Installing Dependencies

We need to install a few packages for this task. First, install the BeautifulSoup library with pip (pip install beautifulsoup4). Then, in the command-line terminal, install the requests-html library, which provides the HTML session and parser used in this tutorial (pip install requests-html).

Now we are ready to fetch data from YouTube.

Getting Started with YouTube Data Scraping

We will get hands-on with a quick experimental script that extracts this kind of data. Create a new Python file or type the code into a shell. The example requests a video page and prints the meta tags found in it.

Example - Output

[<meta content="IE=edge" http-equiv="X-UA-Compatible"/>,
<meta content="rgba(255,255,255,0.98)" name="theme-color"/>,
<meta content="What If You Lived on Uranus?" name="title"/>,
<meta content="Freezing cold, dark and with an atmosphere that smells like farts and rotten eggs? Uranus doesn't sound like somewhere you'd want to call home. But let's sa..." name="description"/>,
<meta content="what if, what happens if, scifi, science documentary, what if scenario, mysteries, what if cosmos, cosmos, hypothetical, hypothetical scenario, hypothetical scenarios, cold, dark, atmosphere, fart, farts, rotten eggs, rotten egg, uranus, home, settlement, home settlement, solar system, planet, temperature, gas, gas giants, cold planet, methane, uranus planet, uranus facts, rings of uranus, facts about uranus, space, nasa, earth vs uranus, space photography, space exploration, universe, solar" name="keywords"/>,
<meta content="YouTube" property="og:site_name"/>,
<meta content="https://www.youtube.com/watch?v=u4N45v8f7cY" property="og:url"/>,
<meta content="What If You Lived on Uranus?" property="og:title"/>,
<meta content="https://i.ytimg.com/vi/u4N45v8f7cY/maxresdefault.jpg" property="og:image"/>,
<meta content="1280" property="og:image:width"/>,
<meta content="720" property="og:image:height"/>,
<meta content="Freezing cold, dark and with an atmosphere that smells like farts and rotten eggs? Uranus doesn't sound like somewhere you'd want to call home. But let's sa..." property="og:description"/>,
<meta content="544007664" property="al:ios:app_store_id"/>,
<meta content="YouTube" property="al:ios:app_name"/>,
<meta content="vnd.youtube://www.youtube.com/watch?v=u4N45v8f7cY&feature=youtu.be&feature=applinks" property="al:ios:url"/>,
<meta content="vnd.youtube://www.youtube.com/watch?v=u4N45v8f7cY&feature=youtu.be&feature=applinks" property="al:android:url"/>,
<meta content="http://www.youtube.com/watch?v=u4N45v8f7cY&feature=youtu.be&feature=applinks" property="al:web:url"/>,
<meta content="video.other" property="og:type"/>,
<meta content="https://www.youtube.com/embed/u4N45v8f7cY" property="og:video:url"/>,
<meta content="https://www.youtube.com/embed/u4N45v8f7cY" property="og:video:secure_url"/>,
<meta content="text/html" property="og:video:type"/>,
<meta content="1280" property="og:video:width"/>,
<meta content="720" property="og:video:height"/>,
<meta content="YouTube" property="al:android:app_name"/>,
<meta content="com.google.android.youtube" property="al:android:package"/>,
<meta content="what if" property="og:video:tag"/>,
<meta content="what happens if" property="og:video:tag"/>,
<meta content="scifi" property="og:video:tag"/>,
<meta content="science documentary" property="og:video:tag"/>,
<meta content="what if scenario" property="og:video:tag"/>,
<meta content="mysteries" property="og:video:tag"/>,
<meta content="what if cosmos" property="og:video:tag"/>,
<meta content="cosmos" property="og:video:tag"/>,
<meta content="hypothetical" property="og:video:tag"/>,
<meta content="hypothetical scenario" property="og:video:tag"/>,
<meta content="hypothetical scenarios" property="og:video:tag"/>,
<meta content="cold" property="og:video:tag"/>,
<meta content="dark" property="og:video:tag"/>,
<meta content="atmosphere" property="og:video:tag"/>,
<meta content="fart" property="og:video:tag"/>,
<meta content="farts" property="og:video:tag"/>,
<meta content="rotten eggs" property="og:video:tag"/>,
<meta content="rotten egg" property="og:video:tag"/>,
<meta content="uranus" property="og:video:tag"/>,
<meta content="home" property="og:video:tag"/>,
<meta content="settlement" property="og:video:tag"/>,
<meta content="home settlement" property="og:video:tag"/>,
<meta content="solar system" property="og:video:tag"/>,
<meta content="planet" property="og:video:tag"/>,
<meta content="temperature" property="og:video:tag"/>,
<meta content="gas" property="og:video:tag"/>,
<meta content="gas giants" property="og:video:tag"/>,
<meta content="cold planet" property="og:video:tag"/>,
<meta content="methane" property="og:video:tag"/>,
<meta content="uranus planet" property="og:video:tag"/>,
<meta content="uranus facts" property="og:video:tag"/>,
<meta content="rings of uranus" property="og:video:tag"/>,
<meta content="facts about uranus" property="og:video:tag"/>,
<meta content="space" property="og:video:tag"/>,
<meta content="nasa" property="og:video:tag"/>,
<meta content="earth vs uranus" property="og:video:tag"/>,
<meta content="space photography" property="og:video:tag"/>,
<meta content="space exploration" property="og:video:tag"/>,
<meta content="universe" property="og:video:tag"/>,
<meta content="solar" property="og:video:tag"/>,
<meta content="87741124305" property="fb:app_id"/>,
<meta content="player" name="twitter:card"/>,
<meta content="@youtube" name="twitter:site"/>,
<meta content="https://www.youtube.com/watch?v=u4N45v8f7cY" name="twitter:url"/>,
<meta content="What If You Lived on Uranus?" name="twitter:title"/>,
<meta content="Freezing cold, dark and with an atmosphere that smells like farts and rotten eggs? Uranus doesn't sound like somewhere you'd want to call home. But let's sa..." name="twitter:description"/>,
<meta content="https://i.ytimg.com/vi/u4N45v8f7cY/maxresdefault.jpg" name="twitter:image"/>,
<meta content="YouTube" name="twitter:app:name:iphone"/>,
<meta content="544007664" name="twitter:app:id:iphone"/>,
<meta content="YouTube" name="twitter:app:name:ipad"/>,
<meta content="544007664" name="twitter:app:id:ipad"/>,
<meta content="vnd.youtube://www.youtube.com/watch?v=u4N45v8f7cY&feature=youtu.be&feature=applinks" name="twitter:app:url:iphone"/>,
<meta content="vnd.youtube://www.youtube.com/watch?v=u4N45v8f7cY&feature=youtu.be&feature=applinks" name="twitter:app:url:ipad"/>,
<meta content="YouTube" name="twitter:app:name:googleplay"/>,
<meta content="com.google.android.youtube" name="twitter:app:id:googleplay"/>,
<meta content="https://www.youtube.com/watch?v=u4N45v8f7cY" name="twitter:app:url:googleplay"/>,
<meta content="https://www.youtube.com/embed/u4N45v8f7cY" name="twitter:player"/>,
<meta content="1280" name="twitter:player:width"/>,
<meta content="720" name="twitter:player:height"/>,
<meta content="What If You Lived on Uranus?" itemprop="name"/>,
<meta content="Freezing cold, dark and with an atmosphere that smells like farts and rotten eggs? Uranus doesn't sound like somewhere you'd want to call home. But let's sa..." itemprop="description"/>,
<meta content="False" itemprop="paid"/>,
<meta content="UCphTF9wHwhCt-BzIq-s4V-g" itemprop="channelId"/>,
<meta content="u4N45v8f7cY" itemprop="videoId"/>,
<meta content="PT6M50S" itemprop="duration"/>,
<meta content="False" itemprop="unlisted"/>,
<meta content="1280" itemprop="width"/>,
<meta content="720" itemprop="height"/>,
<meta content="HTML5 Flash" itemprop="playerType"/>,
<meta content="1280" itemprop="width"/>,
<meta content="720" itemprop="height"/>,
<meta content="true" itemprop="isFamilyFriendly"/>,
<meta content="AD,AE,AF,AG,AI,AL,AM,AO,AQ,AR,AS,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BJ,BL,BM,BN,BO,BQ,BR,BS,BT,BV,BW,BY,BZ,CA,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EE,EG,EH,ER,ES,ET,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,IO,IQ,IR,IS,IT,JE,JM,JO,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MF,MG,MH,MK,ML,MM,MN,MO,MP,MQ,MR,MS,MT,MU,MV,MW,MX,MY,MZ,NA,NC,NE,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,PA,PE,PF,PG,PH,PK,PL,PM,PN,PR,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,SS,ST,SV,SX,SY,SZ,TC,TD,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TR,TT,TV,TW,TZ,UA,UG,UM,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,YE,YT,ZA,ZM,ZW" itemprop="regionsAllowed"/>,
<meta content="153196" itemprop="interactionCount"/>,
<meta content="2022-02-26" itemprop="datePublished"/>,
<meta content="2022-02-26" itemprop="uploadDate"/>,
<meta content="Science & Technology" itemprop="genre"/>]

This looks like a lot of data, but it contains the useful fields we are after. Now, let's get the video title and the view count.

Output

Title of the Video: What If You Lived on Uranus?
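Views on the Video: 153544

So this is a simple example of reading a video's metadata and pulling individual fields out of it. Now we will create a Python script to fetch more information from YouTube.

Python Script to Get YouTube Information

First, we import the required libraries. Next, we create an HTTP session. Then we create a function that returns all the data as a dictionary.

In that function, we download the HTML of the web page. The render() method executes the page's JavaScript so that the rendered data is placed into the HTML.

Note - The default timeout for rendering a page is 8 seconds. If the code raises a timeout error, add the timeout parameter and set it to 60 seconds.

Now we can get the video title, description, like count, and so on.

To get the like count, we need to import the re and json modules. We use re to find a pattern in the page source. Once it matches, we use the json.loads() method to parse the serialized JSON text into Python objects.

Complete Code

The complete script for extracting the video details produces the following output.

Output

{'title': 'What If You Lived on Uranus?', 'views': '153838', 'description': 'Freezing cold, dark and with an atmosphere that smells like farts and rotten eggs? Uranus doesn't sound like somewhere you'd want to call home. But let's sa...', 'date_published': '2022-02-26', 'duration': '6:49', 'tags': 'what if, what happens if, scifi, science documentary, what if scenario, mysteries, what if cosmos, cosmos, hypothetical, hypothetical scenario, hypothetical scenarios, cold, dark, atmosphere, fart, farts, rotten eggs, rotten egg, uranus, home, settlement, home settlement, solar system, planet, temperature, gas, gas giants, cold planet, methane, uranus planet, uranus facts, rings of uranus, facts about uranus, space, nasa, earth vs uranus, space photography, space exploration, universe, solar', 'likes': '6246', 'channel': {'name': 'What If', 'url': 'https://www.youtube.com/UCphTF9wHwhCt-BzIq-s4V-g', 'subscribers': '6.08 million subscribers'}}

Conclusion

This tutorial covered extracting YouTube video information. Using the script above, we can pass any video URL to the function and get its data. We can also save the results to a spreadsheet.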
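Saving the results to a spreadsheet could look roughly like the sketch below. It is not part of the original script: the helper names flatten and write_csv are hypothetical, the sample record is a trimmed copy of the output shown in this tutorial, and only the standard csv module is assumed. Because the scraped dictionary nests the channel details, the sketch flattens nested keys (for example, channel_name) before writing one row per video.

```python
import csv
import io


def flatten(record, parent_key=""):
    """Flatten nested dictionaries into one level, joining keys with '_'.

    Hypothetical helper, introduced only for this illustration.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}_{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat


def write_csv(records, stream):
    """Write a list of (possibly nested) dictionaries to a CSV stream."""
    rows = [flatten(record) for record in records]
    # Collect column names in first-seen order so every row fits the header.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Sample record trimmed from the output shown in this tutorial.
video = {
    "title": "What If You Lived on Uranus?",
    "views": "153838",
    "date_published": "2022-02-26",
    "duration": "6:49",
    "likes": "6246",
    "channel": {"name": "What If", "subscribers": "6.08 million subscribers"},
}

# An in-memory buffer keeps the example free of file-system side effects.
buffer = io.StringIO()
write_csv([video], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead, the same function can be given a file object opened with open("videos.csv", "w", newline="", encoding="utf-8"); the file name here is only an example. Most spreadsheet applications can open the resulting CSV directly.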