Hi everyone~ This is Qianqian, who loves looking at pretty girls.
Key points:
- Capturing dynamically loaded data (packet capture)
- Sending requests with requests
Development environment:
- Python 3.8 (interpreter, runs the code)
- PyCharm 2022.3 (editor, helps with writing the code)
- requests (install with: pip install requests)
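Code approach and steps
I. Analyzing the approach
Find where the data comes from:
https://www.***.com/graphql
Work out the pagination pattern: each response carries a pcursor value that is sent back in the next request, and the value 'no_more' marks the last page.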
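The pagination pattern can be sketched on its own, without touching the network. This is a minimal sketch under stated assumptions: `fetch_page` is a hypothetical callable that takes a cursor and returns `(items, next_cursor)`, `max_pages` is an assumed safety cap, and `FAKE_PAGES` is made-up data standing in for the server. Only the empty starting cursor and the `'no_more'` end marker come from the code in this article.

```python
from typing import Any, Callable, Iterator, List, Tuple

END_MARKER = "no_more"  # end-of-list value used by the code in this article


def iter_pages(
    fetch_page: Callable[[str], Tuple[List[Any], str]],
    max_pages: int = 100,
) -> Iterator[List[Any]]:
    """Yield one list of items per page until the end marker comes back.

    fetch_page is a hypothetical callable: given a cursor, it returns
    (items, next_cursor). max_pages is an assumed cap that prevents an
    endless loop if the end marker never arrives.
    """
    cursor = ""  # an empty cursor asks for the first page
    for _ in range(max_pages):
        items, cursor = fetch_page(cursor)
        yield items
        if cursor == END_MARKER:
            break


# A fake in-memory "server": cursor -> (items, next cursor). Made-up data.
FAKE_PAGES = {
    "": (["video-1", "video-2"], "cursor-a"),
    "cursor-a": (["video-3"], "no_more"),
}

all_items = [item for page in iter_pages(FAKE_PAGES.__getitem__) for item in page]
print(all_items)
```

Keeping the loop separate from the request code makes it possible to check the stop condition with fake data before any real request is sent.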
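II. Implementation
- Send a request to the data source
- Get the response data
- Extract the data: pull out every video link and title
- Request each video link to get the video data
- Save the video
- Turn pages to fetch more videos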
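The extraction step can be tried out offline first. The sketch below assumes the response shape that the code in this article reads (`data` → `visionProfilePhotoList` → `feeds` → `photo`). The helper name `extract_videos`, the sample dictionary, and the choice to treat a missing `pcursor` as `'no_more'` are illustrative assumptions, not part of the original script.

```python
from typing import List, Tuple


def extract_videos(json_data: dict) -> Tuple[List[Tuple[str, str]], str]:
    """Return ([(caption, photoUrl), ...], next_pcursor) for one response.

    Hypothetical helper: missing keys are tolerated instead of raising
    KeyError, and a missing pcursor is treated as 'no_more' (an assumption)
    so that a paging loop would stop rather than spin.
    """
    result = (json_data.get("data") or {}).get("visionProfilePhotoList") or {}
    videos = []
    for feed in result.get("feeds") or []:
        photo = feed.get("photo") or {}
        photo_url = photo.get("photoUrl")
        if photo_url:  # skip entries without a video link
            videos.append((photo.get("caption") or "", photo_url))
    return videos, result.get("pcursor") or "no_more"


# Made-up sample in the same shape as the response used in this article.
sample = {
    "data": {
        "visionProfilePhotoList": {
            "feeds": [
                {"photo": {"caption": "demo clip", "photoUrl": "https://example.com/a.mp4"}},
                {"photo": {"caption": "no link", "photoUrl": None}},
            ],
            "pcursor": "no_more",
        }
    }
}

videos, next_cursor = extract_videos(sample)
print(videos, next_cursor)
```

Direct indexing such as `json_data['data'][...]` is shorter, but it raises `KeyError` or `TypeError` as soon as the server returns an error payload; the defensive lookups here are one possible way to handle that.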
Code
import requests  # third-party library
import os  # standard library
# Create the output directory if it does not exist yet
if not os.path.exists('video'):
    os.mkdir('video')
# Request headers (tell the server who we are)
headers = {
'Cookie': 'kpf=PC_WEB; clientid=3; did=web_279f6644708643f6590253c172444317; userId=3293066791; kuaishou.server.web_st=ChZrdWFpc2hvdS5zZXJ2ZXIud2ViLnN0EqABAUvl7mYJ4xsmuzfvkQzKKze_QzNmIN7a-XfNXqJbHNoJuiJqClkfn-LQ_9GJoW62jRigxMAT5qb0sTJNe9EGn6-4BmmitO-BbHbFQ3uEzwYDnVx039lB9kEkg-rhsXo-OiXIn2bzWcR9-WxFEMzOG5rpFged5V5QZhhD3b55S0uTsu0-nM6cRp9-KBNFhC551J_4psKOQT227BtmqSgt8BoSoJCKbxHIWXjzVWap_gGna5KjIiBJh6Op1qRW7iZ9naTe905G3rzeGh9DPQSTh_LVDnPvWygFMAE; kuaishou.server.web_ph=dc266d1649e270d70181a587bf584bcf975b; kpn=KUAISHOU_VISION',
'Referer': 'https://www.kuaishou.com/profile/3x3ie8ckzpzpzdq',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36'
}
url = 'https://www.----.com/graphql'
pcursor = ""
while True:
    # Request body (tells the server which data we want)
    json = {
"operationName":"visionProfilePhotoList",
"variables":{
"userId":"3x66mtau2i999v6",
"pcursor":pcursor,
"page":"profile"
},
"query":"fragment photoContent on PhotoEntity {\n __typename\n id\n duration\n caption\n originCaption\n likeCount\n viewCount\n commentCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n musicBlocked\n riskTagContent\n riskTagUrl\n}\n\nfragment recoPhotoFragment on recoPhotoEntity {\n __typename\n id\n duration\n caption\n originCaption\n likeCount\n viewCount\n commentCount\n realLikeCount\n coverUrl\n photoUrl\n photoH265Url\n manifest\n manifestH265\n videoResource\n coverUrls {\n url\n __typename\n }\n timestamp\n expTag\n animatedCoverUrl\n distance\n videoRatio\n liked\n stereoType\n profileUserTopPhoto\n musicBlocked\n riskTagContent\n riskTagUrl\n}\n\nfragment feedContent on Feed {\n type\n author {\n id\n name\n headerUrl\n following\n headerUrls {\n url\n __typename\n }\n __typename\n }\n photo {\n ...photoContent\n ...recoPhotoFragment\n __typename\n }\n canAddComment\n llsid\n status\n currentPcursor\n tags {\n type\n name\n __typename\n }\n __typename\n}\n\nquery visionProfilePhotoList($pcursor: String, $userId: String, $page: String, $webPageArea: String) {\n visionProfilePhotoList(pcursor: $pcursor, userId: $userId, page: $page, webPageArea: $webPageArea) {\n result\n llsid\n webPageArea\n feeds {\n ...feedContent\n __typename\n }\n hostName\n pcursor\n __typename\n }\n}\n"
}
    # 1. Send the request to the data source
    response = requests.post(url, json=json, headers=headers)
    # 2. Get the response data
    json_data = response.json()
    # 3. Extract the data: pull out every video link and title
    feeds = json_data['data']['visionProfilePhotoList']['feeds']  # a list
    pcursor = json_data['data']['visionProfilePhotoList']['pcursor']  # cursor for the next page
    for feed in feeds:  # walk through the list items one by one
        caption = feed['photo']['caption']
        photoUrl = feed['photo']['photoUrl']
        print(caption, photoUrl)
        # # 4. Request the video link to get the video data
        # video_data = requests.get(photoUrl).content
        # # 5. Save the video
        # # wb: binary mode, overwrites existing data
        # # ab: binary mode, appends to existing data
        # with open(f'video/{caption}.mp4', mode='wb') as f:
        #     f.write(video_data)
    # 6. Pagination: stop once the server reports that there are no more pages
    if pcursor == 'no_more':
        break
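One thing to watch in the commented-out save step: the caption is used directly as the file name, so a caption containing characters such as `/` or `?` can make `open` fail. Below is a minimal sketch of one way to handle that, together with a chunked writer. `safe_filename`, `save_chunks`, the character set being replaced, and the 80-character limit are all illustrative assumptions rather than part of the original script.

```python
import os
import re
from typing import Iterable


def safe_filename(caption: str, max_length: int = 80) -> str:
    """Replace characters that are commonly invalid in file names.

    The character set and max_length are assumptions; adjust them for
    the target file system.
    """
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', "_", caption).strip(" .")
    return cleaned[:max_length] or "untitled"


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write an iterable of byte chunks to path and return the byte count."""
    written = 0
    with open(path, mode="wb") as f:
        for chunk in chunks:
            if chunk:  # ignore empty chunks
                written += f.write(chunk)
    return written


print(safe_filename('dance/cover: "take 2"?'))

# With requests, the chunks could come from a streamed response, e.g.:
#   with requests.get(photoUrl, stream=True, timeout=30) as r:
#       r.raise_for_status()
#       path = os.path.join("video", safe_filename(caption) + ".mp4")
#       save_chunks(r.iter_content(chunk_size=64 * 1024), path)
```

Writing in chunks avoids holding a whole video in memory, which is what `requests.get(photoUrl).content` does; for short clips either approach works.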
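Closing words
Thanks for reading~ That's the end of this flight 🛬
I hope this article helped 🎉 and that you picked up something new.
Even the hidden stars 🍥 are working hard to shine, so keep going too (let's work hard together).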