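Table of contents
- 1. Step 1: Install the requests library
- 2. Step 2: Obtain the headers and cookies the crawler needs
- 3. Step 3: Fetch the page
- 4. Step 4: Parse the page
- 5. Step 5: Parse the JSON response body
- 6. Complete code example and results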
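The five steps of a Python crawler:
- Step 1: Install the requests library
- Step 2: Obtain the headers and cookies the crawler needs
- Step 3: Fetch the page
- Step 4: Parse the page
- Step 5: Analyze the returned JSON data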
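1. Step 1: Install the requests library
The crawler needs only one third-party library, which is imported like this:
import requests
Taking PyCharm as an example, install the library from the menu: [File] -> [Settings] -> [Project] -> [Python Interpreter]. In the package list, click the + button to search for the package and install it. If you have installed IDE plugins before, this will feel familiar.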
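If you are not sure whether the installation worked, a quick check like the one below can help. It is only a sketch: the helper name `is_installed` is made up for this post, and the suggested pip command assumes pip is available for the interpreter that runs the script.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("requests"):
    print("requests is available")
else:
    # Assumption: pip is available for this interpreter.
    print("requests is missing; try: python -m pip install requests")
```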
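2. Step 2: Obtain the headers and cookies the crawler needs
This walkthrough uses a crawler that fetches watchlist quotes from Xueqiu (雪球网) as the example. When the data sits behind a login, as it does here, the request headers and cookies are essential: they decide whether the server accepts the request and returns the expected data.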
- First, open Xueqiu (https://xueqiu.com/) in a browser, register and log in, then go to the watchlist page and add the stocks you want to follow.
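- Press F12 to open the browser's developer tools, switch to the Network panel, and click the magnifying-glass (Search) icon.
- Press Ctrl+R to refresh the page; the Network panel fills with requests. (If the requests you need are already listed, refreshing is unnecessary, though it does no harm.)
- In the search box that appears, type something you follow, such as a stock code (for example 00418) or a stock name (for example 通用电梯), and click the refresh button next to the box. The Search results list the requests that contain this text.
Double-click one of the search results; the matching request is then highlighted automatically in the request list of the Network panel.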
- Alternatively, use the Network -> Filter box to narrow the request list by a keyword that appears in the request URL, for example: 00418
- Copy the request as cURL
After filtering, far fewer entries remain. Browse the Name column to find the request you want to crawl, right-click it, and choose Copy -> Copy as cURL.
- Convert it with the tool "Convert curl commands to code": https://curlconverter.com/python/
Paste the cURL command into the converter. In the converted output, note that the `accept` header asks for JSON (`application/json`). Click [Copy to clipboard] and paste the result into PyCharm, where it can be used directly as source code:
import requests
cookies = {
'cookiesu': '241712922404752',
'device_id': '6a73424ed3aae5c44aeb59b0ddfbc91b',
'Hm_lvt_1db88642e346389874251b5a1eded6e3': '1712922406',
's': 'bo11shkdxf',
'remember': '1',
'xq_a_token': '7ef03deb28d3396dc9d555329881fd9986211657',
'xqat': '7ef03deb28d3396dc9d555329881fd9986211657',
'xq_id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMwNzEwOTczOTMsImNpZCI6ImQ5ZDBuNEFadXAifQ.it0vlCz3lYJQWuyvFGxWj3g-ACQjfUgLUdb5_XFqqgkd0U9rn1sLzHDmas6HB8po7WHut8dAFiOT8_JKJufjdIuzW_TC2wGWErmBuxNwvTR2Vm6wPzRU-HWwTHcZDuHO6nbvH0olaXqMcnRtx7b10o-V9w8FLUCqNhMw0t84d8dcSewNQtmbYSDunHRpUzu33uXyQOKdqSV8MVPmMKfhFxqSxbFienLwk-K3g8c02p9RKdclJLX77FjFt2LXUlfZbSBJ_8QVHzRgqCbXV_gcDF4akPgjCzSFNTD4xJd030N0J-SCH-WxMfqO5AQDBbo5G1Jupc2GdlHht_uvZ1dBbA',
'xq_r_token': '62e0ff828b86cfa501f1520ad6570a99838e72e5',
'xq_is_login': '1',
'u': '1878149071',
'is_overseas': '0',
'Hm_lpvt_1db88642e346389874251b5a1eded6e3': '1713076110',
}
headers = {
'authority': 'stock.xueqiu.com',
'accept': 'application/json, text/plain, */*',
'accept-language': 'zh-CN,zh;q=0.9',
# 'cookie': 'cookiesu=241712922404752; device_id=6a73424ed3aae5c44aeb59b0ddfbc91b; Hm_lvt_1db88642e346389874251b5a1eded6e3=1712922406; s=bo11shkdxf; remember=1; xq_a_token=7ef03deb28d3396dc9d555329881fd9986211657; xqat=7ef03deb28d3396dc9d555329881fd9986211657; xq_id_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJ1aWQiOjE4NzgxNDkwNzEsImlzcyI6InVjIiwiZXhwIjoxNzE1NjYzMDk3LCJjdG0iOjE3MTMwNzEwOTczOTMsImNpZCI6ImQ5ZDBuNEFadXAifQ.it0vlCz3lYJQWuyvFGxWj3g-ACQjfUgLUdb5_XFqqgkd0U9rn1sLzHDmas6HB8po7WHut8dAFiOT8_JKJufjdIuzW_TC2wGWErmBuxNwvTR2Vm6wPzRU-HWwTHcZDuHO6nbvH0olaXqMcnRtx7b10o-V9w8FLUCqNhMw0t84d8dcSewNQtmbYSDunHRpUzu33uXyQOKdqSV8MVPmMKfhFxqSxbFienLwk-K3g8c02p9RKdclJLX77FjFt2LXUlfZbSBJ_8QVHzRgqCbXV_gcDF4akPgjCzSFNTD4xJd030N0J-SCH-WxMfqO5AQDBbo5G1Jupc2GdlHht_uvZ1dBbA; xq_r_token=62e0ff828b86cfa501f1520ad6570a99838e72e5; xq_is_login=1; u=1878149071; is_overseas=0; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1713076110',
'origin': 'https://xueqiu.com',
'referer': 'https://xueqiu.com/',
'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Linux"',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-site',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}
response = requests.get(
'https://stock.xueqiu.com/v5/stock/batch/quote.json?symbol=SZ300229,SZ002129&extend=detail&is_delay_hk=true',
cookies=cookies,
headers=headers,
)
print(f'response= {response}')
print(f'response.text= {response.text}')
print(f'response.json= {response.json()}')
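The commented-out `cookie` header and the `cookies` dict in the generated code carry the same information in two shapes. The sketch below shows roughly how one is turned into the other; `cookie_header_to_dict` is a hypothetical helper written for this post, not part of requests or curlconverter, and the sample string reuses a few harmless values from the listing above.

```python
def cookie_header_to_dict(header_value: str) -> dict:
    """Split a raw Cookie header ("a=1; b=2") into a name -> value dict."""
    cookies = {}
    for pair in header_value.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty fragments such as a trailing "; "
        # Split on the first "=" only, because cookie values may contain "=".
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies


sample = "remember=1; xq_is_login=1; is_overseas=0"
print(cookie_header_to_dict(sample))
```

This is handy when you only have the raw header text from the browser and want to pass it through the `cookies=` argument of `requests.get()`.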
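3. Step 3: Fetch the page
Calling requests.get() returns the response. From here on the example switches to the CSDN blog-list API; `params`, `cookies`, and `headers` are the dictionaries defined in the complete example in section 6: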
response = requests.get(
'https://blog.csdn.net/community/home-api/v1/get-business-list',
params=params,
cookies=cookies,
headers=headers,
)
print(f'response= {response}')
print(f'response.text= {response.text}')
print(f'response.json= {response.json()}')
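The call above assumes the request always succeeds. A slightly more defensive variant might look like the sketch below. The wrapper name `fetch_json` and the 10-second timeout are my own choices, not something the original code uses; the function takes the getter as an argument, so you would normally pass `requests.get`.

```python
def fetch_json(http_get, url, **request_kwargs):
    """Call http_get(url, ...) and return the decoded JSON body.

    http_get is expected to behave like requests.get.
    """
    # Assumption: 10 seconds is an arbitrary but reasonable default timeout.
    request_kwargs.setdefault("timeout", 10)
    response = http_get(url, **request_kwargs)
    response.raise_for_status()  # raise an error for 4xx/5xx status codes
    return response.json()


# Intended usage (needs the requests library and network access):
# import requests
# data = fetch_json(
#     requests.get,
#     'https://blog.csdn.net/community/home-api/v1/get-business-list',
#     params=params, cookies=cookies, headers=headers,
# )
```

Failing early on a bad status code tends to give a clearer error than a JSON decoding failure on an HTML error page.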
4. Step 4: Parse the page
Because the GET response is already JSON, all that remains is to parse that JSON.
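To read a raw JSON response comfortably, it helps to pretty-print it first. Here is a minimal sketch using only the standard library; `raw_text` is a heavily shortened stand-in for `response.text`, with an empty article list.

```python
import json

# Shortened stand-in for response.text (the real list has many entries).
raw_text = '{"code": 200, "message": "success", "data": {"list": [], "total": 33}}'

parsed = json.loads(raw_text)
# ensure_ascii=False keeps Chinese titles readable instead of \uXXXX escapes.
pretty = json.dumps(parsed, indent=2, ensure_ascii=False)
print(pretty)
```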
The formatted response body looks like this:
{
"code": 200,
"message": "success",
"traceId": "06c167ac-bfb9-4a7c-b25d-f6de5a4dcf15",
"data": {
"list": [
{
"articleId": 137087163,
"title": "python爬虫 - 爬取图片",
"description": "1、下载图片示例1:使用 .urlretrieve() 函数 2、下载图片示例2 - 使用 open/write 函数 3、下载图片示例3 3.1 使用 open/write 下载 3.2 使用 urlretrieve下载",
"url": "https://blog.csdn.net/BullKing8185/article/details/137087163",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 536,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=137087163",
"postTime": "2024-04-12 15:15:37",
"diggCount": 5,
"formatTime": "2024.04.12",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/3b8865a7c21d4b03b25a6726fe37d0f0.png"
],
"collectCount": 8
},
{
"articleId": 137665320,
"title": "python爬虫 - 爬取微博热搜数据",
"description": "1. 第一步:安装requests库和BeautifulSoup库 2. 第二步:获取爬虫所需的header和cookie 3. 第三步:获取网页 4. 第四步:解析网页 5. 第五步:分析得到的信息,简化地址 6. 第六步:爬取内容,清洗数据 7. 爬取微博热搜的代码实例以及结果展示",
"url": "https://blog.csdn.net/BullKing8185/article/details/137665320",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1598,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=137665320",
"postTime": "2024-04-12 13:00:00",
"diggCount": 47,
"formatTime": "2024.04.12",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/3b8865a7c21d4b03b25a6726fe37d0f0.png"
],
"collectCount": 10
},
{
"articleId": 136887304,
"title": "python获取代码所在行号,输出到终端或日志文件中",
"description": "python获取代码所在行号,输出到终端或日志文件中 1、使用 sys 模块 2、使用inspect模块 3、使用linecache模块 4、使用traceback模块 5、使用enumerate函数",
"url": "https://blog.csdn.net/BullKing8185/article/details/136887304",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 630,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136887304",
"postTime": "2024-04-11 14:05:04",
"diggCount": 23,
"formatTime": "2024.04.11",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 6
},
{
"articleId": 136886825,
"title": "python 实现从服务器下载文件",
"description": "python 实现从服务器下载文件 1、使用python paramiko库 2、使用Python wget库 3、使用Python urllib库 4、使用subprocess.run()执行scp命令 5、使用os.system() 执行scp命令",
"url": "https://blog.csdn.net/BullKing8185/article/details/136886825",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 421,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136886825",
"postTime": "2024-04-11 09:30:00",
"diggCount": 3,
"formatTime": "2024.04.11",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 10
},
{
"articleId": 137545048,
"title": "python web 开发 - 基于tornado框架的 Hello World 示例",
"description": "python web 开发 - 基于tornado框架的 Hello World 示例 1、主要步骤 2、tornado 安装 3、创建程序 4、 运行程序 5、通过浏览器访问",
"url": "https://blog.csdn.net/BullKing8185/article/details/137545048",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 239,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=137545048",
"postTime": "2024-04-10 15:45:00",
"diggCount": 9,
"formatTime": "2024.04.10",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/9cc6d1e8d71c4f7b84829bf2b76f4528.png"
],
"collectCount": 4
},
{
"articleId": 137544781,
"title": "python web 开发 - 基于flask框架的 Hello World 示例",
"description": "python web 开发 - 基于flask框架的 Hello World 示例 1、主要步骤 2、flask 安装 3、创建程序 4、 运行程序 5、通过浏览器访问",
"url": "https://blog.csdn.net/BullKing8185/article/details/137544781",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 467,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=137544781",
"postTime": "2024-04-10 08:30:00",
"diggCount": 8,
"formatTime": "2024.04.10",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/ff9e522ef239408bb54947c4fe64ae0f.png"
],
"collectCount": 6
},
{
"articleId": 136939549,
"title": "python web 开发 - 通过venv虚拟环境,进行Flask安装",
"description": "python web 开发 - 通过venv虚拟环境,进行Flask安装 1、关于Flask 2、在Ubuntu 20.04上安装Flask 3、创建 Hello World",
"url": "https://blog.csdn.net/BullKing8185/article/details/136939549",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1130,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136939549",
"postTime": "2024-04-09 14:49:53",
"diggCount": 21,
"formatTime": "2024.04.09",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/ff9e522ef239408bb54947c4fe64ae0f.png"
],
"collectCount": 14
},
{
"articleId": 137545432,
"title": "python web 开发 - 常用Web框架",
"description": "python web 开发 - 1、关于Web开发 2、常用Web框架 3、开发案例 3.1. 使用Flask框架创建一个简单的Web应用程序 3.2. 使用tornado框架创建一个简单的Web应用程序 3.3. 使用Django框架创建一个简单的待办事项应用程序 4、总结",
"url": "https://blog.csdn.net/BullKing8185/article/details/137545432",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1004,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=137545432",
"postTime": "2024-04-09 12:00:17",
"diggCount": 37,
"formatTime": "2024.04.09",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 17
},
{
"articleId": 136626181,
"title": "python界面开发 - filedialog 文件选择对话框",
"description": "1.Tkinter 开发2.filedialog 文件选择对话框3.python图形界面开发 3.1. Tkinter 3.2. PyQt 3.3. wxPython 3.4. PyGTK:基于GTK 3.5. Kivy 3.6. 可视化工具",
"url": "https://blog.csdn.net/BullKing8185/article/details/136626181",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1042,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136626181",
"postTime": "2024-03-11 15:55:51",
"diggCount": 16,
"formatTime": "2024.03.11",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 11
},
{
"articleId": 136619020,
"title": "python界面开发 - Canvas绘制图形",
"description": "1.Tkinter 开发2. Canvas绘制图形 2.1. 示例1:绘制矩形、椭圆和多边形 2.2. 示例2:绘制柱状图、折线图 2.3. 示例3:同时绘制多个画布3. python图形界面开发 3.1. Tkinter 3.2. PyQt 3.3. wxPython 3.4. PyGTK:基于GTK 3.5. Kivy 3.6. 可视化工具",
"url": "https://blog.csdn.net/BullKing8185/article/details/136619020",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1076,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136619020",
"postTime": "2024-03-11 11:35:03",
"diggCount": 35,
"formatTime": "2024.03.11",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 10
},
{
"articleId": 136569640,
"title": "python界面开发 - messagebox 提示框",
"description": "1.messagebox2.Tkinter 开发3. python图形界面开发 3.1. Tkinter 3.2. PyQt 3.3. wxPython 3.4. PyGTK:基于GTK 3.5. Kivy 3.6. 可视化工具",
"url": "https://blog.csdn.net/BullKing8185/article/details/136569640",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 624,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136569640",
"postTime": "2024-03-09 11:00:00",
"diggCount": 24,
"formatTime": "2024.03.09",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 24
},
{
"articleId": 136503099,
"title": "python界面开发 - Checkbutton:复选框",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Checkbutton:复选框",
"url": "https://blog.csdn.net/BullKing8185/article/details/136503099",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 669,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136503099",
"postTime": "2024-03-09 07:00:00",
"diggCount": 15,
"formatTime": "2024.03.09",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 18
},
{
"articleId": 136502669,
"title": "python界面开发 - Label 提示框",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Label 显示提示信息 3.1. 显示文本 3.2. 修改Label的文本 3.2. 设置背景图片",
"url": "https://blog.csdn.net/BullKing8185/article/details/136502669",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 783,
"commentCount": 0,
"editUrl": "https://editor.csdn.net/md?articleId=136502669",
"postTime": "2024-03-08 18:35:54",
"diggCount": 13,
"formatTime": "2024.03.08",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 25
},
{
"articleId": 136503057,
"title": "python界面开发 - Listbox:列表框",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Listbox:用于创建列表框",
"url": "https://blog.csdn.net/BullKing8185/article/details/136503057",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 809,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136503057",
"postTime": "2024-03-08 11:00:00",
"diggCount": 12,
"formatTime": "2024.03.08",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 23
},
{
"articleId": 136503020,
"title": "python界面开发 - Menu (popupmenu) 右键菜单",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Menu (popupmenu) 右键菜单 3.1. 示例1 3.2. 示例2",
"url": "https://blog.csdn.net/BullKing8185/article/details/136503020",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1185,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136503020",
"postTime": "2024-03-08 05:00:00",
"diggCount": 19,
"formatTime": "2024.03.08",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 26
},
{
"articleId": 136502896,
"title": "python界面开发 - Combobox 下拉框",
"description": "1. Tkinter 开发2. Combobox 下拉框 2.1. 示例1 2.2. 示例13. python图形界面开发 3.1. Tkinter 3.2. PyQt 3.3. wxPython 3.4. PyGTK:基于GTK 3.5. Kivy 3.6. 可视化工具",
"url": "https://blog.csdn.net/BullKing8185/article/details/136502896",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 1092,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136502896",
"postTime": "2024-03-07 11:00:00",
"diggCount": 9,
"formatTime": "2024.03.07",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 21
},
{
"articleId": 136502764,
"title": "python界面开发 - OptionMenu菜单",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. OptionMenu 菜单 3.1. 示例1 3.2. 示例2",
"url": "https://blog.csdn.net/BullKing8185/article/details/136502764",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 895,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136502764",
"postTime": "2024-03-07 03:00:00",
"diggCount": 19,
"formatTime": "2024.03.07",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 29
},
{
"articleId": 136503160,
"title": "python界面开发 - Radiobutton:单选按钮",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Radiobutton:单选按钮 3.1. 格式说明 3.2. 定义整数类型的值 3.3. 定义字符串类型的值 3.4. 获取值 3.5. 点击事件 3.6. 示例",
"url": "https://blog.csdn.net/BullKing8185/article/details/136503160",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 535,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136503160",
"postTime": "2024-03-06 13:18:24",
"diggCount": 24,
"formatTime": "2024.03.06",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 11
},
{
"articleId": 136502558,
"title": "python界面开发 - Button 按钮",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. Button 按钮 3.1. .command 属性 3.1.1. 示例1 : command=root.quit 3.1.2. 示例2 : root.master.destroy 3.2. 动态创建Button",
"url": "https://blog.csdn.net/BullKing8185/article/details/136502558",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 902,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136502558",
"postTime": "2024-03-06 12:45:51",
"diggCount": 30,
"formatTime": "2024.03.06",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 17
},
{
"articleId": 136470395,
"title": "python界面开发 - 窗体设置方法",
"description": "1. python图形界面开发 1.1. Tkinter 1.2. PyQt 1.3. wxPython 1.4. PyGTK:基于GTK 1.5. Kivy 1.6. 可视化工具2. Tkinter 开发3. 窗口设置方法 3.1. *.title(\"...\") 3.2. *.geometry(\"400x300\") 3.3. *.geometry(\"+100+100\") 3.4. *.iconbitmap(\"myicon.ico\") 3.5. *.state(\"...\")",
"url": "https://blog.csdn.net/BullKing8185/article/details/136470395",
"type": 1,
"top": false,
"forcePlan": false,
"viewCount": 633,
"commentCount": 1,
"editUrl": "https://editor.csdn.net/md?articleId=136470395",
"postTime": "2024-03-05 14:15:00",
"diggCount": 19,
"formatTime": "2024.03.05",
"picList": [
"https://img-blog.csdnimg.cn/img_convert/fad536e972e14ce4b37803185dc3b00c.png"
],
"collectCount": 9
}
],
"total": 33
}
}
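Note that `total` is 33 while this response holds only 20 entries, so the rest must be on a later page. The sketch below builds one parameter set per page from the `page` and `size` request parameters used in the complete example in section 6. It assumes the API accepts higher `page` values in the same way, which I have not verified; `page_params` is a hypothetical helper.

```python
import math


def page_params(total, size, base_params):
    """Build one params dict per page needed to cover `total` items."""
    pages = math.ceil(total / size)
    return [
        {**base_params, "page": str(page), "size": str(size)}
        for page in range(1, pages + 1)
    ]


base = {"businessType": "blog", "username": "BullKing8185"}
for params in page_params(33, 20, base):
    print(params)
```

With 33 articles and a page size of 20, this yields two requests: page 1 and page 2.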
5. Step 5: Parse the JSON response body
json_content = response.json()
print(f'json_content.code = {json_content["code"]}')
print(f'json_content.message = {json_content["message"]}')
print(f'json_content.traceId = {json_content["traceId"]}')
print(f'json_content.data.total = {json_content["data"]["total"]}')
# diggCount is the number of likes; the favorites count is in collectCount
for item in json_content["data"]["list"]:
    print(f'views = {item["viewCount"]}, '
          f'likes = {item["diggCount"]}, '
          f'{item["title"]}')
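Once the list is a plain Python structure, ordinary list operations apply. As a small sketch, the code below ranks articles by view count and adds up the views. The `articles` list is a hand-trimmed copy of the first three entries of the response shown in step 4, and `.get(..., 0)` is a defensive choice of mine in case a field is missing.

```python
# Trimmed copy of the first three entries of json_content["data"]["list"].
articles = [
    {"title": "python爬虫 - 爬取图片", "viewCount": 536, "diggCount": 5, "collectCount": 8},
    {"title": "python爬虫 - 爬取微博热搜数据", "viewCount": 1598, "diggCount": 47, "collectCount": 10},
    {"title": "python获取代码所在行号,输出到终端或日志文件中", "viewCount": 630, "diggCount": 23, "collectCount": 6},
]

# Most-viewed first; a missing field counts as 0 instead of raising KeyError.
by_views = sorted(articles, key=lambda a: a.get("viewCount", 0), reverse=True)
total_views = sum(a.get("viewCount", 0) for a in articles)

for article in by_views:
    print(f'{article["viewCount"]:>5}  {article["title"]}')
print(f'total views = {total_views}')
```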
6. Complete code example and results
import requests
cookies = {
'uuid_tt_dd': '10_20936681940-1687695659941-897712',
'UN': 'limeigui',
'ins_first_time': '1693811332416',
'_ga': 'GA1.1.1606228358.1692240870',
'_ga_7W1N0GEY1P': 'GS1.1.1698749086.7.1.1698749172.43.0.0',
'log_Id_click': '117',
'log_Id_view': '558',
'log_Id_pv': '378',
'Hm_lvt_e5ef47b9f471504959267fd614d579cd': '1708345516',
'ssxmod_itna': 'iqGxuDBD2AKrqGHqaWxUhGQqi=Z+xeDk+Dmg04GNpUYDZDiqAPGhDC38FmBm0jwIdf4804GCi2bqxaAS77gpAIKz2mpYD74i8DCqi1D0qDY+oxBLrbQoxiiyDCmFDPrKD32xlIzDvxG=D3qDFYqDLDMNDFqG0l+QPD0Pq+mDlD73DUwdDQqDSUQKKxGjDxitRDGADx0tUD7jD2eQDeMpTcqGW0wD2zBh8YYaSR=y4cjTiP6WW5cWU7ZnCaONVemQDbRLHweXfxQ0CNODvmAvzSGPqWDhw30Gc+7xeU+1SwA/riBqeRf+3YDDGbxeA4bix4D=',
'ssxmod_itna2': 'iqGxuDBD2AKrqGHqaWxUhGQqi=Z+xeDk+DmgDA6WmhxD/Q1DFr21/4Pgp7KAPPuKOBaiGcS0MH0QPvcM+RwKb9uvohWzcgyAOYiPq2NDgQdjoj/l8LYmsENvW2Ax0MEGC2eVUWk53KncOecGsZ7RYU2PWXW0TcABidaTWCDo1XCnrdYPFIAGIzgqHPYSN48+zk+p4IAErxE0tHrOIFokWcFdKt+o87hYQ1YYzhbIsEf7d78O0u7iAADObNMGIfZE+/YBr3mnIRK3Uiqwa4IguIDw1cfH9iTGTT6qgO8KXrczLohsiEZ+2GvH2A3Z4uGvAeelAYMODZnqKfmznxYzIOnx4l3eZP7hwuAdMl6Yh=Q35NSpa=IrK6bS50OwVl5=O5Mjt72PSnrsB5u7482pFPPlWtplXa6ihBrMiPwFq7CeeaXdU6G0n2Z2KjWAhaMeRvKNNRG8jfFcqqn0OYRn8IzOQbt0O9quQD1PeXWPY1=MEpu1KjSUg2I4roD8NDL=E0ePVyofFig7Hm4DQI4zi59hpwnjtqkUCIpcYBC+0hlnlyUhQ0q4UGzKdf6o=K0YD08DiQ4YD===',
'tfstk': 'eoFMjtVQpLYXhdd7Jlh19IAf0TBKBhGjcodxDjnVLDoC5S3OC-00xlvtHR3TKxqQAlHVfc30nDqLBIhaiJO0PrNO5jQs1PGjggIRwI4_5jNZPRglwtKlqssR2_CdQelcjgetASaRYOsdBv5aaND-qUxFULPB77oz7czTWW8IQdatxIR4tJDS4imHgIPnSglyLLoTlIgFkWJXhAuI-0QMz_Nc-ETuCwbHFOMZRVIR-wvX8AuI-kQh-LTKQ2gtp',
'c_dl_um': '-',
'UserName': 'limeigui',
'UserInfo': '3b95b21938904a148617bb63e4cd8b47',
'UserToken': '3b95b21938904a148617bb63e4cd8b47',
'UserNick': 'Adunn',
'AU': '95B',
'BT': '1709514903656',
'p_uid': 'U010000',
'Hm_up_6bcd52f51e9b3dce32bec4a3997715ac': '%7B%22islogin%22%3A%7B%22value%22%3A%221%22%2C%22scope%22%3A1%7D%2C%22isonline%22%3A%7B%22value%22%3A%221%22%2C%22scope%22%3A1%7D%2C%22isvip%22%3A%7B%22value%22%3A%220%22%2C%22scope%22%3A1%7D%2C%22uid_%22%3A%7B%22value%22%3A%22limeigui%22%2C%22scope%22%3A1%7D%7D',
'FCNEC': '%5B%5B%22AKsRol-81BVROqAOt-Krga723o1zn0lKQZXCHVYOsZGp4bbSYqIORsTWdRA33h_JQeCm1pUeYYkPLifSrDfR3ebNunu-COvYV2D4sqzbPrXD_tVp9je9p1aG1qgVGpkYlxpNK3mEnUaabXB6IvFt7xBeYqz_845gCA%3D%3D%22%5D%5D',
'c_dl_fref': 'https://www.baidu.com/link',
'c_dl_fpage': '/download/qq_27308505/21132392',
'c_dl_prid': '1711094139943_541172',
'c_dl_rid': '1711094192991_543576',
'limeiguicomment_new': '1706325449636',
'management_ques': '1712733666636',
'c_segment': '0',
'Hm_lvt_6bcd52f51e9b3dce32bec4a3997715ac': '1712450998,1712540146,1712628949,1713335190',
'_clck': 'v3yta0%7C2%7Cfl2%7C0%7C1559',
'log_Id_click': '118',
'c_pref': 'default',
'c_first_ref': 'default',
'dc_sid': '9a3d776b6aa46403c7fa216f2ec8d588',
'creative_btn_mp': '3',
'fpv': 'd7fa221d14d24b2361ad4f09bace739a',
'yd_captcha_token': 'dzp',
'dc_session_id': '10_1713523054888.848382',
'c_ref': 'default',
'log_Id_pv': '379',
'log_Id_view': '559',
'__gads': 'ID=79df0b17ce2ed235-22263755acb40040:T=1687695664:RT=1713524387:S=ALNI_MbKkmtHbLa1eh1RSXbuJOoatVdiiQ',
'__gpi': 'UID=00000c6ade25eb9f:T=1687695664:RT=1713524387:S=ALNI_MZP0oVAi-DQb_-PpFvwoGO0EYhHiQ',
'__eoi': 'ID=4a7618f393a07404:T=1706249283:RT=1713524387:S=AA-AfjbTLqbpP5gZ44TgXYWkx20B',
'_clsk': '1k295vw%7C1713524388493%7C4%7C0%7Cn.clarity.ms%2Fcollect',
'SidecHatdocDescBoxNum': 'true',
'c_first_page': 'https%3A//blog.csdn.net/BullKing8185%3Ftype%3Dblog',
'Hm_lpvt_6bcd52f51e9b3dce32bec4a3997715ac': '1713525601',
'waf_captcha_marker': '2a99ea8a0f9ac96620d0ca1ea34fe2e494a64e498e223e9b42719bcf5b4b099d',
'c_dsid': '11_1713525996492.411629',
'c_page_id': 'default',
'dc_tos': 'sc6t4c',
}
headers = {
'Accept': 'application/json, text/plain, */*',
'Accept-Language': 'zh-CN,zh;q=0.9',
'Connection': 'keep-alive',
# 'Cookie': '...',  (raw Cookie header left out: requests builds it from the cookies dict above)
'Referer': 'https://blog.csdn.net/BullKing8185?type=blog',
'Sec-Fetch-Dest': 'empty',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-Site': 'same-origin',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Linux"',
}
params = {
'page': '1',
'size': '20',
'businessType': 'blog',
'orderby': '',
'noMore': 'false',
'year': '',
'month': '',
'username': 'BullKing8185',
}
response = requests.get(
'https://blog.csdn.net/community/home-api/v1/get-business-list',
params=params,
cookies=cookies,
headers=headers,
)
print(f'response= {response}')
print(f'response.text= {response.text}')
print(f'response.json= {response.json()}')
json_content = response.json()
print(f'json_content.code = {json_content["code"]}')
print(f'json_content.message = {json_content["message"]}')
print(f'json_content.traceId = {json_content["traceId"]}')
print(f'json_content.data.total = {json_content["data"]["total"]}')
# diggCount is the number of likes; the favorites count is in collectCount
for item in json_content["data"]["list"]:
    print(f'views = {item["viewCount"]}, '
          f'likes = {item["diggCount"]}, '
          f'{item["title"]}')
Result of running the script: