Site being scraped (the full code is at the end of the article):
https://koubei.16888.com/57233/0-0-0-2
How it works:
from bs4 import BeautifulSoup
After fetching the HTML, extract the text with find_all(). Inspecting the page shows the review data lives in the following tag:
content_text = soup.find_all('span', class_='show_dp f_r')
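To see what this selector matches, here is a minimal self-contained sketch; the HTML snippet is invented to mimic the page structure (each review contributes three spans with the same class), and note that passing a space-separated string to class_ matches the full class attribute exactly:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML mimicking the review page: three spans per review,
# all sharing the class "show_dp f_r".
html = """
<div>
  <span class="show_dp f_r">Spacious interior</span>
  <span class="show_dp f_r">High fuel consumption</span>
  <span class="show_dp f_r">Good value overall</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
spans = soup.find_all("span", class_="show_dp f_r")
texts = [s.text for s in spans]
print(texts)  # one string per span, in document order
```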
Because the pros, cons, and summary spans all share the same class name, a small index-based split routes each one to its own file:
for index, x in enumerate(content_text):
    if index % 3 == 0:
        # Every first span of a group of three is a "pros" review.
        with open("car_post.txt", "a", encoding='utf-8') as f:
            f.write(x.text + "\n")
    elif index % 3 == 1:
        # Every second span is a "cons" review.
        with open("car_nev.txt", "a", encoding='utf-8') as f:
            f.write(x.text + "\n")
    else:
        # Every third span is the overall summary.
        with open("car_text.txt", "a", encoding='utf-8') as f:
            f.write(x.text + "\n")
Result preview
Negative reviews: (screenshot omitted)
Positive reviews: (screenshot omitted)
Summaries: (screenshot omitted)
Full code

from bs4 import BeautifulSoup
import requests

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35"
}

# Walk the paginated review listing, pages 1..299.
for j in range(1, 300):
    url = "https://koubei.16888.com/57233/0-0-0-{}".format(j)
    resp = requests.get(url, headers=headers)
    resp.encoding = "utf-8"
    soup = BeautifulSoup(resp.text, "html.parser")
    content_text = soup.find_all('span', class_='show_dp f_r')
    # Pros, cons, and summaries repeat in groups of three with the same
    # class name, so route each span to its file by index.
    for index, x in enumerate(content_text):
        if index % 3 == 0:
            with open("car_post.txt", "a", encoding='utf-8') as f:
                f.write(x.text + "\n")
        elif index % 3 == 1:
            with open("car_nev.txt", "a", encoding='utf-8') as f:
                f.write(x.text + "\n")
        else:
            with open("car_text.txt", "a", encoding='utf-8') as f:
                f.write(x.text + "\n")
    print(j)  # progress indicator
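A slightly more defensive variant of the loop above would reuse one connection, add a timeout, skip failed responses, and pause between pages. The sketch below shows that shape; page_url, fetch_reviews, and crawl are hypothetical helpers, the user-agent is trimmed, and the 1-second delay is a guess since the site's real rate limits are unknown:

```python
import time
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://koubei.16888.com/57233/0-0-0-{}"
HEADERS = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}  # trimmed UA

def page_url(page):
    # Listing pages on this site are 1-indexed.
    return BASE_URL.format(page)

def fetch_reviews(session, page):
    """Return the review span texts from one page, or [] on a bad response."""
    resp = session.get(page_url(page), headers=HEADERS, timeout=10)
    if resp.status_code != 200:
        return []
    resp.encoding = "utf-8"
    soup = BeautifulSoup(resp.text, "html.parser")
    return [s.text for s in soup.find_all("span", class_="show_dp f_r")]

def crawl(last_page, delay=1.0):
    """Yield (page, texts) for pages 1..last_page, pausing between requests."""
    with requests.Session() as session:
        for page in range(1, last_page + 1):
            yield page, fetch_reviews(session, page)
            time.sleep(delay)  # be polite to the server
```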