深入理解Scrapy

news2024/11/24 5:43:51


Scrapy是什么

An open source and collaborative framework for extracting the data you need from websites. In a fast, simple, yet extensible way.

Scrapy是适用于Python的一个快速、简单、功能强大的web爬虫框架,通常用于抓取web站点并从页面中提取结构化的数据,也可以用来做监控与自动化测试。架构图如下所示:

640?wx_fmt=png&tp=wxpic&wxfrom=5&wx_lazy=1&wx_co=1

Scrapy如何工作

理解工作原理更有益于后面的学习(也可先看后面的快速上手后再返回来看这里),Scrapy运行流程图如下所示:

640?wx_fmt=png&tp=wxpic&wxfrom=5&wx_lazy=1&wx_co=1

运行过程如下:

  1. 程序启动后将会创建一个/多个Spiders(爬虫)Spiders会将Requests(请求)经过SpiderMiddlewares(爬虫中间件)加工,再交给Engine(引擎)

  2. EngineSpiders传递过来多个请求转交给Scheduler(调度器),由调度器来安排请求。

  3. Scheduler将需要马上执行的请求交回给Engine

  4. Engine将请求经过DownloaderMiddlewares(下载器中间件)加工,再发送给Downloader(下载器)

  5. Downloader使用Requests完成页面/接口的下载,并生成Responses(响应), 将Responses经过DownloaderMiddlewares再转交给Engine

  6. EngineResponses经过SpiderMiddlewares交回给爬虫处理Responses

  7. Spiders处理Responses后产生的结果返回给Engine。(Spiders处理 Responses

  8. 步骤7 中Spiders处理结果返回的Requests对象将回到步骤2;返回的Items(数据结构化对象)或者dict(字典对象)将交给ItemPipelines(数据管道)处理。

  9. 通过定制ItemPipelines来控制数据如何持久化及处理。

开始使用Scrapy

1.安装Scrapy

通过如下命令安装Scrapy

pip install scrapy

Scrapy安装完成会提供一个scrapy工具, 通过命令scrapy --help显示如下则表示安装成功:

> scrapy --help
Scrapy 2.6.2 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  commands
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

2.创建scrapy项目

通过命令scrapy startproject xxx创建一个Scrapy项目:

scrapy startproject MySpider

命令执行之后在当前目录下会生成一个MySpider的目录,目录结构如下所示:

MySpider/
├─scrapy.cfg
└─MySpider/
  ├─items.py
  ├─middlewares.py
  ├─pipelines.py
  ├─settings.py
  ├─__init__.py
  └─spiders/
    └─__init__.py
  • items.py文件存放自定义的Items

  • middlewares.py 文件存放SpiderMiddlewaresDownloaderMiddlewares

  • pipelines.py 文件存放自定义的ItemPipelines

  • settings.py 文件存放全局的配置信息

  • spiders/ 目录存放所有Spiders

之后以第一个MySpider/目录作为项目根目录

3.创建Scrapy爬虫

创建Scrapy爬虫命令scrapy genspider [spidername] [allow_domain] 以慢慢买历史价格接口为例,创建慢慢买爬虫:

scrapy genspider manmanbuy manmanbuy.com

慢慢买历史价格爬取流程:

  1. 访问 http://tool.manmanbuy.com/HistoryLowest.aspx 页面获取隐藏的 <input id="ticket" ...> 标签的 value 值。

  2. 通过 步骤1 获取的 value 值, 加工生成请求头的 Authorization 参数

  3. 生成 请求参数 token 的值

  4. 调用 http://tool.manmanbuy.com/api.ashx 接口获取商品历史价格 (该接口依赖有效cookie, 如何获取有效cookie不是本文重点暂不说明)

此时在spiders/目录下就能找到生成的manmanbuy.py文件,文件内容如下:

import scrapy


class ManmanbuySpider(scrapy.Spider):
    name = 'manmanbuy'
    allowed_domains = ['manmanbuy.com']
    start_urls = ['http://manmanbuy.com/']

    def parse(self, response):
        pass

其中name为爬虫名称, allowed_domains 为允许访问的域名, start_urls 为启动爬取的地址。

Spider 支持两种启动爬取方式, 一种为便捷的配置start_urls方式, 启动后将直接爬取配置的url, 另一种为重写start_requests方法,返回自定义初始化的Request

4.Scrapy爬虫开发

4.1. 编写Spiders(MySpider/spiders/manmanbuy.py)

import scrapy
from urllib.parse import quote
import hashlib
import time
import copy


class ManmanbuySpider(scrapy.Spider):
    name = 'manmanbuy'
    allowed_domains = ['manmanbuy.com']

    def start_requests(self):
        # 以京东单个商品查询历史价格为例, 商品ID: 100011493273
        item_urls = ['https://item.jd.com/100011493273.html']
        # 定义请求头
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
        }
        # 第一步先从h5页面获取ticket参数
        for item_url in item_urls:
            yield scrapy.Request(url='http://tool.manmanbuy.com/HistoryLowest.aspx?url=' + item_url,
                                 headers=headers,
                                 # 透传参数
                                 meta={'key': item_url})

    def parse(self, response: scrapy.http.Response):
        # 从页面中获取ticket值
        ticket = response.css('#ticket')[0].attrib['value']
        # 获取下一段接口请求参数
        req = parse_req({'key': response.meta['key'], 'method': 'getHistoryTrend'})
        # 请求头
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
            # 计算auth
            'Authorization': parse_basic_auth(ticket),
        }
        return scrapy.FormRequest(url='http://tool.manmanbuy.com/api.ashx',
                                  method='POST',
                                  formdata=req,
                                  headers=headers,
                                  cookies=self.get_cookies(),
                                  # 自定义回调地址
                                  callback=self.parse_history_price)

    def get_cookies(self):
        # 省略获取cookie逻辑
        cks = '_ga=GA1.2.604426644.1596510819; ASP.NET_SessionId=bbyuxdftfkcf5mrijdgkmnc5; Hm_lvt_01a310dc95b71311522403c3237671ae=1658906329; Hm_lvt_85f48cee3e51cd48eaba80781b243db3=1658748396,1658906330; _gid=GA1.2.472137414.1658906330; _gat_gtag_UA_145348783_1=1; 60014_mmbuser=VQYJA1IFBTBSVwNdClFVUgdRUQcLUgdXDg1RBgNTAVVUVAZRBQFeAw%3d%3d; Hm_lpvt_85f48cee3e51cd48eaba80781b243db3=1658906625; Hm_lpvt_01a310dc95b71311522403c3237671ae=1658906625'
        cookies = {}
        for one in cks.split(';'):
            k, v = one.strip().split("=")
            cookies[k] = v
        return cookies

    def parse_history_price(self, response: scrapy.http.Response):
        # 输出相应
        self.logger.info(response.text)


def parse_basic_auth(ticket):
    """
    这是解析ticket的值啊,就是上面说的那逻辑
    """
    return 'BasicAuth ' + ticket[:160][-4:] + ticket[:160 - 4]


def parse_req(d):
    """
    这是解析请求,增加t和token参数
    """
    d['t'] = str(int(time.time() * 1000))
    n = copy.deepcopy(d)
    ks = list(n.keys())
    ks.sort()
    ask = 'c5c3f201a8e8fc634d37a766a0299218'
    mask = ask
    for k in ks:
        mask += f'{k}{quote(str(n[k])).replace("/", "%2F")}'
    mask += ask
    mask = mask.upper()
    md5 = hashlib.md5()
    md5.update(mask.encode('utf-8'))
    d['token'] = md5.hexdigest().upper()
    return d

4.2. 修改Settings(MySpider/settings.py)

# robots.txt 文件检查, 默认为: true, 需要改为Flase
ROBOTSTXT_OBEY = False

运行scrapy crawl manmanbuy命令启动爬虫, 观察日志能够正常获取数据:

{"msg":"","code":0,"data":{"haveTrend":1,"changPriceRemark":"降幅5%","runtime":41,"zouShi_test":2,"changePriceCount":14,"spbh":"1|100011493273","spUrl":"https://item.jd.com/100011493273.html","spPic":"http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg","currentPrice":1049.00,"spName":"荣耀Play5T 22.5W超级快充 5000mAh大电池 6.5英寸护眼屏 全网通8GB+128GB极光蓝","lowerDate":"2022-03-08T00:00:00+08:00","lowerPrice":899.00,"bjid":551120462,"zouShi":2,"siteId":1,"siteName":"京东商城","datePrice":"[1621353600000,1199.00,\"\"],[1621440000000,1199.00,\"\"],[1621526400000,1199.00,\"\"],[1621612800000,1199.00,\"\"],[1621699200000,1199.00,\"\"],[1621785600000,1199.00,\"\"],[1621872000000,1199.00,\"\"],[1621958400000,1199.00,\"\"],[1622044800000,1199.00,\"\"],[1622131200000,1199.00,\"\"],[1622217600000,1199.00,\"\"],[1622304000000,1199.00,\"\"],[1622390400000,1199.00,\"\"],[1622476800000,1199.00,\"\"],[1622563200000,1199.00,\"\"],[1622649600000,1199.00,\"\"],[1622736000000,1199.00,\"\"],[1622822400000,1199.00,\"\"],[1622908800000,1199.00,\"\"],[1622995200000,1199.00,\"\"],[1623081600000,1199.00,\"\"],[1623168000000,1199.00,\"\"],[1623254400000,1199.00,\"\"],[1623340800000,1199.00,\"1199元\"],[1623427200000,1199.00,\"\"],[1623513600000,1199.00,\"\"],[1623600000000,1199.00,\"\"],[1623686400000,1199.00,\"\"],[1623772800000,1199.00,\"\"],[1623859200000,1199.00,\"\"],[1623945600000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1624032000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624118400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624204800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624291200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624377600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624464000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624550400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624636800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624723200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624809600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624896000000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1624982400000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625068800000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625155200000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625241600000,1139.00,\"购买1件,当前价:1199.00,可叠加优惠券2:满750减60\"],[1625328000000,1199.00,\"\"],[1625414400000,1199.00,\"\"],[1625500800000,1189.0,\"京东秒杀价:1189\"],[1625587200000,1199.00,\"\"],[1625673600000,1189.0,\"\"],[1625760000000,1199.0,\"\"],[1625846400000,1189.0,\"\"],[1625932800000,1199.0000,\"\"],[1626019200000,1199.0000,\"\"],[1626105600000,1189.0,\"\"],[1626192000000,1199.0,\"\"],[1626278400000,1189.0,\"\"],[1626364800000,1199.0000,\"\"],[1626451200000,1199.0000,\"\"],[1626537600000,1199.0000,\"\"],[1626624000000,1199.0000,\"\"],[1626710400000,1199.0000,\"\"],[1626796800000,1199.0000,\"\"],[1626883200000,1189.00,\"\"],[1626969600000,1199.0000,\"\"],[1627056000000,1199.0000,\"\"],[1627142400000,1199.0000,\"\"],[1627228800000,1199.0000,\"\"],[1627315200000,1189.00,\"\"],[1627401600000,1199.0000,\"\"],[1627488000000,1189.00,\"1189元\"],[1627574400000,1189.00,\"\"],[1627660800000,1199.00,\"\"],[1627747200000,1179.00,\"1179元\"],[1627833600000,1189.0000,\"\"],[1627920000000,1199.00,\"\"],[1628006400000,1199.00,\"\"],[1628092800000,1189.00,\"\"],[1628179200000,1189.0000,\"\"],[1628265600000,1189.0000,\"\"],[1628352000000,1189.0000,\"\"],[1628438400000,1199.00,\"\"],[1628524800000,1189.00,\"\"],[1628611200000,1199.0,\"\"],[1628697600000,1189.0000,\"\"],[1628784000000,1189.0000,\"\"],[1628870400000,1189.0000,\"\"],[1628956800000,1199.00,\"1199元\"],[1629043200000,1189.0000,\"\"],[1629129600000,1199.00,\"\"],[1629216000000,1189.00,\"\"],[1629302400000,1199.0000,\"\"],[1629388800000,1169.0,\"京东秒杀价:1169\"],[1629475200000,1199.00,\"\"],[1629561600000,1199.00,\"\"],[1629648000000,1199.00,\"\"],[1629734400000,1169.00,\"\"],[1629820800000,1199.0,\"\"],[1629907200000,1189.0,\"京东秒杀价:1189\"],[1629993600000,1199.00,\"\"],[1630080000000,1199.00,\"\"],[1630166400000,1199.00,\"\"],[1630252800000,1199.00,\"\"],[1630339200000,1189.00,\"1189元包邮\"],[1630425600000,1189.00,\"\"],[1630512000000,1175.00,\"购买1件,plus价格1175\"],[1630598400000,1189.00,\"\"],[1630684800000,1199.0,\"\"],[1630771200000,1189.0000,\"\"],[1630857600000,1199.0,\"\"],[1630944000000,1189.0000,\"\"],[1631030400000,1199.0,\"\"],[1631116800000,1099.00,\"购买1件,当前价:1199.00,满减:每满1180减100\"],[1631203200000,1189.0000,\"\"],[1631289600000,1189.00,\"\"],[1631376000000,1199.00,\"\"],[1631462400000,1189.0000,\"\"],[1631548800000,1189.0,\"\"],[1631635200000,1199.0,\"\"],[1631721600000,1189.00,\"\"],[1631808000000,1199.00,\"\"],[1631894400000,1189.0,\"\"],[1631980800000,1199.0,\"\"],[1632067200000,1169.00,\"\"],[1632153600000,1169.00,\"\"],[1632240000000,1169.00,\"\"],[1632326400000,1189.0,\"\"],[1632412800000,1169.0,\"\"],[1632499200000,1199.0,\"\"],[1632585600000,1169.00,\"\"],[1632672000000,1199.00,\"\"],[1632758400000,1169.0,\"\"],[1632844800000,1199.0000,\"\"],[1632931200000,1169.0,\"\"],[1633017600000,1169.0,\"\"],[1633104000000,1169.0,\"\"],[1633190400000,1169.00,\"\"],[1633276800000,1169.0000,\"\"],[1633363200000,1199.00,\"\"],[1633449600000,1199.00,\"\"],[1633536000000,1169.0,\"\"],[1633622400000,1169.0,\"\"],[1633708800000,1169.0,\"京东秒杀价:1169\"],[1633795200000,1169.0,\"\"],[1633881600000,1199.00,\"\"],[1633968000000,1169.00,\"\"],[1634054400000,1199.0,\"\"],[1634140800000,1169.00,\"\"],[1634227200000,1199.0,\"\"],[1634313600000,1199.0,\"\"],[1634400000000,1169.0,\"\"],[1634486400000,1199.0,\"\"],[1634572800000,1169.00,\"\"],[1634659200000,1199.0000,\"\"],[1634745600000,1169.00,\"\"],[1634832000000,1199.0,\"\"],[1634918400000,1199.0,\"\"],[1635004800000,1199.0,\"\"],[1635091200000,1199.0,\"\"],[1635177600000,1199.0,\"\"],[1635264000000,1199.0,\"\"],[1635350400000,1199.0,\"\"],[1635436800000,1199.0,\"\"],[1635523200000,1099.00,\"1099元 \"],[1635609600000,1099.0,\"\"],[1635696000000,1099.00,\"\"],[1635782400000,1099.00,\"\"],[1635868800000,1099.00,\"\"],[1635955200000,1099.00,\"购买1件,plus价格1099\"],[1636041600000,949.00,\"购买1件,当前价:1099.00,可叠加优惠券2:满880减150\"],[1636128000000,1099.00,\"\"],[1636214400000,1199.0,\"\"],[1636300800000,1099.0,\"\"],[1636387200000,1099.0,\"\"],[1636473600000,1099.0,\"\"],[1636560000000,979.00,\"购买1件,当前价:1099.00,满减:每满1080减120\"],[1636646400000,1099.0000,\"\"],[1636732800000,1199.00,\"\"],[1636819200000,1199.00,\"\"],[1636905600000,1199.00,\"\"],[1636992000000,1199.00,\"\"],[1637078400000,1099.00,\"\"],[1637164800000,1099.00,\"\"],[1637251200000,1099.00,\"\"],[1637337600000,1099.00,\"\"],[1637424000000,1099.00,\"1099元\"],[1637510400000,1099.00,\"\"],[1637596800000,1099.0,\"\"],[1637683200000,1099.00,\"\"],[1637769600000,1099.00,\"\"],[1637856000000,1099.00,\"\"],[1637942400000,1099.00,\"\"],[1638028800000,1199.0000,\"\"],[1638115200000,1199.00,\"\"],[1638201600000,1199.00,\"\"],[1638288000000,1099.0,\"\"],[1638374400000,1099.00,\"\"],[1638460800000,1099.00,\"\"],[1638547200000,1199.0,\"\"],[1638633600000,1199.00,\"\"],[1638720000000,1099.0,\"\"],[1638806400000,1099.00,\"\"],[1638892800000,1099.00,\"\"],[1638979200000,1099.00,\"1099元\"],[1639065600000,1099.00,\"\"],[1639152000000,1089.0,\"\"],[1639238400000,1089.0,\"\"],[1639324800000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639411200000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639497600000,1099.00,\"购买1件,当前价:1199.00,满减:满1150减100\"],[1639584000000,1099.00,\"\"],[1639670400000,1099.0,\"\"],[1639756800000,1099.0,\"\"],[1639843200000,1099.0,\"\"],[1639929600000,1099.00,\"1099元\"],[1640016000000,1099.0,\"\"],[1640102400000,1099.0,\"\"],[1640188800000,1099.0,\"\"],[1640275200000,1099.00,\"1099元\"],[1640361600000,1099.0,\"\"],[1640448000000,1069.0,\"\"],[1640534400000,1099.0,\"\"],[1640620800000,1099.0,\"\"],[1640707200000,1099.0,\"\"],[1640793600000,1099.0,\"\"],[1640880000000,1099.0,\"\"],[1640966400000,1099.00,\"1099元\"],[1641052800000,1099.0,\"\"],[1641139200000,1099.0,\"\"],[1641225600000,1099.0,\"\"],[1641312000000,1099.0,\"\"],[1641398400000,1099.0,\"\"],[1641484800000,1099.0,\"\"],[1641571200000,1099.0,\"\"],[1641657600000,1099.00,\"1099元\"],[1641744000000,1099.0,\"\"],[1641830400000,1099.0,\"\"],[1641916800000,1099.0,\"\"],[1642003200000,1099.0,\"\"],[1642089600000,1099.0,\"\"],[1642176000000,1099.0,\"\"],[1642262400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642348800000,1099.00,\"\"],[1642435200000,1099.00,\"\"],[1642521600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642608000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642694400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1642780800000,1099.0,\"\"],[1642867200000,1099.0,\"\"],[1642953600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643040000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643126400000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643212800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1643299200000,1099.0,\"\"],[1643385600000,1099.0,\"\"],[1643472000000,1099.0,\"\"],[1643558400000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643644800000,949.00,\"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100\"],[1643731200000,1099.0,\"\"],[1643817600000,1099.0,\"\"],[1643904000000,1099.0,\"\"],[1643990400000,1099.0,\"\"],[1644076800000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644163200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644249600000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644336000000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1644422400000,1099.00,\"\"],[1644508800000,1099.00,\"1099元\"],[1644595200000,1199.00,\"1199元\"],[1644681600000,1199.00,\"1089元\"],[1644768000000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644854400000,1039.00,\"购买1件,当前价:1089.00,满减:满1000减50\"],[1644940800000,1099.00,\"1049元\"],[1645027200000,1099.00,\"\"],[1645113600000,1099.00,\"\"],[1645200000000,1099.00,\"\"],[1645286400000,1099.00,\"\"],[1645372800000,1099.00,\"\"],[1645459200000,1099.00,\"\"],[1645545600000,1099.00,\"\"],[1645632000000,1099.00,\"\"],[1645718400000,1099.00,\"1099元\"],[1645804800000,1099.00,\"1069元\"],[1645891200000,1049.00,\"购买1件,当前价:1099.00,满减:满1000减50\"],[1645977600000,1099.00,\"\"],[1646064000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646150400000,1099.00,\"\"],[1646236800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646323200000,1099.00,\"\"],[1646409600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646496000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646582400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646668800000,899.00,\"购买1件,当前价:1099.00,满减:每满1080减200\"],[1646755200000,1099.00,\"\"],[1646841600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1646928000000,1099.0,\"\"],[1647014400000,1099.0,\"\"],[1647100800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647187200000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647273600000,1099.00,\"\"],[1647360000000,1099.00,\"1099元\"],[1647446400000,1049.0,\"购买1件,当前价:1099.0,满减:满1050减50\"],[1647532800000,1099.0,\"\"],[1647619200000,1099.00,\"\"],[1647705600000,1099.00,\"\"],[1647792000000,1099.00,\"\"],[1647878400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1647964800000,1099.00,\"\"],[1648051200000,1099.00,\"\"],[1648137600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648224000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648310400000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648396800000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648483200000,1039.00,\"购买1件,当前价:1089.00,满减:满1050减50\"],[1648569600000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648656000000,1049.00,\"购买1件,当前价:1099.00,满减:满1050减50\"],[1648742400000,1049.00,\"\"],[1648828800000,1049.00,\"\"],[1648915200000,1049.00,\"1049元\"],[1649001600000,1099.00,\"\"],[1649088000000,1049.00,\"\"],[1649174400000,1049.00,\"\"],[1649260800000,1049.00,\"\"],[1649347200000,1049.00,\"\"],[1649433600000,1099.00,\"\"],[1649520000000,1099.00,\"\"],[1649606400000,1099.00,\"\"],[1649692800000,1049.0,\"购买1件,当前价格1049\"],[1649779200000,1099.00,\"\"],[1649865600000,1099.00,\"\"],[1649952000000,1049.00,\"1049元\"],[1650038400000,1099.00,\"\"],[1650124800000,1049.00,\"\"],[1650211200000,1049.00,\"\"],[1650297600000,1049.00,\"\"],[1650384000000,1049.00,\"\"],[1650470400000,1099.00,\"1099元\"],[1650556800000,1099.00,\"\"],[1650643200000,1099.00,\"\"],[1650729600000,1099.00,\"\"],[1650816000000,1099.00,\"\"],[1650902400000,1099.00,\"\"],[1650988800000,1099.00,\"\"],[1651075200000,1099.00,\"1099元\"],[1651161600000,1099.00,\"\"],[1651248000000,1099.00,\"\"],[1651334400000,1049.00,\"\"],[1651420800000,1099.00,\"\"],[1651507200000,1099.0000,\"\"],[1651593600000,1099.0000,\"\"],[1651680000000,1099.00,\"1099元\"],[1651766400000,1089.0,\"购买1件,当前价格1089\"],[1651852800000,1049.00,\"\"],[1651939200000,1089.00,\"\"],[1652025600000,1099.00,\"1099元\"],[1652112000000,1089.00,\"\"],[1652198400000,1049.00,\"\"],[1652284800000,1089.00,\"\"],[1652371200000,1049.00,\"\"],[1652457600000,1099.00,\"\"],[1652544000000,1099.00,\"\"],[1652630400000,1099.00,\"\"],[1652716800000,1099.00,\"1099元\"],[1652803200000,1099.00,\"\"],[1652889600000,1099.00,\"\"],[1652976000000,1049.00,\"\"],[1653062400000,1099.00,\"\"],[1653148800000,1099.00,\"\"],[1653235200000,1099.00,\"1099元\"],[1653321600000,1099.00,\"\"],[1653408000000,1099.00,\"\"],[1653494400000,1099.00,\"\"],[1653580800000,1099.00,\"\"],[1653667200000,1099.00,\"\"],[1653753600000,1099.00,\"\"],[1653840000000,1099.00,\"\"],[1653926400000,1049.0,\"\"],[1654012800000,1049.00,\"\"],[1654099200000,1049.00,\"\"],[1654185600000,1049.00,\"\"],[1654272000000,1049.00,\"\"],[1654358400000,1049.00,\"\"],[1654444800000,1049.00,\"\"],[1654531200000,1049.00,\"\"],[1654617600000,1049.00,\"\"],[1654704000000,1049.00,\"\"],[1654790400000,1049.00,\"\"],[1654876800000,1049.00,\"\"],[1654963200000,1049.00,\"\"],[1655049600000,1049.00,\"\"],[1655136000000,1049.00,\"\"],[1655222400000,1049.00,\"\"],[1655308800000,1049.00,\"1049元\"],[1655395200000,1049.00,\"\"],[1655481600000,1049.00,\"\"],[1655568000000,1049.00,\"\"],[1655654400000,1049.00,\"\"],[1655740800000,1049.00,\"\"],[1655827200000,1049.00,\"1049元\"],[1655913600000,1049.00,\"\"],[1656000000000,1049.00,\"\"],[1656086400000,1049.00,\"\"],[1656172800000,1049.00,\"1049元\"],[1656259200000,1049.00,\"\"],[1656345600000,1049.00,\"\"],[1656432000000,1049.00,\"\"],[1656518400000,1049.00,\"\"],[1656604800000,1049.00,\"1049元\"],[1656691200000,1049.00,\"\"],[1656777600000,1049.00,\"\"],[1656864000000,1049.00,\"1049元\"],[1656950400000,1049.00,\"\"],[1657036800000,1049.00,\"\"],[1657123200000,1049.00,\"\"],[1657209600000,1049.00,\"\"],[1657296000000,1049.00,\"\"],[1657382400000,1049.00,\"\"],[1657468800000,1049.00,\"\"],[1657555200000,1049.00,\"\"],[1657641600000,1049.00,\"\"],[1657728000000,1049.00,\"\"],[1657814400000,1049.00,\"\"],[1657900800000,1049.00,\"1049元\"],[1657987200000,1049.00,\"\"],[1658073600000,1049.00,\"\"],[1658160000000,1049.00,\"\"],[1658246400000,1049.00,\"\"],[1658332800000,1049.00,\"1049元\"],[1658419200000,1049.00,\"\"],[1658505600000,1049.00,\"\"],[1658592000000,1049.00,\"\"],[1658678400000,1049.00,\"\"],[1658764800000,1049.00,\"\"],[1658851200000,1049.00,\"\"]","ZheKouCount":95},"count":0}

5.Scrapy数据持久化开发

5.1. 编写Items(NySpider/items.py)

import scrapy

class HistoryPriceItem(scrapy.Item):
    """
    自定义历史价格存储Item
    """
    # 商品URL
    itemUrl = scrapy.Field()
    # 图片URL
    picUrl = scrapy.Field()
    # 历史价格信息
    detailPrice = scrapy.Field()

5.2. 编写ItemPipelines(MySpider/pipelines.py), 以文件存储为例:

import scrapy.crawler
from itemadapter import ItemAdapter
from scrapy import signals

class FilePipeline:

    def __init__(self, filename='store.txt'):
        self.filename = filename

    def process_item(self, item, spider):
        # 使用适配器包装item, 防止直接对item进行修改/删除影响后续Pipeline
        adapter = ItemAdapter(item)
        # 写入文件
        self.fp.write(adapter.get('itemUrl') + "    " + adapter.get('picUrl') + "    " + adapter.get('detailPrice') + '\n')
        return item

    @classmethod
    def from_crawler(cls, crawler:scrapy.crawler.Crawler):
        s = cls()
        # 通过信号绑定行为
        # 爬虫启动时创建文件fp
        crawler.signals.connect(s.opened, signal=signals.spider_opened)
        # 爬虫停止时关闭文件fp
        crawler.signals.connect(s.closed, signal=signals.spider_closed)
        return s

    def closed(self, spider):
        self.fp.close()

    def opened(self, spider):
        self.fp = open(self.filename, 'w', encoding='utf-8')
        self.fp.write('商品URL    主图URL    历史价格信息\n')

5.3. 修改Spiders(MySpider/spiders/manmanbuy.py)

import scrapy
import json
from MySpider.items import HistoryPriceItem


class ManmanbuySpider(scrapy.Spider):
     # 省略未修改内容

     custom_settings = {
         # 配置使用的Item管道
        'ITEM_PIPELINES': {
            'MySpider.pipelines.FilePipeline': 300,
        }
    }

    def parse_history_price(self, response: scrapy.http.Response):
        # 解析价格响应
        self.logger.info(response.text)
        data = json.loads(response.text)
        # 返回Item
        return HistoryPriceItem(itemUrl=data['data']['spUrl'], picUrl=data['data']['spPic'], detailPrice=data['data']['datePrice'])

这里说明一下scrapy有5种添加配置方式,常用的有3种,高优先级配置会覆盖低优先级相同的Key的配置,不同的Key的配置则组合起来,按优先级从高到底分别是:

  1. 命令行配置

  2. 爬虫配置

  3. 项目全局配置

Spiders中的custom_settings参数就是爬虫配置

5.4. 运行scrapy crawl manmanbuy命令启动爬虫,观察当前目录发现生成一个store.txt文件,文件内容如下:

商品URL    主图URL    历史价格信息
https://item.jd.com/100011493273.html    http://img13.360buyimg.com/n7/jfs/t1/201578/31/15673/77560/619479ceEd1bde507/c0dab826b71e0b84.jpg    [1621353600000,1199.00,""],[1621440000000,1199.00,""],[1621526400000,1199.00,""],[1621612800000,1199.00,""],[1621699200000,1199.00,""],[1621785600000,1199.00,""],[1621872000000,1199.00,""],[1621958400000,1199.00,""],[1622044800000,1199.00,""],[1622131200000,1199.00,""],[1622217600000,1199.00,""],[1622304000000,1199.00,""],[1622390400000,1199.00,""],[1622476800000,1199.00,""],[1622563200000,1199.00,""],[1622649600000,1199.00,""],[1622736000000,1199.00,""],[1622822400000,1199.00,""],[1622908800000,1199.00,""],[1622995200000,1199.00,""],[1623081600000,1199.00,""],[1623168000000,1199.00,""],[1623254400000,1199.00,""],[1623340800000,1199.00,"1199元"],[1623427200000,1199.00,""],[1623513600000,1199.00,""],[1623600000000,1199.00,""],[1623686400000,1199.00,""],[1623772800000,1199.00,""],[1623859200000,1199.00,""],[1623945600000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1624032000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624118400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624204800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624291200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624377600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624464000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624550400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624636800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624723200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624809600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624896000000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1624982400000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625068800000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625155200000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625241600000,1139.00,"购买1件,当前价:1199.00,可叠加优惠券2:满750减60"],[1625328000000,1199.00,""],[1625414400000,1199.00,""],[1625500800000,1189.0,"京东秒杀价:1189"],[1625587200000,1199.00,""],[1625673600000,1189.0,""],[1625760000000,1199.0,""],[1625846400000,1189.0,""],[1625932800000,1199.0000,""],[1626019200000,1199.0000,""],[1626105600000,1189.0,""],[1626192000000,1199.0,""],[1626278400000,1189.0,""],[1626364800000,1199.0000,""],[1626451200000,1199.0000,""],[1626537600000,1199.0000,""],[1626624000000,1199.0000,""],[1626710400000,1199.0000,""],[1626796800000,1199.0000,""],[1626883200000,1189.00,""],[1626969600000,1199.0000,""],[1627056000000,1199.0000,""],[1627142400000,1199.0000,""],[1627228800000,1199.0000,""],[1627315200000,1189.00,""],[1627401600000,1199.0000,""],[1627488000000,1189.00,"1189元"],[1627574400000,1189.00,""],[1627660800000,1199.00,""],[1627747200000,1179.00,"1179元"],[1627833600000,1189.0000,""],[1627920000000,1199.00,""],[1628006400000,1199.00,""],[1628092800000,1189.00,""],[1628179200000,1189.0000,""],[1628265600000,1189.0000,""],[1628352000000,1189.0000,""],[1628438400000,1199.00,""],[1628524800000,1189.00,""],[1628611200000,1199.0,""],[1628697600000,1189.0000,""],[1628784000000,1189.0000,""],[1628870400000,1189.0000,""],[1628956800000,1199.00,"1199元"],[1629043200000,1189.0000,""],[1629129600000,1199.00,""],[1629216000000,1189.00,""],[1629302400000,1199.0000,""],[1629388800000,1169.0,"京东秒杀价:1169"],[1629475200000,1199.00,""],[1629561600000,1199.00,""],[1629648000000,1199.00,""],[1629734400000,1169.00,""],[1629820800000,1199.0,""],[1629907200000,1189.0,"京东秒杀价:1189"],[1629993600000,1199.00,""],[1630080000000,1199.00,""],[1630166400000,1199.00,""],[1630252800000,1199.00,""],[1630339200000,1189.00,"1189元包邮"],[1630425600000,1189.00,""],[1630512000000,1175.00,"购买1件,plus价格1175"],[1630598400000,1189.00,""],[1630684800000,1199.0,""],[1630771200000,1189.0000,""],[1630857600000,1199.0,""],[1630944000000,1189.0000,""],[1631030400000,1199.0,""],[1631116800000,1099.00,"购买1件,当前价:1199.00,满减:每满1180减100"],[1631203200000,1189.0000,""],[1631289600000,1189.00,""],[1631376000000,1199.00,""],[1631462400000,1189.0000,""],[1631548800000,1189.0,""],[1631635200000,1199.0,""],[1631721600000,1189.00,""],[1631808000000,1199.00,""],[1631894400000,1189.0,""],[1631980800000,1199.0,""],[1632067200000,1169.00,""],[1632153600000,1169.00,""],[1632240000000,1169.00,""],[1632326400000,1189.0,""],[1632412800000,1169.0,""],[1632499200000,1199.0,""],[1632585600000,1169.00,""],[1632672000000,1199.00,""],[1632758400000,1169.0,""],[1632844800000,1199.0000,""],[1632931200000,1169.0,""],[1633017600000,1169.0,""],[1633104000000,1169.0,""],[1633190400000,1169.00,""],[1633276800000,1169.0000,""],[1633363200000,1199.00,""],[1633449600000,1199.00,""],[1633536000000,1169.0,""],[1633622400000,1169.0,""],[1633708800000,1169.0,"京东秒杀价:1169"],[1633795200000,1169.0,""],[1633881600000,1199.00,""],[1633968000000,1169.00,""],[1634054400000,1199.0,""],[1634140800000,1169.00,""],[1634227200000,1199.0,""],[1634313600000,1199.0,""],[1634400000000,1169.0,""],[1634486400000,1199.0,""],[1634572800000,1169.00,""],[1634659200000,1199.0000,""],[1634745600000,1169.00,""],[1634832000000,1199.0,""],[1634918400000,1199.0,""],[1635004800000,1199.0,""],[1635091200000,1199.0,""],[1635177600000,1199.0,""],[1635264000000,1199.0,""],[1635350400000,1199.0,""],[1635436800000,1199.0,""],[1635523200000,1099.00,"1099元 "],[1635609600000,1099.0,""],[1635696000000,1099.00,""],[1635782400000,1099.00,""],[1635868800000,1099.00,""],[1635955200000,1099.00,"购买1件,plus价格1099"],[1636041600000,949.00,"购买1件,当前价:1099.00,可叠加优惠券2:满880减150"],[1636128000000,1099.00,""],[1636214400000,1199.0,""],[1636300800000,1099.0,""],[1636387200000,1099.0,""],[1636473600000,1099.0,""],[1636560000000,979.00,"购买1件,当前价:1099.00,满减:每满1080减120"],[1636646400000,1099.0000,""],[1636732800000,1199.00,""],[1636819200000,1199.00,""],[1636905600000,1199.00,""],[1636992000000,1199.00,""],[1637078400000,1099.00,""],[1637164800000,1099.00,""],[1637251200000,1099.00,""],[1637337600000,1099.00,""],[1637424000000,1099.00,"1099元"],[1637510400000,1099.00,""],[1637596800000,1099.0,""],[1637683200000,1099.00,""],[1637769600000,1099.00,""],[1637856000000,1099.00,""],[1637942400000,1099.00,""],[1638028800000,1199.0000,""],[1638115200000,1199.00,""],[1638201600000,1199.00,""],[1638288000000,1099.0,""],[1638374400000,1099.00,""],[1638460800000,1099.00,""],[1638547200000,1199.0,""],[1638633600000,1199.00,""],[1638720000000,1099.0,""],[1638806400000,1099.00,""],[1638892800000,1099.00,""],[1638979200000,1099.00,"1099元"],[1639065600000,1099.00,""],[1639152000000,1089.0,""],[1639238400000,1089.0,""],[1639324800000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639411200000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639497600000,1099.00,"购买1件,当前价:1199.00,满减:满1150减100"],[1639584000000,1099.00,""],[1639670400000,1099.0,""],[1639756800000,1099.0,""],[1639843200000,1099.0,""],[1639929600000,1099.00,"1099元"],[1640016000000,1099.0,""],[1640102400000,1099.0,""],[1640188800000,1099.0,""],[1640275200000,1099.00,"1099元"],[1640361600000,1099.0,""],[1640448000000,1069.0,""],[1640534400000,1099.0,""],[1640620800000,1099.0,""],[1640707200000,1099.0,""],[1640793600000,1099.0,""],[1640880000000,1099.0,""],[1640966400000,1099.00,"1099元"],[1641052800000,1099.0,""],[1641139200000,1099.0,""],[1641225600000,1099.0,""],[1641312000000,1099.0,""],[1641398400000,1099.0,""],[1641484800000,1099.0,""],[1641571200000,1099.0,""],[1641657600000,1099.00,"1099元"],[1641744000000,1099.0,""],[1641830400000,1099.0,""],[1641916800000,1099.0,""],[1642003200000,1099.0,""],[1642089600000,1099.0,""],[1642176000000,1099.0,""],[1642262400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642348800000,1099.00,""],[1642435200000,1099.00,""],[1642521600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642608000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642694400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1642780800000,1099.0,""],[1642867200000,1099.0,""],[1642953600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643040000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643126400000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643212800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1643299200000,1099.0,""],[1643385600000,1099.0,""],[1643472000000,1099.0,""],[1643558400000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643644800000,949.00,"购买1件,当前价:1099.00,满减:满1000减50,可叠加优惠券2:满880减100"],[1643731200000,1099.0,""],[1643817600000,1099.0,""],[1643904000000,1099.0,""],[1643990400000,1099.0,""],[1644076800000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644163200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644249600000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644336000000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1644422400000,1099.00,""],[1644508800000,1099.00,"1099元"],[1644595200000,1199.00,"1199元"],[1644681600000,1199.00,"1089元"],[1644768000000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644854400000,1039.00,"购买1件,当前价:1089.00,满减:满1000减50"],[1644940800000,1099.00,"1049元"],[1645027200000,1099.00,""],[1645113600000,1099.00,""],[1645200000000,1099.00,""],[1645286400000,1099.00,""],[1645372800000,1099.00,""],[1645459200000,1099.00,""],[1645545600000,1099.00,""],[1645632000000,1099.00,""],[1645718400000,1099.00,"1099元"],[1645804800000,1099.00,"1069元"],[1645891200000,1049.00,"购买1件,当前价:1099.00,满减:满1000减50"],[1645977600000,1099.00,""],[1646064000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646150400000,1099.00,""],[1646236800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646323200000,1099.00,""],[1646409600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646496000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646582400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646668800000,899.00,"购买1件,当前价:1099.00,满减:每满1080减200"],[1646755200000,1099.00,""],[1646841600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1646928000000,1099.0,""],[1647014400000,1099.0,""],[1647100800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647187200000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647273600000,1099.00,""],[1647360000000,1099.00,"1099元"],[1647446400000,1049.0,"购买1件,当前价:1099.0,满减:满1050减50"],[1647532800000,1099.0,""],[1647619200000,1099.00,""],[1647705600000,1099.00,""],[1647792000000,1099.00,""],[1647878400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1647964800000,1099.00,""],[1648051200000,1099.00,""],[1648137600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648224000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648310400000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648396800000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648483200000,1039.00,"购买1件,当前价:1089.00,满减:满1050减50"],[1648569600000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648656000000,1049.00,"购买1件,当前价:1099.00,满减:满1050减50"],[1648742400000,1049.00,""],[1648828800000,1049.00,""],[1648915200000,1049.00,"1049元"],[1649001600000,1099.00,""],[1649088000000,1049.00,""],[1649174400000,1049.00,""],[1649260800000,1049.00,""],[1649347200000,1049.00,""],[1649433600000,1099.00,""],[1649520000000,1099.00,""],[1649606400000,1099.00,""],[1649692800000,1049.0,"购买1件,当前价格1049"],[1649779200000,1099.00,""],[1649865600000,1099.00,""],[1649952000000,1049.00,"1049元"],[1650038400000,1099.00,""],[1650124800000,1049.00,""],[1650211200000,1049.00,""],[1650297600000,1049.00,""],[1650384000000,1049.00,""],[1650470400000,1099.00,"1099元"],[1650556800000,1099.00,""],[1650643200000,1099.00,""],[1650729600000,1099.00,""],[1650816000000,1099.00,""],[1650902400000,1099.00,""],[1650988800000,1099.00,""],[1651075200000,1099.00,"1099元"],[1651161600000,1099.00,""],[1651248000000,1099.00,""],[1651334400000,1049.00,""],[1651420800000,1099.00,""],[1651507200000,1099.0000,""],[1651593600000,1099.0000,""],[1651680000000,1099.00,"1099元"],[1651766400000,1089.0,"购买1件,当前价格1089"],[1651852800000,1049.00,""],[1651939200000,1089.00,""],[1652025600000,1099.00,"1099元"],[1652112000000,1089.00,""],[1652198400000,1049.00,""],[1652284800000,1089.00,""],[1652371200000,1049.00,""],[1652457600000,1099.00,""],[1652544000000,1099.00,""],[1652630400000,1099.00,""],[1652716800000,1099.00,"1099元"],[1652803200000,1099.00,""],[1652889600000,1099.00,""],[1652976000000,1049.00,""],[1653062400000,1099.00,""],[1653148800000,1099.00,""],[1653235200000,1099.00,"1099元"],[1653321600000,1099.00,""],[1653408000000,1099.00,""],[1653494400000,1099.00,""],[1653580800000,1099.00,""],[1653667200000,1099.00,""],[1653753600000,1099.00,""],[1653840000000,1099.00,""],[1653926400000,1049.0,""],[1654012800000,1049.00,""],[1654099200000,1049.00,""],[1654185600000,1049.00,""],[1654272000000,1049.00,""],[1654358400000,1049.00,""],[1654444800000,1049.00,""],[1654531200000,1049.00,""],[1654617600000,1049.00,""],[1654704000000,1049.00,""],[1654790400000,1049.00,""],[1654876800000,1049.00,""],[1654963200000,1049.00,""],[1655049600000,1049.00,""],[1655136000000,1049.00,""],[1655222400000,1049.00,""],[1655308800000,1049.00,"1049元"],[1655395200000,1049.00,""],[1655481600000,1049.00,""],[1655568000000,1049.00,""],[1655654400000,1049.00,""],[1655740800000,1049.00,""],[1655827200000,1049.00,"1049元"],[1655913600000,1049.00,""],[1656000000000,1049.00,""],[1656086400000,1049.00,""],[1656172800000,1049.00,"1049元"],[1656259200000,1049.00,""],[1656345600000,1049.00,""],[1656432000000,1049.00,""],[1656518400000,1049.00,""],[1656604800000,1049.00,"1049元"],[1656691200000,1049.00,""],[1656777600000,1049.00,""],[1656864000000,1049.00,"1049元"],[1656950400000,1049.00,""],[1657036800000,1049.00,""],[1657123200000,1049.00,""],[1657209600000,1049.00,""],[1657296000000,1049.00,""],[1657382400000,1049.00,""],[1657468800000,1049.00,""],[1657555200000,1049.00,""],[1657641600000,1049.00,""],[1657728000000,1049.00,""],[1657814400000,1049.00,""],[1657900800000,1049.00,"1049元"],[1657987200000,1049.00,""],[1658073600000,1049.00,""],[1658160000000,1049.00,""],[1658246400000,1049.00,""],[1658332800000,1049.00,"1049元"],[1658419200000,1049.00,""],[1658505600000,1049.00,""],[1658592000000,1049.00,""],[1658678400000,1049.00,""],[1658764800000,1049.00,""],[1658851200000,1049.00,""]

则说明程序执行正常

PS: 若想要将数据持久化至Mysql/MongoDB/Elasticsearch,只需编写对应的ItemPipelines实现, 修改爬虫导入的ITEM_PIPELINES配置即可,实现数据持久化与爬虫逻辑的解耦。

结语

本文为大家简要说明了使用Scrapy的理由,以及通过一个按理为大家演示了如何开发一个Scrapy爬虫项目。后续将持续为大家带来Scrapy更多。

如果大家觉得文章还不错的话,欢迎大家三连(点赞+在看+收藏)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/1098097.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

开源六轴机械臂myCobot 280末端执行器实用案例解析

Intrduction 大家好&#xff0c;今天这篇文章的主要内容是讲解以及使用一些myCobot 280 的配件&#xff0c;来了解这些末端执行器都能够完成哪些功能&#xff0c;从而帮助大家能够正确的选择一款适合的配件来进行使用。 本文中主要介绍4款常用的机械臂的末端执行器。 Product m…

安防监控系统EasyCVR视频汇聚平台设备树收藏按钮的细节优化

视频监控TSINGSEE青犀视频平台EasyCVR能在复杂的网络环境中&#xff0c;将分散的各类视频资源进行统一汇聚、整合、集中管理&#xff0c;在视频监控播放上&#xff0c;TSINGSEE青犀视频安防监控汇聚平台可支持1、4、9、16个画面窗口播放&#xff0c;可同时播放多路视频流&#…

使用allure如何生成自动化测试报告 ?一文详解allure的使用 。

网上介绍allure报告的很多 &#xff0c;但个人总感觉还是不够整体 &#xff0c;不够详细 &#xff0c;所看到的都是局部 。故本人花了些时间 &#xff0c;将这个allure详细的整理了一遍 。整体且涉及每个细节 。 1.allure介绍 它是一个生成HTML测试报告的工具包 使用java开发…

14.6 Socket 应用结构体传输

当在套接字编程中传输结构体时&#xff0c;可以将结构体序列化为字符串&#xff08;即把结构体的所有成员打包成一个字符串&#xff09;&#xff0c;然后将字符串通过套接字传输到对端&#xff0c;接收方可以将字符串解析为结构体&#xff0c;然后使用其中的成员数据。这种方法…

如何提高企业工作微信的管理效率?

微信作为一款拥有数亿用户的软件&#xff0c;其使用频率在全国范围内居高不下。随着企业的不断发展&#xff0c;微信在工作中的应用也变得越来越广泛。为了更好地服务客户并提升业务效益&#xff0c;企业通常会为新入职员工配置工作微信以便于业务沟通和客户服务。然而&#xf…

推荐系统离线评估方法和评估指标,以及在推荐服务器内部实现A/B测试和解决A/B测试资源紧张的方法。还介绍了如何在TensorFlow中进行模型离线评估实践。

文章目录 &#x1f31f; 离线评估&#xff1a;常用的推荐系统离线评估方法有哪些&#xff1f;&#x1f34a; 1. RMSE/MSE&#x1f34a; 2. MAE&#x1f34a; 3. Precision/Recall/F1-score&#x1f34a; 4. Coverage&#x1f34a; 5. Personalization&#x1f34a; 6. AUC &…

ChatGPT当导购员!全球最大超市,全面应用生成式AI

全球最大连锁超市沃尔玛&#xff08;Walmart&#xff09;在官网宣布&#xff0c;将在电商平台试用3款生成式AI&#xff0c;帮助用户改善购物体验提升效率。 据悉&#xff0c;沃尔玛使用了一种类ChatGPT的产品&#xff0c;可根据文本提示自动生成购物建议、搜索建议和评论摘要等…

客流人数管理新趋势:景区客流采集分析系统的功能特点

随着旅游业的蓬勃发展&#xff0c;越来越多的人选择前往景区进行休闲和旅游。然而&#xff0c;人流量的增加也给景区管理带来了一系列的挑战。为了更好地管理和运营景区&#xff0c;景区客流采集分析系统应运而生。 一、案例展示 二、产品卖点 该系统利用先进的人工智能算法和…

今天面了一个来华为要求月薪23K,明显感觉他背了很多面试题...

最近有朋友去华为面试&#xff0c;面试前后进行了20天左右&#xff0c;包含4轮电话面试、1轮笔试、1轮主管视频面试、1轮hr视频面试。 据他所说&#xff0c;80%的人都会栽在第一轮面试&#xff0c;要不是他面试前做足准备&#xff0c;估计都坚持不完后面几轮面试。 其实&…

IDEA初始配置

1. 详细设置 安装完IDEA之后的简单配置。 1.1 如何打开详细配置界面 1、显示工具栏 2、选择详细配置菜单或按钮 1.2 系统设置 1、默认启动项目配置 启动IDEA时&#xff0c;默认自动打开上次开发的项目&#xff1f;还是自己选择&#xff1f; 如果去掉Reopen projects on …

ABB REM615 REM611 人工智能和机器学习

ABB REM615 REM611 人工智能和机器学习 自从围绕ChatGPT的炒作开始&#xff0c;每个人都在谈论生成性AI。德国人工智能公司Aleph Alpha的ChatGPT、DALL-E或Luminous等系统今天已经能够支持文本写作、编程和设计。 Aleph Alpha在汉诺威工业博览会上更进一步:该公司将与惠普公司…

许战海战略文库|2023,小鹏危矣!蔚小理之江湖点评

摘要&#xff1a;“性价比”与“主流化”之路的竞争关键是产业链整体优势,中国拥有新能源产业链优势的整车企业,只有比亚迪和长城汽车。 1 月 18 日&#xff0c;何小鹏在小鹏汽车内部喊出“如果不破&#xff0c;小鹏只是早死和晚死的区别。要么跟大家一起足够精彩&#xff0c;要…

Go编程:使用 Colly 库下载Reddit网站的图像

概述 Reddit是一个社交新闻网站&#xff0c;用户可以发布各种主题的内容&#xff0c;包括图片。本文将介绍如何使用Go语言和Colly库编写一个简单的爬虫程序&#xff0c;从Reddit网站上下载指定主题的图片&#xff0c;并保存到本地文件夹中。为了避免被目标网站反爬&#xff0c…

过关斩将法:验证输入的用户信息:

输入用户名、密码、邮箱、如果信息录入正确&#xff0c;则提示注册成功&#xff0c;否则生成异常&#xff1a; 要求&#xff1a; 用户名长度为2或3或4密码的长度为6&#xff0c;要求全是数字 提示&#xff1a;可以自行设计isDigital方法&#xff0c;否则排版则乱邮箱中包含和…

Steam余额红锁的原因,及红锁后申诉办法

安全的余额一般是通过充值卡充值获得&#xff0c;再加上交易手续费再转卖给你。一般便宜不到哪去。 但你别以为余额是安全的&#xff0c;就万事大吉了。照样有被红锁的可能性&#xff0c;比如这三种&#xff1a; 1、Steam市场巡查机制&#xff0c;红锁 平台的巡查机制和原理…

【PCIe Byte Enable】

PCIe Byte Enable 及与TPH关系 Byte Enable PCIe Byte Enable 在mem/IO/Cfg TLP中被应用并且在各种不同TLP中的没有区别&#xff0c;PCIe中Byte Enable与AXI中wstrb类似起到mask的作用&#xff0c;但是PCIe不支持request数据全部字节的mask&#xff0c;只支持数据头和尾部各一…

ASEMI解读KBL610整流桥的使用说明及操作指南

编辑-Z KBL610整流桥是一种功率电子元件&#xff0c;它在电力系统、电力电子设备中有着广泛的应用。然而&#xff0c;对于初次接触或者专业人士来说&#xff0c;明确使用说明和操作更是关键。那么&#xff0c;让我们一起来详解KBL610整流桥的使用说明及操作指南。 KBL610整流桥…

灾害与环境遥感团队本科生在IEEE TGRS 发表高水平论文

2023年9月27日&#xff0c;地球科学和遥感领域顶级期刊《IEEE Transactions on Geoscience and Remote Sensing》&#xff08;IEEE TGRS&#xff09;在线预刊发了灾害与环境遥感团队的最新研究成果“A novel spectral index for rapid dust-proof net mapping based on Sentine…

COLE HERSEE 48408 工业4.0、制造业X和元宇宙

COLE HERSEE 48408 工业4.0、制造业X和元宇宙 需要数据来释放工业4.0的全部潜力——价值链中的所有公司都可以访问大量数据。一个新的互联数据生态系统旨在提供解决方案:制造业x。 在德国联邦经济事务和气候行动部以及BDI、VDMA和ZVEI贸易协会的密切合作下&#xff0c;实施制…

性能测试之Mysql数据库调优

一、前言 性能调优前提&#xff1a;无监控不调优&#xff0c;对于mysql性能的监控前几天有文章提到过&#xff0c;有兴趣的朋友可以去看一下 二、Mysql性能指标及问题分析和定位 1、我们在监控图表中关注的性能指标大概有这么几个&#xff1a;CPU、内存、连接数、io读写时间…