[Python Scraping + Visualization Case Study] Collecting product data from an e-commerce site and analyzing it visually

2024/12/23 15:06:22

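Scraping + visualization case study: Suning.com (苏宁易购)

  1. What you need to know for this case study:
  • How to use selenium
  • How to parse data out of HTML tags
  2. Environment to prepare:
  • Python 3.8
  • PyCharm 2022 Professional
  • selenium, a third-party Python library that can drive a browser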

Scraper code

Required modules

import time
from selenium import webdriver  # third-party library that talks to the browser driver, which in turn controls the browser
from selenium.webdriver.common.by import By
import csv

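Create the output file

# newline='' keeps the csv module from writing blank rows between records on Windows
f = open('苏宁易购.csv', mode='a', encoding='utf-8', newline='')
csv_writer = csv.writer(f)
# Note: because the file is opened in append mode, this header row is written again on every run
csv_writer.writerow(['title', 'price', 'comment', 'store_stock', 'href'])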
# 1. Launch the browser
driver = webdriver.Chrome()
# 2. Open the site
driver.get("https://****/iPhone14/")
for i in range(15):
    # 3. Drag the scrollbar down towards the bottom of the page
    # The page is driven through JavaScript:
    # document.documentElement.scrollHeight: total height of the current page
    # document.documentElement.scrollTop: current position of the scrollbar
    # document.documentElement.scrollTop = document.documentElement.scrollHeight: move the scrollbar to the very bottom
    # Here the page is scrolled in 2900-pixel steps instead, pausing 1 second after each step
    for page in range(0, 14500, 2900):
        driver.execute_script('document.documentElement.scrollTop = ' + str(page))
        time.sleep(1)
    # 4. Extract the data
    # .product-box matches every product element on the page
    goods = driver.find_elements(By.CSS_SELECTOR, ".product-box")
    for good in goods:
        price = good.find_element(By.CSS_SELECTOR, ".price-box").text
        title = good.find_element(By.CSS_SELECTOR, ".title-selling-point").text
        href = good.find_element(By.CSS_SELECTOR, ".title-selling-point a").get_attribute("href")
        comment = good.find_element(By.CSS_SELECTOR, ".evaluate-old.clearfix").text
        store_stock = good.find_element(By.CSS_SELECTOR, ".store-stock").text
        print(title, price, comment, store_stock)
        csv_writer.writerow([title, price, comment, store_stock, href])
    # Go to the next page of results
    driver.find_element(By.CSS_SELECTOR, "#nextPage").click()
# To keep the browser open for inspection, block here (for example with input()),
# because the browser closes as soon as the program ends
# Close the CSV file and quit the browser
f.close()
driver.quit()
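
Because the CSV file is opened with mode='a', the header row is appended again each time the scraper runs, so a file built up over several runs contains repeated header lines. Below is a minimal sketch of one way around that. It assumes two hypothetical helpers, open_csv_writer and append_rows, which are not part of the original code; the header is written only when the file does not exist yet or is still empty.

```python
import csv
import os

HEADER = ['title', 'price', 'comment', 'store_stock', 'href']


def open_csv_writer(path, header=HEADER):
    """Open `path` for appending and return (file, writer).

    Hypothetical helper, not from the original post: the header row is
    written only when the file does not exist yet or is still empty.
    """
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    f = open(path, mode='a', encoding='utf-8', newline='')
    writer = csv.writer(f)
    if is_new:
        writer.writerow(header)
    return f, writer


def append_rows(path, rows):
    """Append rows to the CSV, creating it with a header if needed."""
    f, writer = open_csv_writer(path)
    try:
        writer.writerows(rows)
    finally:
        # Always release the file handle, even if writing fails
        f.close()
```

In the scraper, the rows collected for one page could be gathered in a list and passed to append_rows('苏宁易购.csv', rows), so the file is also closed properly if the browser session fails partway through.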

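Visualization demo

The result is the following three charts, plus a word cloud.

If a university student handed this in, the teacher would probably, maybe, possibly think it's not bad, haha.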

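Just kidding. By now most university students have finished their assignments and gone on break early, though.

Still, I hope this article helps. I haven't posted much lately, so hardly anyone reads these anymore, haha.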


Visualization code

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d19250a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd \n",
    "import jieba\n",
    "import time\n",
    "from pyecharts.charts import Bar,Line,Map,Page,Pie  \n",
    "from pyecharts import options as opts \n",
    "from pyecharts.globals import SymbolType"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "69c29f78",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>price</th>\n",
       "      <th>comment</th>\n",
       "      <th>store_stock</th>\n",
       "      <th>href</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Apple iPhone 14 128G 午夜色 移动联通电信5G手机</td>\n",
       "      <td>¥5999.00</td>\n",
       "      <td>1.3万+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Apple iPhone 14 Pro Max 256G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>¥9899.00</td>\n",
       "      <td>6300+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Apple iPhone 14 Pro Max 128G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>¥8999.00</td>\n",
       "      <td>6300+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Apple iPhone 14 Pro 256G 深空黑色 移动联通电信5G手机</td>\n",
       "      <td>¥8899.00</td>\n",
       "      <td>6400+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Apple iPhone 14 Pro 128G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>¥7999.00</td>\n",
       "      <td>6400+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1693</th>\n",
       "      <td>圣幻 iphone11手机壳苹果11pro硅胶套iphone11PROMAX全包防摔ipho...</td>\n",
       "      <td>¥46.00</td>\n",
       "      <td>200+评价</td>\n",
       "      <td>任意门数码专营店</td>\n",
       "      <td>https://product.suning.com/0070067325/11398343...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1694</th>\n",
       "      <td>圣幻 iphone11苹果11proMax手机壳薄透明苹果11全包边11Pro电镀软壳防摔1...</td>\n",
       "      <td>¥46.00</td>\n",
       "      <td>300+评价</td>\n",
       "      <td>任意门数码专营店</td>\n",
       "      <td>https://product.suning.com/0070067325/11398610...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1695</th>\n",
       "      <td>圣幻 iphone11/12/12pro手机壳透明防摔苹果12ProMAX保护套新款轻薄硅胶...</td>\n",
       "      <td>¥46.00</td>\n",
       "      <td>1800+评价</td>\n",
       "      <td>任意门数码专营店</td>\n",
       "      <td>https://product.suning.com/0070067325/12179125...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1696</th>\n",
       "      <td>VMONN苹果13手机壳新款防摔iphone13Pro max翻盖保护皮套mini钱包插卡</td>\n",
       "      <td>¥48.00</td>\n",
       "      <td>0评价</td>\n",
       "      <td>骑猪漫舞数码配件专营店</td>\n",
       "      <td>https://product.suning.com/0070154072/12321840...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1697</th>\n",
       "      <td>KIVee 可逸 PD20W快充套苹果PD充电器+1米数据线适用于苹果iPhone14/13pro</td>\n",
       "      <td>¥58.00</td>\n",
       "      <td>200+评价</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12395644...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1698 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  title     price  comment  \\\n",
       "0                   Apple iPhone 14 128G 午夜色 移动联通电信5G手机  ¥5999.00  1.3万+评价   \n",
       "1           Apple iPhone 14 Pro Max 256G 暗紫色 移动联通电信5G手机  ¥9899.00  6300+评价   \n",
       "2           Apple iPhone 14 Pro Max 128G 暗紫色 移动联通电信5G手机  ¥8999.00  6300+评价   \n",
       "3              Apple iPhone 14 Pro 256G 深空黑色 移动联通电信5G手机  ¥8899.00  6400+评价   \n",
       "4               Apple iPhone 14 Pro 128G 暗紫色 移动联通电信5G手机  ¥7999.00  6400+评价   \n",
       "...                                                 ...       ...      ...   \n",
       "1693  圣幻 iphone11手机壳苹果11pro硅胶套iphone11PROMAX全包防摔ipho...    ¥46.00   200+评价   \n",
       "1694  圣幻 iphone11苹果11proMax手机壳薄透明苹果11全包边11Pro电镀软壳防摔1...    ¥46.00   300+评价   \n",
       "1695  圣幻 iphone11/12/12pro手机壳透明防摔苹果12ProMAX保护套新款轻薄硅胶...    ¥46.00  1800+评价   \n",
       "1696      VMONN苹果13手机壳新款防摔iphone13Pro max翻盖保护皮套mini钱包插卡    ¥48.00      0评价   \n",
       "1697  KIVee 可逸 PD20W快充套苹果PD充电器+1米数据线适用于苹果iPhone14/13pro    ¥58.00   200+评价   \n",
       "\n",
       "      store_stock                                               href  \n",
       "0            苏宁自营  https://product.suning.com/0000000000/12391268...  \n",
       "1            苏宁自营  https://product.suning.com/0000000000/12391268...  \n",
       "2            苏宁自营  https://product.suning.com/0000000000/12391268...  \n",
       "3            苏宁自营  https://product.suning.com/0000000000/12391268...  \n",
       "4            苏宁自营  https://product.suning.com/0000000000/12391268...  \n",
       "...           ...                                                ...  \n",
       "1693     任意门数码专营店  https://product.suning.com/0070067325/11398343...  \n",
       "1694     任意门数码专营店  https://product.suning.com/0070067325/11398610...  \n",
       "1695     任意门数码专营店  https://product.suning.com/0070067325/12179125...  \n",
       "1696  骑猪漫舞数码配件专营店  https://product.suning.com/0070154072/12321840...  \n",
       "1697         苏宁自营  https://product.suning.com/0000000000/12395644...  \n",
       "\n",
       "[1698 rows x 5 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_tb = pd.read_csv('苏宁易购.csv')\n",
    "df_tb"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "199ddb66",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 1698 entries, 0 to 1697\n",
      "Data columns (total 5 columns):\n",
      " #   Column       Non-Null Count  Dtype \n",
      "---  ------       --------------  ----- \n",
      " 0   title        1698 non-null   object\n",
      " 1   price        1698 non-null   object\n",
      " 2   comment      1692 non-null   object\n",
      " 3   store_stock  1698 non-null   object\n",
      " 4   href         1698 non-null   object\n",
      "dtypes: object(5)\n",
      "memory usage: 66.5+ KB\n"
     ]
    }
   ],
   "source": [
    "df_tb.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "359af5ec",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "<ipython-input-8-e2aeb333fba4>:3: FutureWarning: The default value of regex will change from True to False in a future version.\n",
      "  df_tb['price'] = df_tb['price'].str.replace('\\n26.90', '')\n",
      "<ipython-input-8-e2aeb333fba4>:4: FutureWarning: The default value of regex will change from True to False in a future version. In addition, single character regular expressions will*not* be treated as literal strings when regex=True.\n",
      "  df_tb['comment'] = df_tb['comment'].str.replace('+', '')\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>title</th>\n",
       "      <th>price</th>\n",
       "      <th>comment</th>\n",
       "      <th>store_stock</th>\n",
       "      <th>href</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Apple iPhone 14 128G 午夜色 移动联通电信5G手机</td>\n",
       "      <td>5999.0</td>\n",
       "      <td>1.3</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Apple iPhone 14 Pro Max 256G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>9899.0</td>\n",
       "      <td>6300.0</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Apple iPhone 14 Pro Max 128G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>8999.0</td>\n",
       "      <td>6300.0</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Apple iPhone 14 Pro 256G 深空黑色 移动联通电信5G手机</td>\n",
       "      <td>8899.0</td>\n",
       "      <td>6400.0</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Apple iPhone 14 Pro 128G 暗紫色 移动联通电信5G手机</td>\n",
       "      <td>7999.0</td>\n",
       "      <td>6400.0</td>\n",
       "      <td>苏宁自营</td>\n",
       "      <td>https://product.suning.com/0000000000/12391268...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                         title   price  comment store_stock  \\\n",
       "0          Apple iPhone 14 128G 午夜色 移动联通电信5G手机  5999.0      1.3        苏宁自营   \n",
       "1  Apple iPhone 14 Pro Max 256G 暗紫色 移动联通电信5G手机  9899.0   6300.0        苏宁自营   \n",
       "2  Apple iPhone 14 Pro Max 128G 暗紫色 移动联通电信5G手机  8999.0   6300.0        苏宁自营   \n",
       "3     Apple iPhone 14 Pro 256G 深空黑色 移动联通电信5G手机  8899.0   6400.0        苏宁自营   \n",
       "4      Apple iPhone 14 Pro 128G 暗紫色 移动联通电信5G手机  7999.0   6400.0        苏宁自营   \n",
       "\n",
       "                                                href  \n",
       "0  https://product.suning.com/0000000000/12391268...  \n",
       "1  https://product.suning.com/0000000000/12391268...  \n",
       "2  https://product.suning.com/0000000000/12391268...  \n",
       "3  https://product.suning.com/0000000000/12391268...  \n",
       "4  https://product.suning.com/0000000000/12391268...  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_tb['price'] = df_tb['price'].str.replace('¥', '')\n",
    "df_tb['price'] = df_tb['price'].str.replace('到手价', '')\n",
    "df_tb['price'] = df_tb['price'].str.replace('\\n26.90', '')\n",
    "df_tb['comment'] = df_tb['comment'].str.replace('+', '')\n",
    "df_tb['comment'] = df_tb['comment'].str.replace('评价', '')\n",
    "df_tb['comment'] = df_tb['comment'].str.replace('万', '')\n",
    "\n",
    "df_tb['price'] = df_tb['price'].astype('float64')\n",
    "df_tb['comment'] = df_tb['comment'].astype('float64')\n",
    "df_tb.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "e3225a22",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "store_stock\n",
       "Apple产品啟韬专卖店             103802.0\n",
       "任意门数码专营店                   5582.0\n",
       "小米智能生活旗舰店                     1.0\n",
       "小米智能电器旗舰店                     0.0\n",
       "数格尚品数码配件专营店                2417.0\n",
       "深得二手电脑专营店                     0.0\n",
       "直营                            0.0\n",
       "禧运二手靓品专营店                    10.0\n",
       "科华专营店                        84.8\n",
       "竟纬科技专营店                     370.0\n",
       "绿联官方旗舰店                   20620.0\n",
       "绿联数码旗舰店                    8604.8\n",
       "苏宁二手优品授权旗舰店                   0.0\n",
       "苏宁国际\\n3C数码海外专营店             320.0\n",
       "苏宁国际\\n八达通海外专营店             1950.0\n",
       "苏宁国际\\n嘉怡海外专营店             47219.0\n",
       "苏宁国际\\n德天诺海外专营店            17626.0\n",
       "苏宁国际\\n方都数码海外旗舰店           33902.0\n",
       "苏宁国际\\n百思卖海外专营店              225.0\n",
       "苏宁国际\\n黑海数码海外官方旗舰店         11802.0\n",
       "苏宁服务\\nApple智能数码苏宁专卖店    1081901.0\n",
       "苏宁服务\\n小米智能苏宁专卖店               0.0\n",
       "苏宁服务\\n易购优选数码苏宁旗舰店             0.0\n",
       "苏宁服务\\n波格朗苏宁旗舰店               21.0\n",
       "苏宁自营                     161359.6\n",
       "苏宁自营\\n华均魅苏宁旗舰店              416.0\n",
       "讯天国际手机专营店                   480.0\n",
       "诗薇蒂数码专营店                     27.0\n",
       "质点旗舰店                    124720.0\n",
       "迈动智能数码专营店                   300.0\n",
       "锦际数码专营店                    2858.9\n",
       "顺宇数码专营店                    1202.0\n",
       "骑猪漫舞数码配件专营店                   7.0\n",
       "Name: comment, dtype: float64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_tb.groupby('store_stock')['comment'].sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "583489b0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "store_stock\n",
       "苏宁服务\\nApple智能数码苏宁专卖店    1081901.0\n",
       "苏宁自营                     161359.6\n",
       "质点旗舰店                    124720.0\n",
       "Apple产品啟韬专卖店             103802.0\n",
       "苏宁国际\\n嘉怡海外专营店             47219.0\n",
       "苏宁国际\\n方都数码海外旗舰店           33902.0\n",
       "绿联官方旗舰店                   20620.0\n",
       "苏宁国际\\n德天诺海外专营店            17626.0\n",
       "苏宁国际\\n黑海数码海外官方旗舰店         11802.0\n",
       "绿联数码旗舰店                    8604.8\n",
       "Name: comment, dtype: float64"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "shop_top10 = df_tb.groupby('store_stock')['comment'].sum().sort_values(ascending=False).head(10)\n",
    "shop_top10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "ad50c7b9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'C:\\\\02-讲师文件夹\\\\巳月公开课\\\\课题\\\\苏宁易购\\\\1.html'"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Bar chart\n",
    "#bar1 = Bar(init_opts=opts.InitOpts(width='1350px', height='750px')) \n",
    "bar1 = Bar() \n",
    "bar1.add_xaxis(shop_top10.index.tolist())\n",
    "bar1.add_yaxis('', shop_top10.values.tolist()) \n",
    "bar1.set_global_opts(title_opts=opts.TitleOpts(title='iphone13排名Top10苏宁店铺'),\n",
    "                     xaxis_opts=opts.AxisOpts(axislabel_opts=opts.LabelOpts(rotate=-15)),\n",
    "                     visualmap_opts=opts.VisualMapOpts(max_=28669)\n",
    "                    ) \n",
    "\n",
    "bar1.render('1.html')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "657a7e89",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1000元以上      858\n",
       "0~50元        317\n",
       "50~100元       92\n",
       "100~200元      84\n",
       "200~300元       8\n",
       "300~500元       6\n",
       "500~1000元      2\n",
       "Name: price, dtype: int64"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cut_bins = [0,50,100,200,300,500,1000,8888]  \n",
    "cut_labels = ['0~50元', '50~100元', '100~200元', '200~300元', '300~500元', '500~1000元', '1000元以上']\n",
    "\n",
    "price_cut = pd.cut(df_tb['price'],bins=cut_bins,labels=cut_labels)\n",
    "price_num = price_cut.value_counts()\n",
    "price_num"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "569f1dc3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'C:\\\\02-讲师文件夹\\\\巳月公开课\\\\课题\\\\苏宁易购\\\\2.html'"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
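   ],
   "source": [
    "bar3 = Bar() \n",
    "bar3.add_xaxis(['0~50元', '50~100元', '100~200元', '200~300元', '300~500元', '500~1000元', '1000元以上'])\n",
    "# Use the counts computed in price_num (reordered to match the x-axis) rather than hard-coded numbers\n",
    "bar3.add_yaxis('', price_num.reindex(cut_labels).tolist()) \n",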
    "bar3.set_global_opts(title_opts=opts.TitleOpts(title='不同价格区间的商品数量'),\n",
    "                     visualmap_opts=opts.VisualMapOpts(max_=900)) \n",
    "bar3.render('2.html')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "4bfcfcb1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "price_cut\n",
       "0~50元          80778.9\n",
       "50~100元        27471.5\n",
       "100~200元        6371.9\n",
       "200~300元         320.0\n",
       "300~500元         203.0\n",
       "500~1000元       1600.0\n",
       "1000元以上      1118024.2\n",
       "Name: comment, dtype: float64"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_tb['price_cut'] = price_cut \n",
    "\n",
    "cut_purchase = df_tb.groupby('price_cut')['comment'].sum()\n",
    "cut_purchase"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "61b3f3b9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "<script>\n",
       "    require.config({\n",
       "        paths: {\n",
       "            'echarts':'https://assets.pyecharts.org/assets/v5/echarts.min'\n",
       "        }\n",
       "    });\n",
       "</script>\n",
       "\n",
       "        <div id=\"2c59c934b8d84dffbde3cc9ff80a5318\" style=\"width:900px; height:500px;\"></div>\n",
       "\n",
       "<script>\n",
       "        require(['echarts'], function(echarts) {\n",
       "                var chart_2c59c934b8d84dffbde3cc9ff80a5318 = echarts.init(\n",
       "                    document.getElementById('2c59c934b8d84dffbde3cc9ff80a5318'), 'white', {renderer: 'canvas'});\n",
       "                var option_2c59c934b8d84dffbde3cc9ff80a5318 = {\n",
       "    \"animationEasing\": \"cubicOut\",\n",
       "    \"animationDelay\": 0,\n",
       "    \"animationDurationUpdate\": 300,\n",
       "    \"animationEasingUpdate\": \"cubicOut\",\n",
       "    \"animationDelayUpdate\": 0,\n",
       "    \"aria\": {\n",
       "        \"enabled\": false\n",
       "    },\n",
       "    \"color\": [\n",
       "        \"#EF9050\",\n",
       "        \"#3B7BA9\",\n",
       "        \"#6FB27C\",\n",
       "        \"#FFAF34\",\n",
       "        \"#D8BFD8\",\n",
       "        \"#00BFFF\",\n",
       "        \"#7FFFAA\"\n",
       "    ],\n",
       "    \"series\": [\n",
       "        {\n",
       "            \"type\": \"pie\",\n",
       "            \"colorBy\": \"data\",\n",
       "            \"legendHoverLink\": true,\n",
       "            \"selectedMode\": false,\n",
       "            \"selectedOffset\": 10,\n",
       "            \"clockwise\": true,\n",
       "            \"startAngle\": 90,\n",
       "            \"minAngle\": 0,\n",
       "            \"minShowLabelAngle\": 0,\n",
       "            \"avoidLabelOverlap\": true,\n",
       "            \"stillShowZeroSum\": true,\n",
       "            \"percentPrecision\": 2,\n",
       "            \"showEmptyCircle\": true,\n",
       "            \"emptyCircleStyle\": {\n",
       "                \"color\": \"lightgray\",\n",
       "                \"borderColor\": \"#000\",\n",
       "                \"borderWidth\": 0,\n",
       "                \"borderType\": \"solid\",\n",
       "                \"borderDashOffset\": 0,\n",
       "                \"borderCap\": \"butt\",\n",
       "                \"borderJoin\": \"bevel\",\n",
       "                \"borderMiterLimit\": 10,\n",
       "                \"opacity\": 1\n",
       "            },\n",
       "            \"data\": [\n",
       "                {\n",
       "                    \"name\": \"0~50\\u5143\",\n",
       "                    \"value\": 80778.90000000004\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"50~100\\u5143\",\n",
       "                    \"value\": 27471.5\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"100~200\\u5143\",\n",
       "                    \"value\": 6371.9\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"200~300\\u5143\",\n",
       "                    \"value\": 320.0\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"300~500\\u5143\",\n",
       "                    \"value\": 203.0\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"500~1000\\u5143\",\n",
       "                    \"value\": 1600.0\n",
       "                },\n",
       "                {\n",
       "                    \"name\": \"1000\\u5143\\u4ee5\\u4e0a\",\n",
       "                    \"value\": 1118024.1999999993\n",
       "                }\n",
       "            ],\n",
       "            \"radius\": [\n",
       "                \"35%\",\n",
       "                \"60%\"\n",
       "            ],\n",
       "            \"center\": [\n",
       "                \"50%\",\n",
       "                \"50%\"\n",
       "            ],\n",
       "            \"label\": {\n",
       "                \"show\": true,\n",
       "                \"margin\": 8,\n",
       "                \"formatter\": \"{b}:{d}%\"\n",
       "            },\n",
       "            \"labelLine\": {\n",
       "                \"show\": true,\n",
       "                \"showAbove\": false,\n",
       "                \"length\": 15,\n",
       "                \"length2\": 15,\n",
       "                \"smooth\": false,\n",
       "                \"minTurnAngle\": 90,\n",
       "                \"maxSurfaceAngle\": 90\n",
       "            },\n",
       "            \"rippleEffect\": {\n",
       "                \"show\": true,\n",
       "                \"brushType\": \"stroke\",\n",
       "                \"scale\": 2.5,\n",
       "                \"period\": 4\n",
       "            }\n",
       "        }\n",
       "    ],\n",
       "    \"legend\": [\n",
       "        {\n",
       "            \"data\": [\n",
       "                \"0~50\\u5143\",\n",
       "                \"50~100\\u5143\",\n",
       "                \"100~200\\u5143\",\n",
       "                \"200~300\\u5143\",\n",
       "                \"300~500\\u5143\",\n",
       "                \"500~1000\\u5143\",\n",
       "                \"1000\\u5143\\u4ee5\\u4e0a\"\n",
       "            ],\n",
       "            \"selected\": {},\n",
       "            \"show\": true,\n",
       "            \"left\": \"2%\",\n",
       "            \"top\": \"15%\",\n",
       "            \"orient\": \"vertical\",\n",
       "            \"padding\": 5,\n",
       "            \"itemGap\": 10,\n",
       "            \"itemWidth\": 25,\n",
       "            \"itemHeight\": 14,\n",
       "            \"backgroundColor\": \"transparent\",\n",
       "            \"borderColor\": \"#ccc\",\n",
       "            \"borderWidth\": 1,\n",
       "            \"borderRadius\": 0,\n",
       "            \"pageButtonItemGap\": 5,\n",
       "            \"pageButtonPosition\": \"end\",\n",
       "            \"pageFormatter\": \"{current}/{total}\",\n",
       "            \"pageIconColor\": \"#2f4554\",\n",
       "            \"pageIconInactiveColor\": \"#aaa\",\n",
       "            \"pageIconSize\": 15,\n",
       "            \"animationDurationUpdate\": 800,\n",
       "            \"selector\": false,\n",
       "            \"selectorPosition\": \"auto\",\n",
       "            \"selectorItemGap\": 7,\n",
       "            \"selectorButtonGap\": 10\n",
       "        }\n",
       "    ],\n",
       "    \"tooltip\": {\n",
       "        \"show\": true,\n",
       "        \"trigger\": \"item\",\n",
       "        \"triggerOn\": \"mousemove|click\",\n",
       "        \"axisPointer\": {\n",
       "            \"type\": \"line\"\n",
       "        },\n",
       "        \"showContent\": true,\n",
       "        \"alwaysShowContent\": false,\n",
       "        \"showDelay\": 0,\n",
       "        \"hideDelay\": 100,\n",
       "        \"enterable\": false,\n",
       "        \"confine\": false,\n",
       "        \"appendToBody\": false,\n",
       "        \"transitionDuration\": 0.4,\n",
       "        \"textStyle\": {\n",
       "            \"fontSize\": 14\n",
       "        },\n",
       "        \"borderWidth\": 0,\n",
       "        \"padding\": 5,\n",
       "        \"order\": \"seriesAsc\"\n",
       "    },\n",
       "    \"title\": [\n",
       "        {\n",
       "            \"show\": true,\n",
       "            \"text\": \"\\u4e0d\\u540c\\u4ef7\\u683c\\u533a\\u95f4\\u7684\\u9500\\u552e\\u989d\\u6574\\u4f53\\u8868\\u73b0\",\n",
       "            \"target\": \"blank\",\n",
       "            \"subtarget\": \"blank\",\n",
       "            \"padding\": 5,\n",
       "            \"itemGap\": 10,\n",
       "            \"textAlign\": \"auto\",\n",
       "            \"textVerticalAlign\": \"auto\",\n",
       "            \"triggerEvent\": false\n",
       "        }\n",
       "    ]\n",
       "};\n",
       "                chart_2c59c934b8d84dffbde3cc9ff80a5318.setOption(option_2c59c934b8d84dffbde3cc9ff80a5318);\n",
       "        });\n",
       "    </script>\n"
      ],
      "text/plain": [
       "<pyecharts.render.display.HTML at 0x28981696340>"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_pair = [list(z) for z in zip(cut_purchase.index.tolist(), cut_purchase.values.tolist())]\n",
    "# Draw the pie chart\n",
    "pie1 = Pie() \n",
    "pie1.add('', data_pair, radius=['35%', '60%'])\n",
    "pie1.set_global_opts(title_opts=opts.TitleOpts(title='不同价格区间的销售额整体表现'), \n",
    "                     legend_opts=opts.LegendOpts(orient='vertical', pos_top='15%', pos_left='2%'))\n",
    "pie1.set_series_opts(label_opts=opts.LabelOpts(formatter=\"{b}:{d}%\"))\n",
    "pie1.set_colors(['#EF9050', '#3B7BA9', '#6FB27C', '#FFAF34', '#D8BFD8', '#00BFFF', '#7FFFAA'])\n",
    "pie1.render_notebook() "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "11f85a85",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_cut_words(content_series):\n",
    "    # Load the stop-word list\n",
    "    stop_words = [] \n",
    "    \n",
    "    # with open(r\"E:\py练习\数据分析\stop_words.txt\", 'r', encoding='utf-8') as f:\n",
    "    #     lines = f.readlines()\n",
    "    #     for line in lines:\n",
    "    #         stop_words.append(line.strip())\n",
    "\n",
    "    # Add custom keywords to the jieba dictionary\n",
    "    my_words = ['丝袜', '夏天', '女薄款', '一体'] \n",
    "    for i in my_words:\n",
    "        jieba.add_word(i) \n",
    "\n",
    "    # Custom stop words\n",
    "#     my_stop_words = []\n",
    "#     stop_words.extend(my_stop_words)               \n",
    "\n",
    "    # Tokenize all titles joined into one string\n",
    "    word_num = jieba.lcut(content_series.str.cat(sep='。'), cut_all=False)\n",
    "\n",
    "    # Keep tokens that are not stop words and are at least 2 characters long\n",
    "    word_num_selected = [i for i in word_num if i not in stop_words and len(i)>=2]\n",
    "    \n",
    "    return word_num_selected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "9985bd7a",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Building prefix dict from the default dictionary ...\n",
      "Loading model from cache C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\jieba.cache\n",
      "Loading model cost 0.580 seconds.\n",
      "Prefix dict has been built successfully.\n"
     ]
    }
   ],
   "source": [
    "text = get_cut_words(content_series=df_tb['title'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1db0b1ef-12b8-4190-bad7-f939b6068ce3",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
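
The notebook stops once `get_cut_words` has returned the token list, so the step that turns those tokens into word-cloud input is not shown. A minimal sketch of that step, assuming the word cloud is driven by simple frequency counts: the helper name `top_word_frequencies` and the sample tokens are hypothetical and are not part of the original notebook.

```python
from collections import Counter


def top_word_frequencies(words, stop_words=(), min_len=2, top_n=10):
    """Count tokens, skipping stop words and tokens shorter than min_len."""
    stop = set(stop_words)  # set membership is faster than scanning a list
    kept = [w for w in words if w not in stop and len(w) >= min_len]
    return Counter(kept).most_common(top_n)


# Hypothetical tokens standing in for the output of get_cut_words()
sample_words = ["苹果", "手机", "iPhone14", "手机", "的", "全网通", "苹果", "手机"]
freq = top_word_frequencies(sample_words, stop_words=["全网通"])
print(freq)  # [('手机', 3), ('苹果', 2), ('iPhone14', 1)]
```

The result is a list of `(word, count)` pairs, which is the shape a word-cloud chart typically expects as its data. In the notebook, `text` from the previous cell would take the place of `sample_words`.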

Finally

That's all for today's case study.

See you in the next article!


