Likes, follows, critiques, and corrections are all welcome. Let's follow each other and get those fingers moving!
- 【python014】Scraping and parsing tide weather bulletin / tide calendar data with Python, with source code download
Table of Contents
- 1. Brief Introduction
- 2. Data Scraping and Parsing Code in `Python`
- 3. Reference Links
1. Brief Introduction
- The tide data come from the site 潮汐表 快速导航 (tide table quick navigation) and are used here as a basis for further applications. The data visualization looks like the figure below:
- I found that the code written by a large language model could not actually solve the problem, possibly because I was on the free tier, haha.
- So I used some spare time to knock out the code below, offered here for reference.
2. Data Scraping and Parsing Code in `Python`
- Source code
```python
import re
import warnings
from datetime import datetime

import requests
import pandas as pd
from bs4 import BeautifulSoup

warnings.filterwarnings('ignore')

pd.set_option('display.width', 500)
pd.set_option('display.max_rows', 200)
pd.set_option('display.max_columns', 200)
pd.set_option('display.max_colwidth', 1000)

# Zhejiang - Ningbo - Songlanshan
zj_nb_sl_url = 'https://www.eisk.cn/Calendar/1259.html'
zj_nb_sl_name = '浙江-宁波市-松兰山'
# Zhejiang - Xiangshan County - Xize
zj_xs_xz_url = 'https://www.eisk.cn/Calendar/463.html'
zj_xs_xz_name = '浙江-象山县-西泽'
# Zhejiang - Xiangshan County - Shipu Port
zj_xs_sp_url = 'https://www.eisk.cn/Calendar/460.html'
zj_xs_sp_name = '浙江-象山县-石浦港'
wl15tcxb = '未来15天潮汐表'  # "15-day tide table"


def parser_ymd_date(hour_text):
    # NOTE: parser_ymd_date is referenced below but was not included in the original
    # snippet. Minimal assumed implementation: pull month/day out of text such as
    # '01月05日' and return a 'YYYY-MM-DD' string using the current year; otherwise
    # fall back to the raw text.
    match = re.search(r'(\d{1,2})月(\d{1,2})日', hour_text)
    if not match:
        return hour_text
    month, day = int(match.group(1)), int(match.group(2))
    return datetime(datetime.now().year, month, day).strftime('%Y-%m-%d')


def parser_response_html(_url, _name):
    # _name is currently unused inside the function; kept to mirror the call sites
    response = requests.get(_url)   # send the HTTP request
    response.raise_for_status()     # raise HTTPError if the request failed

    # parse the HTML
    soup = BeautifulSoup(response.text, 'html.parser')
    og_title_desc = soup.find(attrs={"property": "og:description"})['content']

    result_html = []
    for a_slice in soup.find_all('a', href=re.compile(r'/Tides/\d+\.html\?date=.*?')):
        hour = a_slice.find('div', class_='hour').text.strip()
        hour_date = parser_ymd_date(hour)
        day = a_slice.find('div', class_='day').text.strip()
        # the first 'temperature' div carries the lunar calendar text
        _temperature = a_slice.find('div', class_='temperature').text.strip()
        humidity = a_slice.find('div', class_='humidity').text.strip()
        temperature = a_slice.find('div', class_='temperature',
                                   style=re.compile(r'color:.*?')).text.strip()
        tide2 = ';'.join([','.join(c.string.strip() for c in t.contents)
                          for t in a_slice.find_all('div', attrs={'class': 'tide2'})])
        skycon = a_slice.find('div', class_='skycon').text.strip()
        visibility = a_slice.find('div', class_='visibility').text.strip()
        dswrf = a_slice.find('div', class_='dswrf').text.strip()
        wave2 = a_slice.find('div', class_='wave2').text.strip()
        wave1 = a_slice.find('div', class_='wave1').text.strip()
        wave3 = ''
        if a_slice.find('div', class_='wave3'):
            wave3 = a_slice.find('div', class_='wave3').text.strip()
        wind2 = a_slice.find('div', class_='wind2').text.strip()
        wind1 = a_slice.find('div', class_='wind1').text.strip()
        wind3 = a_slice.find('div', class_='wind3').text.strip()
        description = ''
        for desc in a_slice.find_all('div', class_='description'):
            description += desc.text.strip().replace('\n', '').replace('\r', '').replace(' ', '')
        result_html.append([og_title_desc, hour_date, day, _temperature, humidity, temperature,
                            tide2, skycon, visibility, dswrf, wave2, wave1, wave3,
                            wind2, wind1, wind3, description])

    result_df = pd.DataFrame(result_html)
    return result_df


zj_nb_sl_df = parser_response_html(zj_nb_sl_url, zj_nb_sl_name)
zj_xs_xz_df = parser_response_html(zj_xs_xz_url, zj_xs_xz_name)
zj_xs_sp_df = parser_response_html(zj_xs_sp_url, zj_xs_sp_name)

union_df = pd.concat([zj_nb_sl_df, zj_xs_xz_df, zj_xs_sp_df])
union_df.columns = ['title', 'hour_date', 'day', 'lunar_calendar', 'humidity', 'temperature',
                    'tide2', 'skycon', 'visibility', 'dswrf', 'wave2', 'wave1', 'wave3',
                    'wind2', 'wind1', 'wind3', 'description']

# export the merged table to a single Excel workbook
union_df.to_excel('./%s-%s-%s-%s.xlsx' % (zj_nb_sl_name, zj_xs_xz_name, zj_xs_sp_name, wl15tcxb),
                  index=False)
```
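- A note on dependencies: the script relies on `requests`, `beautifulsoup4`, and `pandas`; writing `.xlsx` files via `DataFrame.to_excel` also requires an Excel engine such as `openpyxl` to be installed.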
- A sample of the parsed data looks like the figure below:
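- To inspect the exported data yourself, the workbook can be read back with pandas. A minimal sketch, assuming the script above has already been run and produced the `.xlsx` file (whose name is built from the three location names and `wl15tcxb`) in the current directory:

```python
import pandas as pd

# file name exactly as built by the scraping script above
xlsx_path = './浙江-宁波市-松兰山-浙江-象山县-西泽-浙江-象山县-石浦港-未来15天潮汐表.xlsx'

# read the exported workbook back and preview a few key columns
preview_df = pd.read_excel(xlsx_path)
print(preview_df.shape)  # expect roughly 15 rows per location, i.e. about 45 in total
print(preview_df[['title', 'hour_date', 'temperature', 'tide2', 'skycon']].head(10))
```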
3. Reference Links
- 潮汐表 快速导航 (tide table quick navigation)
- 【python014】Source code download for scraping and parsing tide weather bulletin / tide calendar data with Python