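Introduction to Gushiwen (古诗文网)
- The Gushiwen login page
- Two main difficulties in simulating a login
  - The two dynamic parameters `__VIEWSTATE` and `__VIEWSTATEGENERATOR`
  - The captcha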
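To make the first difficulty concrete: the two dynamic parameters are hidden form fields whose values change from one page load to the next, so they have to be read from the page before each login. The sketch below shows one way to collect them. It is illustrative only: `SAMPLE_HTML` and its values are made up, `HiddenInputParser` is a hypothetical helper, and it uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra packages.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values can be None when an attribute has no value
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


# Made-up stand-in for the login form; the real page carries much longer values
SAMPLE_HTML = """
<form method="post" action="./login.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dummy-viewstate" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="ABCD1234" />
  <input type="text" name="email" />
</form>
"""

parser = HiddenInputParser()
parser.feed(SAMPLE_HTML)
hidden_fields = parser.fields
print(hidden_fields)
```

Collecting every hidden input, rather than two by name, means the payload would still be complete if the form gained another hidden field.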
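Logging in with Python
- 1. Parse the login page and extract the `__VIEWSTATE` and `__VIEWSTATEGENERATOR` parameters
- 2. Download the captcha image from the login page and type in the captcha
- 3. Simulate the login
- **Use `requests.session()` so that every request goes through the same session. Otherwise the page refreshes when the captcha image is downloaded, and the captcha expected by the POST request no longer matches the one that was saved.**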
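The session requirement can be seen without touching the network. The sketch below stores a cookie on a `requests.Session` and then prepares (but never sends) a POST through it. The cookie name `session_id` and its value are hypothetical stand-ins for whatever the server sets when it serves the captcha; the real cookie name is not given here.

```python
import requests

LOGIN_URL = "https://so.gushiwen.cn/user/login.aspx"

session = requests.Session()
# Hypothetical cookie standing in for the one set when the captcha is served
session.cookies.set("session_id", "demo-123", domain="so.gushiwen.cn", path="/")

# Prepared through the session: the stored cookie is attached automatically
request = requests.Request("POST", LOGIN_URL, data={"code": "ABCD"})
prepared = session.prepare_request(request)
cookie_header = prepared.headers.get("Cookie")

# Prepared outside the session: no cookie, so the server sees an unrelated visitor
bare = requests.Request("POST", LOGIN_URL, data={"code": "ABCD"}).prepare()
bare_cookie_header = bare.headers.get("Cookie")

print(cookie_header)
print(bare_cookie_header)
```

If the captcha is tied to a cookie like this, a POST sent without the session would be checked against a different captcha than the one saved to disk.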
import requests
from bs4 import BeautifulSoup  # the 'lxml' parser used below also needs the lxml package installed
url = "https://so.gushiwen.cn/user/login.aspx?from=http://so.gushiwen.cn/user/collect.aspx"
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36'}
session = requests.session()  # create the session first so every request below shares its cookies
response = session.get(url=url, headers=headers)  # step 1: fetch the login page
content = response.text
soup = BeautifulSoup(content, 'lxml')
viewsate = soup.select('#__VIEWSTATE')[0].attrs.get('value')
viewsate_generator = soup.select('#__VIEWSTATEGENERATOR')[0].attrs.get('value')
Verification_src = soup.select('#imgCode')[0].attrs.get('src')  # step 2: locate the captcha image
Verification_url = 'https://so.gushiwen.cn' + Verification_src
response = session.get(Verification_url)  # download the captcha through the same session
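with open('./practice_088_古诗文网/Verification.jpg', 'wb') as fp:  # the target directory must already exist
    fp.write(response.content)
code_name = input("Enter the captcha: ")  # open the saved image and type what it shows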
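url_post = "https://so.gushiwen.cn/user/login.aspx?from=http%3a%2f%2fso.gushiwen.cn%2fuser%2fcollect.aspx"
data_post = {
    '__VIEWSTATE': viewsate,  # hidden fields scraped from the login page
    '__VIEWSTATEGENERATOR': viewsate_generator,
    'from': 'http://so.gushiwen.cn/user/collect.aspx',
    'email': '12345678@qq.com',  # replace with your own account
    'pwd': '12345678',  # replace with your own password
    'code': code_name,  # captcha typed in above
    'denglu': '登录',  # value of the login button, sent to the server as-is
}
response_post = session.post(url=url_post, data=data_post, headers=headers)  # step 3: log in through the same session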
content_post = response_post.text
with open('古诗文网.html', 'w', encoding='utf-8') as fp:
    fp.write(content_post)
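The script builds the captcha URL by string concatenation, which only works when the image's `src` is a root-relative path. A minimal sketch of a more tolerant variant follows, together with a small helper that assembles the form payload. Both `captcha_url` and `build_login_payload` are hypothetical helpers, and the `/RandCode.ashx` path in the usage lines is a made-up sample rather than a value confirmed from the site.

```python
from urllib.parse import urljoin

LOGIN_URL = "https://so.gushiwen.cn/user/login.aspx"


def captcha_url(page_url, src):
    """Resolve the captcha <img> src against the page URL.

    urljoin handles root-relative, page-relative, and absolute src values.
    """
    return urljoin(page_url, src)


def build_login_payload(hidden_fields, email, password, code):
    """Merge the scraped hidden fields with the visible form fields."""
    payload = dict(hidden_fields)  # copy so the caller's dict is not modified
    payload.update({
        "from": "http://so.gushiwen.cn/user/collect.aspx",
        "email": email,
        "pwd": password,
        "code": code,
        "denglu": "登录",  # login button value, kept exactly as the form sends it
    })
    return payload


# Usage with made-up sample values
sample_url = captcha_url(LOGIN_URL, "/RandCode.ashx")
sample_payload = build_login_payload(
    {"__VIEWSTATE": "dummy-viewstate", "__VIEWSTATEGENERATOR": "ABCD1234"},
    email="12345678@qq.com",
    password="12345678",
    code="ABCD",
)
print(sample_url)
print(sorted(sample_payload))
```

Keeping these two steps as pure functions makes them easy to check without logging in, and the payload can then be passed to `session.post` unchanged.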