Anti-Scraping: Getting Past IP Bans with Proxies (协采云 IP Pool)
- 1. Target URL
- 2. IP ban (403)
- 3. 协采云 IP pool
1. Target URL
aHR0cDovL3d3dy5jY2dwLXRpYW5qaW4uZ292LmNuLw==
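The address above appears to be Base64-encoded rather than written out. A minimal sketch for decoding it with the standard library, assuming the string really is plain Base64 over UTF-8 text (the variable names are illustrative):

```python
import base64

# The encoded target address, copied verbatim from the post.
encoded = "aHR0cDovL3d3dy5jY2dwLXRpYW5qaW4uZ292LmNuLw=="

# Decode the Base64 bytes, then interpret them as UTF-8 text.
target_url = base64.b64decode(encoded).decode("utf-8")
print(target_url)
```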
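2. IP ban (403)
This site is strict about how often a single IP may hit it: a dozen or so requests in a short window is enough to get the IP blacklisted, after which requests come back as 403.
The site is clearly applying an anti-scraping measure: it limits the request rate per IP. At that point the only option is to route requests through proxies. Publicly listed proxies are free, but their response speed, lifetime, and anonymity cannot be guaranteed. The provider recommended here is one I came across recently: 协采云 IP pool.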
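The ban described above can be handled by switching to a fresh proxy whenever a 403 comes back. The following is only a sketch of that idea, not the author's code: `fetch_with_rotation`, `get_proxies`, and `send` are hypothetical names, and the request function is passed in so the retry logic stays independent of any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict, Optional

Proxies = Dict[str, str]


def fetch_with_rotation(
    url: str,
    get_proxies: Callable[[], Proxies],
    send: Callable[[str, Proxies], Any],
    max_tries: int = 3,
    pause: float = 0.0,
) -> Optional[Any]:
    """Request url, asking for a new proxy each time the site answers 403.

    get_proxies() returns a requests-style proxies dict.
    send(url, proxies) performs the request and returns an object
    that exposes a .status_code attribute.
    Returns the first non-403 response, or None if every attempt was banned.
    """
    for _ in range(max_tries):
        proxies = get_proxies()          # a fresh proxy for every attempt
        response = send(url, proxies)
        if response.status_code != 403:  # anything but the ban response is accepted
            return response
        time.sleep(pause)                # optional back-off before the next proxy
    return None
```

With `requests` installed, `send` could be `lambda url, proxies: requests.get(url, proxies=proxies, timeout=10)`, and `get_proxies` any function that returns a `{'http': ..., 'https': ...}` dict.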
3. 协采云 IP pool
Demo:
```python
import json
import time

import requests

# API link; copy it from the provider's dashboard
proxyAPI = ""
proxyusernm = ""  # proxy account
proxypasswd = ""  # proxy password
url = 'https://myip.ipip.net/'

# Fetch a proxy IP from the API
r = requests.get(proxyAPI)
if r.status_code == 200:
    j = json.loads(r.text)
    if j["success"] and len(j["result"]) > 0:
        p = j["result"][0]
        # Build an authenticated proxy URL: http://user:password@ip:port
        proxyurl = "http://" + proxyusernm + ":" + proxypasswd + "@" + p["ip"] + ":" + "%d" % p["port"]
        t1 = time.time()
        # Request the test page through the proxy with browser-like headers
        r = requests.get(url, proxies={'http': proxyurl, 'https': proxyurl}, headers={
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
            "Accept-Encoding": "gzip, deflate",
            "Accept-Language": "zh-CN,zh;q=0.9",
            "Cache-Control": "max-age=0",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"})
        r.encoding = 'utf-8'
        t2 = time.time()
        print(r.text)
        print("Elapsed time:", t2 - t1)
    else:
        print('Got 0 proxy IPs')
else:
    print('Failed to fetch a proxy')
```
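The step that turns the API response into a proxy URL can be pulled out into a small pure function, which makes it easy to check without touching the network. This is a sketch under assumptions: the response shape (`success`, `result`, `ip`, `port`) is taken from the demo, while `build_proxy_url` and the sample address are hypothetical. Percent-encoding the credentials is an addition not present in the demo.

```python
from typing import Optional
from urllib.parse import quote


def build_proxy_url(payload: dict, user: str, password: str) -> Optional[str]:
    """Turn the proxy API's JSON payload into an authenticated proxy URL.

    Returns None when the API reports failure or hands back no proxies.
    """
    if not payload.get("success"):
        return None
    results = payload.get("result") or []
    if not results:
        return None
    first = results[0]
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot break the URL structure.
    return "http://%s:%s@%s:%d" % (
        quote(user, safe=""),
        quote(password, safe=""),
        first["ip"],
        int(first["port"]),
    )


# Illustrative payload with a documentation-range IP address.
sample = {"success": True, "result": [{"ip": "203.0.113.7", "port": 8080}]}
proxy_url = build_proxy_url(sample, "user", "p@ss")
proxies = {"http": proxy_url, "https": proxy_url}
```

The resulting `proxies` dict has the same form as the one passed to `requests.get` in the demo.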
In practice:
```python
import json

import requests


def get_ip_one():
    # API link; copy it from the provider's dashboard
    proxyAPI = "http://19122421898.user.xiecaiyun.com/api/proxies?action=getJSON&key=NP4D0E6891&count=&word=&rand=true&norepeat=false&detail=false&ltime=3&idshow=false"
    proxyusernm = "19122421898"  # proxy account
    proxypasswd = "19122421898"  # proxy password
    try:
        # Fetch a proxy IP from the API; the 10-second timeout is an assumed value,
        # added so a stalled API call cannot hang forever
        r = requests.get(proxyAPI, timeout=10)
        if r.status_code == 200:
            j = json.loads(r.text)
            if j["success"] and len(j["result"]) > 0:
                p = j["result"][0]
                proxyurl = "http://" + proxyusernm + ":" + proxypasswd + "@" + p["ip"] + ":" + "%d" % p["port"]
                return {'http': proxyurl, 'https': proxyurl}
    except Exception:
        print('Proxy fetch timed out; using the fallback proxy')
    # Fallback: a fixed proxy, used when the API call fails or returns no IPs
    return {'http': 'http://dsk20180808:dsk20170808@218.86.104.54:57114',
            'https': 'https://dsk20180808:dsk20170808@218.86.104.54:57114'}
```
With the proxy added, the problem is solved.