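This course uses the 超级鹰 (Chaojiying) CAPTCHA-solving platform. If you do not have an account yet, register one first.
Chaojiying CAPTCHA recognition: a professional cloud-based CAPTCHA recognition service that makes recognition faster, more accurate, and more powerful
Using a CAPTCHA-solving platform to get past CAPTCHAs is simple, but it costs money.
Sample code and test resources: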
git clone https://github.com/Python3WebSpider/CaptchaPlatform.git
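Clone the repository with git. You will get a new folder containing chaojiying.py, which holds code adapted from the official SDK. The Chaojiying class takes three parameters:
username: the username of your registered Chaojiying account.
password: the account password.
soft_id: the software ID.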
import requests
from hashlib import md5


class Chaojiying(object):

    def __init__(self, username, password, soft_id):
        self.username = username
        # Send the MD5 hex digest rather than the plain-text password
        self.password = md5(password.encode('utf-8')).hexdigest()
        self.soft_id = soft_id
        # Credentials sent with every request
        self.base_params = {
            'user': self.username,
            'pass2': self.password,
            'softid': self.soft_id,
        }
        self.headers = {
            'User-Agent': 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)',
        }

    def post_pic(self, im, codetype):
        """
        im: image bytes
        codetype: CAPTCHA type, see http://www.chaojiying.com/price.html
        """
        params = {
            'codetype': codetype,
        }
        params.update(self.base_params)
        files = {'userfile': ('ccc.jpg', im)}
        r = requests.post('http://upload.chaojiying.net/Upload/Processing.php',
                          data=params, files=files, headers=self.headers)
        return r.json()

    def report_error(self, im_id):
        """
        im_id: picture ID of the CAPTCHA whose answer was wrong
        """
        params = {
            'id': im_id,
        }
        params.update(self.base_params)
        r = requests.post('http://upload.chaojiying.net/Upload/ReportError.php',
                          data=params, headers=self.headers)
        return r.json()
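Both methods return the parsed JSON response as a dict. The following is a minimal sketch of how such a response might be checked before use. It assumes the response shape shown in the examples in this article (`err_no` equal to 0 on success, plus `err_str`, `pic_id`, and `pic_str` fields); `ChaojiyingError` and `check_result` are hypothetical names and are not part of the SDK.

```python
class ChaojiyingError(Exception):
    """Hypothetical exception for a response whose err_no is not 0."""


def check_result(result):
    """Return (pic_id, pic_str) from a response dict, or raise on failure.

    Assumption: err_no == 0 means success, as in the sample responses
    shown in this article.
    """
    err_no = result.get('err_no')
    if err_no != 0:
        raise ChaojiyingError(f"err_no={err_no}: {result.get('err_str', '')}")
    return result['pic_id'], result['pic_str']


# Sample response copied from the click CAPTCHA example in this article
sample = {'err_no': 0, 'err_str': 'OK', 'pic_id': '2256514491185230017',
          'pic_str': '118,177|249,173', 'md5': 'e89f632e91cc6b8a85dad2fbbc13c803'}
pic_id, pic_str = check_result(sample)
print(pic_id, pic_str)
```

Keeping `pic_id` around is useful because `report_error` needs it if the answer later turns out to be wrong.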
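Image CAPTCHA:
CAPTCHA_KIND is the CAPTCHA type code. The available codes are listed on Chaojiying's "CAPTCHA types and price list" page (http://www.chaojiying.com/price.html).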
from chaojiying import Chaojiying
USERNAME = '136xxxx108'
PASSWORD = 'xxxxxx'
SOFT_ID = 'xxxxxxxx'
CAPTCHA_KIND = '1006'
FILE_NAME = 'captcha1.png'
client = Chaojiying(USERNAME, PASSWORD, SOFT_ID)
# Read the image as bytes and close the file afterwards
with open(FILE_NAME, 'rb') as f:
    result = client.post_pic(f.read(), CAPTCHA_KIND)
print(result)
Click CAPTCHA:
from chaojiying import Chaojiying
USERNAME = 'xxx'
PASSWORD = ''
SOFT_ID = 'xxxxxx'
CAPTCHA_KIND = '9004'
FILE_NAME = 'captcha2.png'
client = Chaojiying(USERNAME, PASSWORD, SOFT_ID)
# Read the image as bytes and close the file afterwards
with open(FILE_NAME, 'rb') as f:
    result = client.post_pic(f.read(), CAPTCHA_KIND)
print(result)
The response is:
{'err_no': 0, 'err_str': 'OK', 'pic_id': '2256514491185230017', 'pic_str': '118,177|249,173', 'md5': 'e89f632e91cc6b8a85dad2fbbc13c803'}
The coordinates returned in pic_str are '118,177|249,173'. To check them, mark these points on the image with OpenCV:
import cv2
image = cv2.imread('captcha2.png')
# Draw a filled red circle at each returned (x, y) point
image = cv2.circle(image, (118, 177), radius=10, color=(0, 0, 255), thickness=-1)
image = cv2.circle(image, (249, 173), radius=10, color=(0, 0, 255), thickness=-1)
cv2.imwrite('captcha2_label.png', image)
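Instead of copying the coordinates by hand, the `pic_str` field can be split programmatically. This is a small sketch that assumes the format seen in the response above: points separated by `|`, with `x,y` inside each point. `parse_points` is a hypothetical helper name.

```python
def parse_points(pic_str):
    """Split a pic_str such as '118,177|249,173' into a list of (x, y) ints."""
    points = []
    for pair in pic_str.split('|'):
        x, y = pair.split(',')
        points.append((int(x), int(y)))
    return points


print(parse_points('118,177|249,173'))  # [(118, 177), (249, 173)]
```

Each tuple can then be passed directly as the center argument of `cv2.circle`.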
Slider CAPTCHA:
from chaojiying import Chaojiying
USERNAME = '136xxxx08'
PASSWORD = 'hxxxxx.'
SOFT_ID = '9xxxx'
CAPTCHA_KIND = '9101'
FILE_NAME = 'captcha5.png'
client = Chaojiying(USERNAME, PASSWORD, SOFT_ID)
# Read the image as bytes and close the file afterwards
with open(FILE_NAME, 'rb') as f:
    result = client.post_pic(f.read(), CAPTCHA_KIND)
print(result)
The response is:
{'err_no': 0, 'err_str': 'OK', 'pic_id': '1256519431185230022', 'pic_str': '218,96', 'md5': '627d620bccd9a6dd1366329b951f1511'}
Verify the result with OpenCV:
import cv2
# Load the same slider image that was submitted
image = cv2.imread('captcha5.png')
# Mark the returned point '218,96'
image = cv2.circle(image, (218, 96), radius=10, color=(0, 0, 255), thickness=-1)
cv2.imwrite('captcha5_label.png', image)
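As you can see, the result is not very accurate. We can pass a hint to the platform's human workers by drawing it onto the image, so that they mark the position as accurately as possible: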
from chaojiying import Chaojiying
import cv2
from PIL import ImageFont, ImageDraw, Image
import numpy as np


def cv2_add_text(image, text, left, top, textColor=(255, 0, 0), text_size=20):
    # cv2.putText cannot render Chinese characters, so draw with Pillow and convert back
    image = Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(image)
    # simsun.ttc ships with Windows; on other systems use a font that has Chinese glyphs
    font = ImageFont.truetype('simsun.ttc', text_size, encoding="utf-8")
    draw.text((left, top), text, textColor, font=font)
    return cv2.cvtColor(np.asarray(image), cv2.COLOR_RGB2BGR)


USERNAME = '136XXXX08'
PASSWORD = 'hXXXXXXXXXXX..'
SOFT_ID = '9XXXXXXX2'
CAPTCHA_KIND = '9101'
FILE_NAME = 'captcha3.png'
image = cv2.imread(FILE_NAME)
# The hint reads "Please click the top-left corner of the target slider"
image = cv2_add_text(image, '请点击目标滑块左上角', int(image.shape[1] / 10), int(image.shape[0] / 2), (255, 0, 0), 40)
client = Chaojiying(USERNAME, PASSWORD, SOFT_ID)
# Encode the annotated image as PNG bytes for upload
result = client.post_pic(cv2.imencode('.png', image)[1].tobytes(), CAPTCHA_KIND)
print(result)
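Since answers can still be wrong, one option is to combine `post_pic` with the `report_error` method defined in chaojiying.py. The sketch below is only an illustration under stated assumptions: `solve_with_feedback` and the `is_correct` callback are hypothetical names, `err_no == 0` is assumed to mean success, and the same image bytes are resubmitted on each attempt (a real site usually issues a fresh CAPTCHA after a failure, so the image would need to be fetched again).

```python
def solve_with_feedback(client, image_bytes, codetype, is_correct, max_attempts=3):
    """Submit a CAPTCHA up to max_attempts times, reporting rejected answers.

    client      -- object with post_pic(im, codetype) and report_error(im_id),
                   such as the Chaojiying class from chaojiying.py
    is_correct  -- hypothetical callback: takes pic_str and returns True if
                   the target site accepted the answer
    Returns the accepted pic_str, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        result = client.post_pic(image_bytes, codetype)
        if result.get('err_no') != 0:
            continue  # the platform returned an error, so there is no answer to report
        if is_correct(result['pic_str']):
            return result['pic_str']
        # The answer was rejected: report the picture ID back to the platform
        client.report_error(result['pic_id'])
    return None
```

A call might look like `solve_with_feedback(client, image_bytes, CAPTCHA_KIND, try_answer)`, where `try_answer` is your own function that submits the answer to the target site and reports whether it passed.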