mitmproxy is a free and open source interactive HTTPS proxy.
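This sentence from the official website sums up what mitmproxy is. MITM stands for man-in-the-middle attack. Unlike packet-capture tools such as Charles and Fiddler, mitmproxy lets you add custom extension scripts (written in Python) to implement extra processing.
Installation
Links
Official site: https://mitmproxy.org/
GitHub: https://github.com/mitmproxy
PyPI: https://pypi.org/project/mitmproxy/
Install with pip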
pip install mitmproxy
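Verifying the installation
Once installation finishes, three commands are available: mitmproxy, mitmdump, and mitmweb. Older releases of the mitmproxy console command did not run natively on Windows (this is not a problem), so use mitmdump to check that the installation succeeded by running: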
mitmdump --version
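The same check can be sketched from Python by looking the three commands up on PATH. This is only an illustration using the standard library; the helper name find_mitm_commands is hypothetical and not part of mitmproxy.

```python
import shutil

# The three command names mentioned in the text above.
COMMANDS = ("mitmproxy", "mitmdump", "mitmweb")


def find_mitm_commands():
    """Map each command name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in COMMANDS}


if __name__ == "__main__":
    for name, path in find_mitm_commands().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a command shows as not found, the Python scripts directory used by pip is probably not on PATH.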
Source:
Mitmproxy的使用_mitmproxy使用_xian_wwq的博客-CSDN博客
Setting up mitmproxy with Docker
docker pull mitmproxy/mitmproxy
docker run --rm -it -p 8082:8080 -p 8081:8081 -v /data/mitm/script/mitm/:/mitm/ mitmproxy/mitmproxy mitmweb -s /mitm/addons.py --set block_global=false --web-iface 0.0.0.0
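With the port mapping above, the proxy listens on host port 8082 and the mitmweb interface on 8081. A minimal sketch of pointing a Python client at that proxy, assuming the container runs on the same machine (so the address 127.0.0.1:8082 is an assumption) and using only the standard library:

```python
import urllib.request

# Assumption: the container runs locally, so the proxy published by
# "-p 8082:8080" is reachable at 127.0.0.1:8082.
PROXY = "http://127.0.0.1:8082"


def build_proxied_opener(proxy=PROXY):
    """Return a urllib opener that routes HTTP and HTTPS requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


# Usage (needs the container to be running; the target URL is a placeholder):
#   opener = build_proxied_opener()
#   with opener.open("http://example.com/", timeout=10) as resp:
#       print(resp.status)
```

HTTPS requests sent this way fail certificate verification until the client trusts the mitmproxy CA certificate.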
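Certificate configuration
Installing the certificate on a PC
To intercept HTTPS requests, mitmproxy needs a certificate. After installation mitmproxy provides its own set of CA certificates. Once the client trusts the mitmproxy CA certificate, mitmproxy can read the full contents of HTTPS requests; otherwise it cannot decrypt them. The CA certificates are generated the first time mitmdump is started:
Open the folder in File Explorer: C:\Users\Administrator\.mitmproxy
Double-click mitmproxy-ca.p12 to open the Certificate Import Wizard, then click Next through the steps. (This file also contains the CA private key; the mitmproxy documentation lists mitmproxy-ca-cert.p12 as the certificate-only file intended for Windows.)
Installing the certificate on a phone emulator
Drag mitmproxy-ca-cert.pem into the emulator window; it is saved to the shared folder automatically.
Open Settings, tap Security, choose Install from SD card, find the certificate file, and tap Install.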
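To check that the CA files were actually generated, one can list the configuration directory. A small sketch, assuming the default location .mitmproxy in the user's home directory (which matches the path above for the Administrator account); the helper name list_ca_files is hypothetical.

```python
from pathlib import Path


def list_ca_files(confdir=None):
    """Return the sorted names of the mitmproxy CA files found in confdir.

    Assumption: without an argument, the default ~/.mitmproxy directory is used.
    """
    confdir = Path(confdir) if confdir else Path.home() / ".mitmproxy"
    if not confdir.is_dir():
        return []  # mitmproxy has not been started yet, or uses another directory
    return sorted(p.name for p in confdir.glob("mitmproxy-ca*"))


if __name__ == "__main__":
    print(list_ca_files())
```

An empty list usually means no mitmproxy tool has been started yet under this user account.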
Source:
mitmproxy 的安装使用 与 模拟器上的证书配置_mitmproxy证书_Yy_Rose的博客-CSDN博客
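Command usage
After installation mitmproxy provides three executables: mitmproxy, mitmdump, and mitmweb. Run them directly from the console.
mitmproxy: an interactive capture interface in the terminal. It runs on Linux and macOS; older releases did not run natively on Windows.
mitmdump: captures traffic without an interactive interface. This is the command typically used on Windows, and the examples in this article use it.
mitmweb: opens a visual capture interface in the default browser. It is rarely needed.
Common options:
-w  write the captured flows to the given file
-s  run the given script while capturing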
mitmdump -s xxx.py
Source:
通过mitmproxy爬取西瓜视频app数据保存到mongodb数据库_1yshu的博客-CSDN博客_西瓜视频抓包
Script customization
The two methods used most often are:
def request(self, flow: mitmproxy.http.HTTPFlow):
def response(self, flow: mitmproxy.http.HTTPFlow):
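As a minimal sketch of how these two hooks fit into a script for mitmdump -s, the addon below counts requests and responses. The class name RequestCounter and its counters are hypothetical; type annotations are left out so the file has no imports, and the hooks receive the same HTTPFlow object either way.

```python
class RequestCounter:
    """Count the requests and responses that pass through the proxy."""

    def __init__(self):
        self.requests = 0
        self.responses = 0

    def request(self, flow):
        # Called once the full client request has been read.
        self.requests += 1
        print(f"[{self.requests}] {flow.request.method} {flow.request.url}")

    def response(self, flow):
        # Called once the full server response has been read.
        self.responses += 1
        print(f"    -> {flow.response.status_code} for {flow.request.url}")


# mitmproxy looks for a module-level "addons" list in scripts loaded with -s.
addons = [RequestCounter()]
```

Saved as, say, counter.py (a placeholder name), it would be started with mitmdump -s counter.py.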
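request()
Attribute                     Description
request = flow.request        The request object; its attributes hold the request information
request.url                   Request URL (str). Assigning to url may not take effect as expected, because the URL is a composite of host, path, and query; it is safer to modify those components individually
request.host                  Request host (str)
request.headers               Request headers, a Headers object (dict-like)
request.content               Request body (bytes)
request.text                  Request body (str)
request.json()                Request body parsed as JSON (typically a dict)
request.data                  Underlying request data (protocol, headers, body, timestamps, and so on)
request.method                Request method (str), such as POST or GET
request.scheme                Scheme (str), such as http or https
request.path                  Request path (str): the part of the URL after the host, including the query string
request.query                 Key-value query parameters from the URL, a MultiDictView (dict-like)
request.query.keys()          Names of all query parameters
request.query.get(keyname)    Value of the query parameter named keyname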
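The attributes above can be combined in a request hook. The following is a sketch only: the host names OLD_HOST and NEW_HOST and the helper summarize_request are hypothetical, and the redirect is done by changing the host component rather than the whole URL, as the table recommends.

```python
# Hypothetical host names used only for illustration.
OLD_HOST = "old.example.com"
NEW_HOST = "new.example.com"


def summarize_request(flow):
    """Collect a few of the attributes listed above into a plain dict."""
    req = flow.request
    return {
        "method": req.method,
        "scheme": req.scheme,
        "host": req.host,
        "path": req.path,
        "query_keys": list(req.query.keys()),
    }


def request(flow):
    # Redirect by changing one component instead of assigning to request.url.
    if flow.request.host == OLD_HOST:
        flow.request.host = NEW_HOST
    print(summarize_request(flow))
```

Module-level functions named request and response are picked up by mitmdump -s in the same way as methods on an addon class.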
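response()
Attribute                     Description
response = flow.response      The response object; its attributes hold the response information
response.status_code          HTTP status code
response.text                 Response body (str)
response.content              Response body (bytes)
response.headers              Response headers, a Headers object (dict-like)
response.cookies              Cookies set by the response
response.set_text(text)       Replace the response body with the given str
response.get_text()           Response body (str)
flow.response = flow.response.make(status_code, content, headers)    Build a new response and assign it to the flow. make is a class-level constructor, so calling it through flow.response only works where a response already exists, for example in the response hook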
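A sketch of modifying a response with these attributes: it rewrites a string in successful HTML responses and tags the result with a header. The strings OLD_TEXT and NEW_TEXT and the header name x-modified-by are hypothetical.

```python
# Hypothetical replacement rule used only for illustration.
OLD_TEXT = "Hello"
NEW_TEXT = "Goodbye"


def response(flow):
    resp = flow.response
    # Leave error responses and non-HTML content untouched.
    if resp.status_code != 200:
        return
    if "text/html" not in resp.headers.get("content-type", ""):
        return
    if not resp.text:
        return  # empty or undecodable body
    # Assigning to text replaces the body; mitmproxy re-encodes it for the client.
    resp.text = resp.text.replace(OLD_TEXT, NEW_TEXT)
    resp.headers["x-modified-by"] = "mitmproxy-script"
```

Checking the content type first keeps binary bodies such as images from being decoded as text.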
Source:
https://www.cnblogs.com/yoyo1216/p/16165758.html
mitmproxy_wenxiaoba的博客-CSDN博客
Python script example
import json
import os
import pickle

import mitmproxy.http


class GetSeq:
    def __init__(self, domains=None, url_pattern=None):
        self.num = 1
        self.dirpath = "./flows/"
        # Create the output directory if it does not exist yet.
        os.makedirs(self.dirpath, exist_ok=True)
        # Use None as the default to avoid sharing one mutable list between instances.
        self.domains = domains or []
        self.url_pattern = url_pattern

    def http_connect(self, flow: mitmproxy.http.HTTPFlow):
        """
        An HTTP CONNECT request was received. Setting a non 2xx response on
        the flow will return the response to the client and abort the
        connection. CONNECT requests and responses do not generate the usual
        HTTP handler events. CONNECT requests are only valid in regular and
        upstream proxy modes.
        """

    def requestheaders(self, flow: mitmproxy.http.HTTPFlow):
        """
        HTTP request headers were successfully read. At this point, the body
        is empty.
        """

    def request(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP request has been read.
        """

    def responseheaders(self, flow: mitmproxy.http.HTTPFlow):
        """
        HTTP response headers were successfully read. At this point, the body
        is empty.
        """

    def response(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP response has been read.
        """
        # Adjust the saving code to your needs; this is only a reference.
        def save_flow():
            fname = "{}flow-{:0>3d}-{}.pkl".format(self.dirpath, self.num, flow.request.host)
            with open(fname, "wb") as f:
                pickle.dump({
                    "num": self.num,
                    "request": flow.request,
                    "response": flow.response,
                }, f)
            log_data = dict(
                num=self.num,
                url=flow.request.url,
                fname=fname,
            )
            with open("flow_que.log", "a+", encoding="utf8") as f:
                # Write one JSON object per line so the log can be parsed later.
                f.write(json.dumps(log_data) + "\n")
            self.num += 1

        # Add your own filtering rules here.
        is_json = flow.request.headers.get("content-type", None) == "application/json"
        # With no domains configured, every flow matches.
        domain_match = not self.domains or any(
            domain in flow.request.url for domain in self.domains
        )
        # Save each flow at most once, even if several rules match.
        if is_json or domain_match:
            save_flow()

    def error(self, flow: mitmproxy.http.HTTPFlow):
        """
        An HTTP error has occurred, e.g. invalid server responses, or
        interrupted connections. This is distinct from a valid server HTTP
        error response, which is simply a response with an HTTP error code.
        """


addons = [
    GetSeq(
        domains=[
            "baidu.com",
        ],
        url_pattern=None,
    )
]
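The script above appends one JSON object per saved flow to flow_que.log. A sketch of reading that log back, assuming only the three keys written above (num, url, fname); the helper read_flow_log is hypothetical. It uses json.JSONDecoder.raw_decode so that it copes with objects written back to back as well as one per line.

```python
import json


def read_flow_log(path="flow_que.log"):
    """Return the list of JSON objects stored in the log file."""
    with open(path, encoding="utf8") as f:
        text = f.read()
    decoder = json.JSONDecoder()
    entries, pos = [], 0
    while pos < len(text):
        # Skip whitespace (including newlines) between objects.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        entries.append(obj)
    return entries


if __name__ == "__main__":
    for entry in read_flow_log():
        print(entry["num"], entry["url"], entry["fname"])
```

Each fname points at a pickle file; loading it with pickle.load requires mitmproxy to be importable, because the pickled request and response are mitmproxy objects.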
Saving JSON to MySQL
import os

import mitmproxy.http
import pymysql


class GetSeq:
    def __init__(self, domains=None, url_pattern=None):
        self.num = 1
        self.dirpath = "./flows/"
        # Create the output directory if it does not exist yet.
        os.makedirs(self.dirpath, exist_ok=True)
        # Use None as the default to avoid sharing one mutable list between instances.
        self.domains = domains or []
        self.url_pattern = url_pattern

    def http_connect(self, flow: mitmproxy.http.HTTPFlow):
        """
        An HTTP CONNECT request was received. Setting a non 2xx response on
        the flow will return the response to the client and abort the
        connection. CONNECT requests and responses do not generate the usual
        HTTP handler events. CONNECT requests are only valid in regular and
        upstream proxy modes.
        """

    def requestheaders(self, flow: mitmproxy.http.HTTPFlow):
        """
        HTTP request headers were successfully read. At this point, the body
        is empty.
        """

    def request(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP request has been read.
        """

    def responseheaders(self, flow: mitmproxy.http.HTTPFlow):
        """
        HTTP response headers were successfully read. At this point, the body
        is empty.
        """

    def response(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP response has been read.
        """
        # Adjust the saving code to your needs; this is only a reference.
        def save_flow():
            url = flow.request.url
            print("url*********************" + url)
            if url.startswith("https://www.douyin.com/aweme/v1/web/aweme/post/"):
                print("url=====>>>" + url)
                self.save2db(url, flow.response.text)
            # Kuaishou
            # print("url=========>>>>>>>>>>" + flow.request.url)
            # This branch is not reached while the filter below only calls
            # save_flow() for Douyin URLs; widen that filter to use it.
            if flow.request.url.startswith("https://www.kuaishou.com/graphql"):
                post_data = flow.request.text
                print("post_url========>>>>>>" + flow.request.url)
                print("post_data=====" + post_data)
                print("resp_txt====" + flow.response.text)

        # Add your own filtering rules here.
        if flow.request.url.startswith("https://www.douyin.com/aweme/v1/web/aweme/post/"):
            # with open("flow_que.log_" + str(self.num), "a+", encoding="utf8") as f:
            #     s = flow.response.text
            #     f.write(s)
            try:
                flow.request.urlencoded_form.keys()
                save_flow()
            except Exception as e:
                print("save error happen " + str(e))
            self.num += 1

    def error(self, flow: mitmproxy.http.HTTPFlow):
        """
        An HTTP error has occurred, e.g. invalid server responses, or
        interrupted connections. This is distinct from a valid server HTTP
        error response, which is simply a response with an HTTP error code.
        """

    # Save to MySQL
    def save2db(self, url, resp_txt):
        # Connect to the database
        conn = pymysql.connect(host='192.168.10.231',
                               port=3307,
                               user='bj',
                               password='bj2016',
                               database='test')
        # Create a cursor
        cursor = conn.cursor()
        # Parameterized query: pymysql quotes and escapes the values (including
        # JSON text), so they need no manual quoting or escape_string.
        sql = "insert into test.mitmproxy_log (url,resp_txt) values (%s,%s)"
        try:
            cursor.execute(sql, (url, resp_txt))
            conn.commit()
        except Exception as e:
            conn.rollback()
            print("insert error " + str(e))
        finally:
            cursor.close()
            # Close the connection as well as the cursor.
            conn.close()


addons = [
    GetSeq(
        domains=[
            "baidu.com",
        ],
        url_pattern=None,
    )
]
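Response bodies often contain quotes, which is what makes hand-built SQL strings fragile. A parameterized query lets the driver handle quoting. The sketch below illustrates this with sqlite3 from the standard library as a stand-in for MySQL, so it runs without a database server; the table layout (url, resp_txt) follows the script above, and the helper save_response is hypothetical. Note that sqlite3 uses ? placeholders, while pymysql uses %s with the values passed as the second argument to cursor.execute.

```python
import sqlite3


def save_response(conn, url, resp_txt):
    """Insert one row; the driver takes care of quoting and escaping."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO mitmproxy_log (url, resp_txt) VALUES (?, ?)",
            (url, resp_txt),
        )


# In-memory database used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mitmproxy_log (url TEXT, resp_txt TEXT)")

# A body with both quote characters, stored without any manual escaping.
body = '{"name": "O\'Reilly"}'
save_response(conn, "https://example.com/api", body)
print(conn.execute("SELECT url, resp_txt FROM mitmproxy_log").fetchall())
```

Opening one connection and reusing it across flows, rather than connecting for every response, is also worth considering for busy captures.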
Source:
mitmproxy_录制接口并保存到mysql(踩坑史)_你是猴子请来的救兵吗!!的博客-CSDN博客_mitmproxy怎么将访问记录存入数据库
https://www.cnblogs.com/lynsha/p/16517354.html