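Our old customer addresses are quite messy, and I now want to extract the province and city from them.
I started by looking into word segmentation, then found that Python has a library for exactly this.
cpca handles addresses well: it can extract the province, city, district, and area code (adcode) from messy addresses.
This article uses fastapi to build the server side.
A POST request calls cpca, and the extracted result is returned to the user.
My environment: Win7, Python 3.10.
Installing Python 3.10 on Win7 is covered in an earlier article, which also provides the installer.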
First, install fastapi, together with uvicorn and cpca, which the code below also imports:
pip install fastapi uvicorn cpca
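Normally fastapi is ready to use at this point, but my experience was bumpier.
After installing, I wrote a demo to run fastapi, and it failed right at the import.
I then kept downgrading, version by version, to get past the error.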
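When an import fails right after installation, it can help to see which versions pip actually installed before picking a version to downgrade to. Below is a minimal sketch using only the standard library. The package list is my assumption (fastapi depends on starlette and pydantic), not something taken from the original setup.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of packages worth checking; adjust as needed.
    names = ["fastapi", "starlette", "pydantic", "uvicorn", "cpca"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

A specific version can then be pinned with pip's `package==version` syntax.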
Code:
from hashlib import md5

import cpca
import uvicorn
from fastapi import Body, FastAPI

app = FastAPI()


@app.get("/")
def read_root():
    return {"Hello": "你好"}


@app.get("/bar")
async def read_bar(name: str, age: int = 28):
    # Query parameters: name is required, age defaults to 28.
    return {"name": name, "age": age}


@app.post("/soso")
async def read_soso(
    foo: int = Body(1, title="Description", embed=True),
    age: int = Body(..., le=120, title="Age", embed=True),
    # Raw string so that \d reaches the regex engine unchanged.
    # Newer fastapi releases (with Pydantic v2) call this parameter "pattern".
    name: str = Body(..., regex=r"^xiao\d+$", embed=True),
):
    return {"foo": foo, "age": age, "name": name}


@app.post("/cpca")
async def extract_address(
    timestamp: str = Body(..., min_length=13, title="Timestamp", embed=True),
    key: str = Body(..., min_length=32, title="Secret key", embed=True),
    content: str = Body(..., min_length=2, embed=True),
):
    # The expected key is the MD5 hex digest of timestamp + content, hashed twice.
    key2 = md5((timestamp + content).encode("utf-8")).hexdigest()
    key2 = md5(key2.encode("utf-8")).hexdigest()
    if key != key2:
        return {"status": False}

    df = cpca.transform([content])
    if df.empty:
        return {"status": False}

    def blank_if_none(value):
        # Fields cpca could not extract are returned as empty strings.
        return "" if value is None else value

    return {
        "status": True,
        "content": content,
        "province": blank_if_none(df["省"][0]),
        "city": blank_if_none(df["市"][0]),
        "district": blank_if_none(df["区"][0]),
        "areacode": blank_if_none(df["adcode"][0]),
    }


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8001)
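The /cpca endpoint only answers when the caller sends a key equal to the double MD5 of timestamp + content. As a rough client-side sketch, the request body could be built as follows. The function names and the sample address are hypothetical, and the millisecond timestamp is an assumption based on the 13-character minimum length. Only the standard library is used.

```python
import json
import time
import urllib.request
from hashlib import md5
from typing import Optional


def make_key(timestamp: str, content: str) -> str:
    """Hash timestamp + content with MD5 twice, mirroring the server-side check."""
    first = md5((timestamp + content).encode("utf-8")).hexdigest()
    return md5(first.encode("utf-8")).hexdigest()


def build_payload(content: str, timestamp: Optional[str] = None) -> dict:
    """Build the JSON body expected by the /cpca endpoint."""
    if timestamp is None:
        # Assumed format: Unix time in milliseconds (13 digits at present).
        timestamp = str(int(time.time() * 1000))
    return {
        "timestamp": timestamp,
        "key": make_key(timestamp, content),
        "content": content,
    }


def post_json(url: str, payload: dict) -> dict:
    """POST the payload as JSON and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical sample address, for illustration only.
    payload = build_payload("广东省深圳市南山区科技园")
    print(json.dumps(payload, ensure_ascii=False))
```

With the service running locally on the port used above, `post_json("http://127.0.0.1:8001/cpca", payload)` would send the request.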
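Once the service runs correctly, you can either set up a Python environment on the server,
or package it into an exe with pyinstaller and run that on the server. When packaging,
note that cpca has a resource folder that must be bundled along with it.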
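One way to bundle a package's data folder is pyinstaller's `--add-data` option, which takes a source path and a destination inside the bundle. The sketch below only locates the folder and prints a candidate command. The folder name `resources` and the script name `main.py` are assumptions, so check the installed cpca package for the real folder name. The separator is `os.pathsep`, the form older pyinstaller releases expect.

```python
import os
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def add_data_arg(package: str, folder: str) -> Optional[str]:
    """Return a 'SRC<sep>DEST' value for --add-data, or None if the folder is not found."""
    spec = find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package missing, or a plain module without a directory
    src = Path(list(spec.submodule_search_locations)[0]) / folder
    if not src.is_dir():
        return None
    # Destination keeps the same relative layout inside the bundle.
    return f"{src}{os.pathsep}{package}/{folder}"


if __name__ == "__main__":
    # "resources" is an assumed folder name; verify it in your installation.
    arg = add_data_arg("cpca", "resources")
    if arg is None:
        print("cpca resource folder not found; check the package layout")
    else:
        print(f'pyinstaller --add-data "{arg}" main.py')
```

Keeping the destination path identical to the package layout means code that looks up files relative to the package directory should still find them inside the bundle.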