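As apps carry more and more images, content safety review has become a must-have step.
This walkthrough shows, step by step, how to deploy an NSFW image detection model from Hugging Face on your own server.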
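1. Find the model: nsfw_image_detection
2. Verify locally how to use it
First install the transformers Python library, which loads and runs the model:
pip install transformers
Install the machine learning library:
pip install torch
Install the PIL library, which loads images into memory for the model:
pip install Pillow
Then run the sample code from the model's page: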
from PIL import Image
from transformers import pipeline
img = Image.open("<path_to_image_file>")
classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")
classifier(img)
This produces a result, so the image can be built around the same logic.
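The pipeline returns a list of label/score dictionaries, so the caller still has to turn that into a yes/no decision. Below is a minimal sketch of one way to do it. The helper name `is_nsfw`, the label name `nsfw`, and the 0.8 threshold are assumptions for illustration; check the label names against the model card and tune the threshold to your own tolerance for false positives.

```python
def is_nsfw(results, nsfw_label="nsfw", threshold=0.8):
    """Return True if the NSFW label's score reaches the threshold.

    `results` is a list of {"label": ..., "score": ...} dicts, the shape
    returned by an image-classification pipeline.
    """
    scores = {item["label"]: item["score"] for item in results}
    # A missing label counts as a score of 0.0, i.e. not flagged
    return scores.get(nsfw_label, 0.0) >= threshold


# Hypothetical outputs, shaped like the pipeline's return value
sample_flagged = [{"label": "nsfw", "score": 0.97}, {"label": "normal", "score": 0.03}]
sample_clean = [{"label": "normal", "score": 0.99}, {"label": "nsfw", "score": 0.01}]

print(is_nsfw(sample_flagged))
print(is_nsfw(sample_clean))
```

Keeping the decision in a small function like this makes it easy to change the threshold later without touching the serving code.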
3. Write the Dockerfile
FROM python:3.9-slim
WORKDIR /app
# Point apt at the Aliyun mirror, then install cmake
RUN echo "deb https://mirrors.aliyun.com/debian bookworm main contrib non-free" > /etc/apt/sources.list && \
    echo "deb https://mirrors.aliyun.com/debian-security bookworm-security main" >> /etc/apt/sources.list && \
    echo "deb https://mirrors.aliyun.com/debian bookworm-updates main contrib non-free" >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get install -y cmake
# Install the Hugging Face libraries and PyTorch from the Aliyun PyPI mirror
RUN pip3 install transformers datasets evaluate accelerate -i https://mirrors.aliyun.com/pypi/simple/
RUN pip3 install torch -i https://mirrors.aliyun.com/pypi/simple/
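The Dockerfile above builds a base image. Our project is deployed by an automated pipeline that builds a fresh image on every code change, and we don't want to repeat the dependency downloads on every build. So we first build a base image that only installs the environment and push it to the image registry. Suppose that image is wf.com/base/huggingface:2.0.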
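FROM wf.com/base/huggingface:2.0
# Pillow is needed to load images for the model
RUN pip install Pillow -i https://mirrors.aliyun.com/pypi/simple/
# Download models through the hf-mirror.com mirror
ENV HF_ENDPOINT="https://hf-mirror.com"
# WORKDIR creates /app if it does not exist
WORKDIR /app
COPY . .
CMD ["python", "app.py"]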
This is the image that the pipeline builds.
The logic of app.py is:
from io import BytesIO
import json
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.parse import urlparse, parse_qs

import requests
from PIL import Image
from transformers import pipeline

# Reuse one HTTP session and load the model once at startup
s = requests.Session()
classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")


class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the 'url' query parameter; reject requests that omit it
        params = parse_qs(urlparse(self.path).query)
        image_url = params.get('url', [''])[0]
        if not image_url:
            self.send_error(400, "Missing 'url' query parameter")
            return
        # Download the image and classify it
        response = s.get(image_url, timeout=10)
        image = Image.open(BytesIO(response.content))
        res = classifier(image)
        # Send the classification result as JSON
        self.send_response(200)
        self.send_header('Content-type', 'application/json')
        self.end_headers()
        self.wfile.write(json.dumps(res).encode('utf-8'))


if __name__ == '__main__':
    httpd = HTTPServer(('0.0.0.0', 80), SimpleHTTPRequestHandler)
    print("Serving at http://localhost:80")
    httpd.serve_forever()
The code above listens on port 80, accepts a url parameter, and classifies whether the image at that URL is NSFW.
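Because the service downloads whatever URL the caller supplies, it may be worth checking the URL before fetching it. The following is a minimal sketch under stated assumptions: the helper name `is_fetchable_image_url` and the rule set (http/https only, no literal private or loopback IP) are illustrative and not part of the original app.py, and the check does not resolve domain names.

```python
import ipaddress
from urllib.parse import urlparse


def is_fetchable_image_url(url):
    """Accept only http(s) URLs whose host is not a literal private,
    loopback, or link-local IP address."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        # The host is a domain name, not a literal IP; DNS is not checked here
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


print(is_fetchable_image_url("https://example.com/a.jpg"))
print(is_fetchable_image_url("http://127.0.0.1/a.jpg"))
print(is_fetchable_image_url("file:///etc/passwd"))
```

In the handler, a check like this could run right after the url parameter is read, returning a 400 response when it fails.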
k8s.yaml
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hugging-nsfw
  namespace: test
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
spec:
  rules:
    - host: hugging-nsfw.test.local.xxxx.com
      http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: hugging-nsfw
                port:
                  number: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hugging-nsfw
    jmx-type: service-jvm
  name: hugging-nsfw
  namespace: test
spec:
  ports:
    - name: http
      port: 80
      targetPort: 80
  selector:
    app: hugging-nsfw
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hugging-nsfw
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hugging-nsfw
  strategy:
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: hugging-nsfw
    spec:
      containers:
        - name: app
          image: wf.com/repo/hugging-nsfw:test--14877
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
              name: http
          stdin: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          tty: true
          resources:
            requests:
              cpu: 256m
              memory: 1024Mi
              ephemeral-storage: 100Mi
            limits:
              cpu: 4000m
              memory: 8Gi
              ephemeral-storage: 10Gi
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      dnsPolicy: ClusterFirst
      terminationGracePeriodSeconds: 100
      imagePullSecrets:
        - name: regcred
Apply this YAML file to Kubernetes and the self-hosted NSFW detection model is deployed.
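Once the service is reachable through the Ingress host, a client can call it with a plain GET request. Here is a minimal sketch using only the standard library. The helper names `build_check_url` and `check_image` and the example image URL are hypothetical. The point worth noting is that the image URL should be percent-encoded, since it may contain its own `?` and `&` characters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_check_url(service_base, image_url):
    """Build the GET URL, percent-encoding the image URL so its own
    '?' and '&' characters are not mistaken for service parameters."""
    return service_base.rstrip("/") + "/?" + urlencode({"url": image_url})


def check_image(service_base, image_url, timeout=30):
    """Call the service and return the decoded JSON result."""
    with urlopen(build_check_url(service_base, image_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Only builds the URL; check_image() needs the deployed service to be reachable
print(build_check_url("http://hugging-nsfw.test.local.xxxx.com",
                      "https://example.com/a.jpg?x=1&y=2"))
```

On the server side, `parse_qs` decodes the parameter back to the original image URL, so app.py needs no change to accept encoded input.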
Results
Result for an NSFW image:
Result for a non-NSFW image:
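After repeated testing, the model's accuracy looks fairly high: it catches essentially all of the NSFW images that appear in our app, and it is quick. Requests often take more than 100 ms because I am classifying remote images, which have to be downloaded before they can be classified. Local images are faster.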
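To see how much of a request's latency is download and how much is inference, the two stages can be timed separately. This is a minimal sketch: `timed` is a hypothetical helper, and `fake_download` and `fake_classify` are stand-ins for `s.get(image_url)` and `classifier(image)` in app.py so the example runs without the model or a network.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


# Stand-ins for the two stages; in app.py these would be
# s.get(image_url) and classifier(image).
def fake_download():
    time.sleep(0.05)  # simulate network latency
    return b"image-bytes"


def fake_classify(data):
    return [{"label": "normal", "score": 1.0}]


data, download_ms = timed(fake_download)
result, inference_ms = timed(fake_classify, data)
print(f"download: {download_ms:.1f} ms, inference: {inference_ms:.1f} ms")
```

Logging both numbers per request would show whether slow responses come from the image host or from the model itself.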
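Closing thoughts:
AI is moving fast. As a programmer, even if you can't train models, you should know how to use them. Once the flow above works end to end, every model on Hugging Face follows the same approach, and PaddlePaddle, ModelScope, Ollama, and others work along broadly similar lines.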