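Anger is ultimately a tool, a means to an end: shouting, or even pounding the table, is meant to intimidate the other person into compliance, because the angry person cannot find a better way.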
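Series contents
- Project setup
- App login and gateway
- App articles
- We-media platform (author back office)
- Content review (automatic)
Contents
- Series contents
- I. Aliyun APIs
- 1. Automatic review flow
- 2. Aliyun services
- ⑴. Content Moderation
- ⑵. Text detection
- ⑶. Image detection
- 3. Project integration
- ⑴. Utility classes
- ①. Content review
- ②. Image review
- ③. Local image detection
- ④. Uploading images to a custom image library
- ⑤. Upload credentials
- ⑵. Aliyun AccessKey
- ⑶. Auto-configuration
- ⑷. Test class
- II. Saving articles on the app side
- 1. Table structure
- 2. Distributed IDs
- ⑴. The problem
- ⑵. Choosing a technique
- ⑶. Snowflake algorithm
- ⑷. Nacos configuration
- 3. Implementing the endpoint
- ⑴. Requirements
- ⑵. Feign interface
- ①. Interface description
- ②. ArticleDto
- ③. ResponseResult
- ⑶. Adding the dependency
- ⑷. Defining the interface
- ⑸. Implementing the interface
- ⑹. Mapper
- ⑺. Service
- ⑻. ApArticleConfig
- ⑼. ServiceImpl
- ⑽. Defining the endpoint
- ⑾. Postman
- ①. Saving an article
- ②. Updating an article
- ⑿. Pitfalls
- III. Article review
- 1. Method definitions
- ⑴. Service
- ⑵. Feign remote call
- ⑶. ServiceImpl
- ⑷. Test class
- 2. Service degradation for Feign calls
- ⑴. Implementation class
- ⑵. Remote interface target
- ⑶. Package scanning
- ⑷. Nacos configuration
- ⑸. Testing
- ①. Adding a timeout
- ②. Test class
- 3. Asynchronous invocation
- ⑴. Method annotation
- ⑵. Invocation
- ⑶. Bootstrap class annotation
- 4. End-to-end test
- IV. Extending the review
- 1. Self-managed sensitive words
- ⑴. Choosing a technique
- ⑵. Table
- ⑶. Entity class
- ⑷. Mapper
- ⑸. ServiceImpl
- 2. Recognizing text in images
- ⑴. OCR
- ⑵. Getting-started example
- ①. Adding the module
- ②. pom
- ③. Font library
- ④. Application
- ⑶. Image recognition utility class
- ①. pom configuration
- ②. Utility class
- ③. Auto-configuration
- ④. tess4j configuration
- ⑷. Implementation class
- V. Article details
- 1. Generating static files
- ⑴. Approach
- ⑵. Service
- ⑶. ServiceImpl
- ⑷. Uploading to MinIO
- ⑸. Application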
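I. Aliyun APIs
1. Automatic review flow
- Review starts once an article is published from the we-media side.
- The review covers the article's content: its text and its images.
- Text is checked through a third-party API.
- Images are checked through a third-party API. Because they are stored in MinIO, they must be downloaded before they can be checked.
- If the review fails, the status of the we-media article is updated: status 2 means the review failed, status 3 means the article is handed to manual review.
- If the review passes, the article that the app side needs is created in the article microservice.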
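The branching in the list above can be sketched as a small mapping from the scan verdict to an article status. This is only an illustration: the class and method names are hypothetical, the verdict strings "pass", "review" and "block" are the ones the scan utilities in this article check for, the value 9 for a passed article is taken from the updateWmNews call later in this article, and sending unknown verdicts to manual review is an assumption of this sketch.

```java
/** Hypothetical helper: maps a scan verdict to the article status used in this series. */
class ReviewDecision {
    static final short STATUS_FAIL = 2;          // review failed
    static final short STATUS_MANUAL_REVIEW = 3; // handed to manual review
    static final short STATUS_PUBLISHED = 9;     // review passed, article is saved on the app side

    /** Returns the status that should be written for a given suggestion. */
    static short statusFor(String suggestion) {
        if ("block".equals(suggestion)) {
            return STATUS_FAIL;
        }
        if ("review".equals(suggestion)) {
            return STATUS_MANUAL_REVIEW;
        }
        if ("pass".equals(suggestion)) {
            return STATUS_PUBLISHED;
        }
        // Assumption of this sketch: anything unexpected (including null) goes to a human.
        return STATUS_MANUAL_REVIEW;
    }
}
```

Keeping the mapping in one place makes it harder for the text branch and the image branch to drift apart.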
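2. Aliyun services
⑴. Content Moderation
Content Moderation is a product that detects risks in multimedia content. It can check images, video, audio and text, and helps find pornographic, violent, frightening, sensitive, prohibited or abusive content. This greatly reduces the cost of manual review, improves content quality, and keeps the platform orderly for its users.
Aliyun documentation: https://help.aliyun.com/product/28415.html
⑵. Text detection
Synchronous text detection: https://help.aliyun.com/document_detail/70439.html
Text anti-spam detection Java SDK: https://help.aliyun.com/document_detail/53427.html
⑶. Image detection
Synchronous image detection: https://help.aliyun.com/document_detail/70292.html
Image review Java SDK: https://help.aliyun.com/document_detail/53424.html
3. Project integration
⑴. Utility classes
①. Content review
Create the file heima-leadnews-common/src/main/java/com/heima/common/aliyun/GreenTextScan.java: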
@Getter
@Setter
@Component
@ConfigurationProperties(prefix = "aliyun")
public class GreenTextScan {
private String accessKeyId;
private String secret;
public Map greeTextScan(String content) throws Exception {
System.out.println(accessKeyId);
IClientProfile profile = DefaultProfile
.getProfile("cn-shanghai", accessKeyId, secret);
DefaultProfile.addEndpoint("cn-shanghai", "cn-shanghai", "Green", "green.cn-shanghai.aliyuncs.com");
IAcsClient client = new DefaultAcsClient(profile);
TextScanRequest textScanRequest = new TextScanRequest();
textScanRequest.setAcceptFormat(FormatType.JSON); // response format of the API
textScanRequest.setHttpContentType(FormatType.JSON);
textScanRequest.setMethod(com.aliyuncs.http.MethodType.POST); // request method
textScanRequest.setEncoding("UTF-8");
textScanRequest.setRegionId("cn-shanghai");
List<Map<String, Object>> tasks = new ArrayList<Map<String, Object>>();
Map<String, Object> task1 = new LinkedHashMap<String, Object>();
task1.put("dataId", UUID.randomUUID().toString());
// Text to check; it must not exceed 10,000 characters
task1.put("content", content);
tasks.add(task1);
JSONObject data = new JSONObject();
// Detection scene; pass "antispam" for text spam detection
data.put("scenes", Arrays.asList("antispam"));
data.put("tasks", tasks);
System.out.println(JSON.toJSONString(data, true));
textScanRequest.setHttpContent(data.toJSONString().getBytes("UTF-8"), "UTF-8", FormatType.JSON);
// Always set the timeouts
textScanRequest.setConnectTimeout(3000);
textScanRequest.setReadTimeout(6000);
Map<String, String> resultMap = new HashMap<>();
try {
HttpResponse httpResponse = client.doAction(textScanRequest);
if (httpResponse.isSuccess()) {
JSONObject scrResponse = JSON.parseObject(new String(httpResponse.getHttpContent(), "UTF-8"));
System.out.println(JSON.toJSONString(scrResponse, true));
if (200 == scrResponse.getInteger("code")) {
JSONArray taskResults = scrResponse.getJSONArray("data");
for (Object taskResult : taskResults) {
if (200 == ((JSONObject) taskResult).getInteger("code")) {
JSONArray sceneResults = ((JSONObject) taskResult).getJSONArray("results");
for (Object sceneResult : sceneResults) {
String scene = ((JSONObject) sceneResult).getString("scene");
String label = ((JSONObject) sceneResult).getString("label");
String suggestion = ((JSONObject) sceneResult).getString("suggestion");
System.out.println("label = [" + label + "]");
if (!suggestion.equals("pass")) {
resultMap.put("suggestion", suggestion);
resultMap.put("label", label);
return resultMap;
}
}
} else {
return null;
}
}
resultMap.put("suggestion", "pass");
return resultMap;
} else {
return null;
}
} else {
return null;
}
} catch (ServerException e) {
e.printStackTrace();
} catch (ClientException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
}
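The comment inside greeTextScan notes that the text to check must stay within 10,000 characters, but the class itself does not enforce that. A minimal sketch of splitting longer text into chunks before submitting each one is shown below. The class and method names are hypothetical, and it assumes the limit can be approximated by Java's String length (UTF-16 code units), which the source does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: cuts text into pieces of at most maxChars characters. */
class TextChunker {
    static final int MAX_CHARS = 10_000; // limit mentioned in the GreenTextScan comment

    static List<String> split(String text, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        if (text == null || text.isEmpty()) {
            return chunks;
        }
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + maxChars, text.length());
            // Where possible, do not cut between the two halves of a surrogate pair.
            if (end < text.length() && end - 1 > start
                    && Character.isHighSurrogate(text.charAt(end - 1))) {
                end--;
            }
            chunks.add(text.substring(start, end));
            start = end;
        }
        return chunks;
    }
}
```

Each chunk could then be passed to greeTextScan in turn, stopping at the first result whose suggestion is not "pass".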
②. Image review
Create the file heima-leadnews-common/src/main/java/com/heima/common/aliyun/GreenImageScan.java:
@Getter
@Setter
@Component
@ConfigurationProperties(prefix = "aliyun")
public class GreenImageScan {
private String accessKeyId;
private String secret;
private String scenes;
public Map imageScan(List<byte[]> imageList) throws Exception {
IClientProfile profile = DefaultProfile
.getProfile("cn-shanghai", accessKeyId, secret);
DefaultProfile
.addEndpoint("cn-shanghai", "cn-shanghai", "Green", "green.cn-shanghai.aliyuncs.com");
IAcsClient client = new DefaultAcsClient(profile);
ImageSyncScanRequest imageSyncScanRequest = new ImageSyncScanRequest();
// Response format of the API
imageSyncScanRequest.setAcceptFormat(FormatType.JSON);
// Request method
imageSyncScanRequest.setMethod(MethodType.POST);
imageSyncScanRequest.setEncoding("utf-8");
// Both HTTP and HTTPS are supported
imageSyncScanRequest.setProtocol(ProtocolType.HTTP);
JSONObject httpBody = new JSONObject();
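/**
 * Scenes to detect. Billing follows the scenes passed here.
 * One request can check several images, and each image can be checked against several risk scenes; billing is per scene.
 * For example, checking 2 images with the scenes porn and terrorism is billed as 2 porn checks plus 2 terrorism checks.
 * porn: pornography detection
 */
httpBody.put("scenes", Arrays.asList(scenes.split(",")));
/**
 * If the files to check are stored on your own server, the code below uploads them and produces a URL,
 * which is then passed to the service as the image address.
 */
/**
 * Images to check, one task per image.
 * When several images are checked together, the total time is determined by the slowest image.
 * Batch checks usually have a longer average response time than single-image checks; the more images in a batch, the more likely the response time grows.
 * The SDK sample checks a single image; for a batch, build one task per image, as the loop below does.
 */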
ClientUploader clientUploader = ClientUploader.getImageClientUploader(profile, false);
String url = null;
List<JSONObject> urlList = new ArrayList<JSONObject>();
for (byte[] bytes : imageList) {
url = clientUploader.uploadBytes(bytes);
JSONObject task = new JSONObject();
task.put("dataId", UUID.randomUUID().toString());
//设置图片链接为上传后的url
task.put("url", url);
task.put("time", new Date());
urlList.add(task);
}
httpBody.put("tasks", urlList);
imageSyncScanRequest.setHttpContent(org.apache.commons.codec.binary.StringUtils.getBytesUtf8(httpBody.toJSONString()),
"UTF-8", FormatType.JSON);
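/**
 * Set the timeouts. The server-side end-to-end processing timeout is 10 seconds, so configure accordingly.
 * If ReadTimeout is shorter than the server's processing time, the call fails with a read timeout exception.
 */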
imageSyncScanRequest.setConnectTimeout(3000);
imageSyncScanRequest.setReadTimeout(10000);
HttpResponse httpResponse = null;
try {
httpResponse = client.doAction(imageSyncScanRequest);
} catch (Exception e) {
e.printStackTrace();
}
Map<String, String> resultMap = new HashMap<>();
//服务端接收到请求,并完成处理返回的结果
if (httpResponse != null && httpResponse.isSuccess()) {
JSONObject scrResponse = JSON.parseObject(org.apache.commons.codec.binary.StringUtils.newStringUtf8(httpResponse.getHttpContent()));
System.out.println(JSON.toJSONString(scrResponse, true));
int requestCode = scrResponse.getIntValue("code");
//每一张图片的检测结果
JSONArray taskResults = scrResponse.getJSONArray("data");
if (200 == requestCode) {
for (Object taskResult : taskResults) {
//单张图片的处理结果
int taskCode = ((JSONObject) taskResult).getIntValue("code");
//图片要检测的场景的处理结果, 如果是多个场景,则会有每个场景的结果
JSONArray sceneResults = ((JSONObject) taskResult).getJSONArray("results");
if (200 == taskCode) {
for (Object sceneResult : sceneResults) {
String scene = ((JSONObject) sceneResult).getString("scene");
String label = ((JSONObject) sceneResult).getString("label");
String suggestion = ((JSONObject) sceneResult).getString("suggestion");
// Act on scene and suggestion as needed
System.out.println("scene = [" + scene + "]");
System.out.println("suggestion = [" + suggestion + "]");
System.out.println("label = [" + label + "]");
if (!suggestion.equals("pass")) {
resultMap.put("suggestion", suggestion);
resultMap.put("label", label);
return resultMap;
}
}
} else {
//单张图片处理失败, 原因视具体的情况详细分析
System.out.println("task process fail. task response:" + JSON.toJSONString(taskResult));
return null;
}
}
resultMap.put("suggestion","pass");
return resultMap;
} else {
// The request as a whole failed; the cause depends on the specific case
System.out.println("the whole image scan request failed. response:" + JSON.toJSONString(scrResponse));
return null;
}
}
return null;
}
}
③. Local image detection
Create the file heima-leadnews-common/src/main/java/com/heima/common/aliyun/util/ClientUploader.java:
/**
 * Uploads local image files so that they can be checked
 */
public class ClientUploader {
private IClientProfile profile;
private volatile UploadCredentials uploadCredentials;
private Map<String, String> headers;
private String prefix;
private boolean internal = false;
private Object lock = new Object();
private ClientUploader(IClientProfile profile, String prefix, boolean internal) {
this.profile = profile;
this.uploadCredentials = null;
this.headers = new HashMap<String, String>();
this.prefix = prefix;
this.internal = internal;
}
public static ClientUploader getImageClientUploader(IClientProfile profile, boolean internal){
return new ClientUploader(profile, "images", internal);
}
public static ClientUploader getVideoClientUploader(IClientProfile profile, boolean internal){
return new ClientUploader(profile, "videos", internal);
}
public static ClientUploader getVoiceClientUploader(IClientProfile profile, boolean internal){
return new ClientUploader(profile, "voices", internal);
}
public static ClientUploader getFileClientUploader(IClientProfile profile, boolean internal){
return new ClientUploader(profile, "files", internal);
}
/**
* 上传并获取上传后的图片链接
* @param filePath
* @return
*/
public String uploadFile(String filePath){
FileInputStream inputStream = null;
OSSClient ossClient = null;
try {
File file = new File(filePath);
UploadCredentials uploadCredentials = getCredentials();
if(uploadCredentials == null){
throw new RuntimeException("can not get upload credentials");
}
ObjectMetadata meta = new ObjectMetadata();
meta.setContentLength(file.length());
inputStream = new FileInputStream(file);
ossClient = new OSSClient(getOssEndpoint(uploadCredentials), uploadCredentials.getAccessKeyId(), uploadCredentials.getAccessKeySecret(), uploadCredentials.getSecurityToken());
String object = uploadCredentials.getUploadFolder() + '/' + this.prefix + '/' + String.valueOf(filePath.hashCode());
PutObjectResult ret = ossClient.putObject(uploadCredentials.getUploadBucket(), object, inputStream, meta);
return "oss://" + uploadCredentials.getUploadBucket() + "/" + object;
} catch (Exception e) {
throw new RuntimeException("upload file fail.", e);
} finally {
if(ossClient != null){
ossClient.shutdown();
}
if(inputStream != null){
try {
inputStream.close();
}catch (Exception e){
}
}
}
}
private String getOssEndpoint(UploadCredentials uploadCredentials){
if(this.internal){
return uploadCredentials.getOssInternalEndpoint();
}else{
return uploadCredentials.getOssEndpoint();
}
}
/**
* 上传并获取上传后的图片链接
* @param bytes
* @return
*/
public String uploadBytes(byte[] bytes){
OSSClient ossClient = null;
try {
UploadCredentials uploadCredentials = getCredentials();
if(uploadCredentials == null){
throw new RuntimeException("can not get upload credentials");
}
ossClient = new OSSClient(getOssEndpoint(uploadCredentials), uploadCredentials.getAccessKeyId(), uploadCredentials.getAccessKeySecret(), uploadCredentials.getSecurityToken());
String object = uploadCredentials.getUploadFolder() + '/' + this.prefix + '/' + UUID.randomUUID().toString();
PutObjectResult ret = ossClient.putObject(uploadCredentials.getUploadBucket(), object, new ByteArrayInputStream(bytes));
return "oss://" + uploadCredentials.getUploadBucket() + "/" + object;
} catch (Exception e) {
throw new RuntimeException("upload file fail.", e);
} finally {
if(ossClient != null){
ossClient.shutdown();
}
}
}
public void addHeader(String key, String value){
this.headers.put(key, value);
}
private UploadCredentials getCredentials() throws Exception{
if(this.uploadCredentials == null || this.uploadCredentials.getExpiredTime() < System.currentTimeMillis()){
synchronized(lock){
if(this.uploadCredentials == null || this.uploadCredentials.getExpiredTime() < System.currentTimeMillis()){
this.uploadCredentials = getCredentialsFromServer();
}
}
}
return this.uploadCredentials;
}
/**
* 从服务器端获取上传凭证
* @return
* @throws Exception
*/
private UploadCredentials getCredentialsFromServer() throws Exception{
UploadCredentialsRequest uploadCredentialsRequest = new UploadCredentialsRequest();
uploadCredentialsRequest.setAcceptFormat(FormatType.JSON); // 指定api返回格式
uploadCredentialsRequest.setMethod(com.aliyuncs.http.MethodType.POST); // 指定请求方法
uploadCredentialsRequest.setEncoding("utf-8");
uploadCredentialsRequest.setProtocol(ProtocolType.HTTP);
for (Map.Entry<String, String> kv : this.headers.entrySet()) {
uploadCredentialsRequest.putHeaderParameter(kv.getKey(), kv.getValue());
}
uploadCredentialsRequest.setHttpContent(new JSONObject().toJSONString().getBytes("UTF-8"), "UTF-8", FormatType.JSON);
IAcsClient client = null;
try{
client = new DefaultAcsClient(profile);
HttpResponse httpResponse = client.doAction(uploadCredentialsRequest);
if (httpResponse.isSuccess()) {
JSONObject scrResponse = JSON.parseObject(new String(httpResponse.getHttpContent(), "UTF-8"));
if (200 == scrResponse.getInteger("code")) {
JSONObject data = scrResponse.getJSONObject("data");
return new UploadCredentials(data.getString("accessKeyId"), data.getString("accessKeySecret"),
data.getString("securityToken"), data.getLongValue("expiredTime"),
data.getString("ossEndpoint"), data.getString("ossInternalEndpoint"), data.getString("uploadBucket"), data.getString("uploadFolder"));
}
String requestId = scrResponse.getString("requestId");
throw new RuntimeException("get upload credential from server fail. requestId:" + requestId + ", code:" + scrResponse.getInteger("code"));
}
throw new RuntimeException("get upload credential from server fail. http response status:" + httpResponse.getStatus());
}finally {
// client is still null if the constructor threw, so guard the shutdown
if (client != null) {
client.shutdown();
}
}
}
}
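getCredentials in the class above caches the upload credentials and refreshes them with a double-checked lock: a volatile field is read first, and the lock is taken only when the value is missing or expired. The same pattern in isolation might look like the sketch below. All names here are hypothetical, and the clock is injected only so that the behaviour can be exercised without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical value holder with an expiry time in epoch milliseconds. */
class ExpiringValue<T> {
    final T value;
    final long expiresAtMillis;

    ExpiringValue(T value, long expiresAtMillis) {
        this.value = value;
        this.expiresAtMillis = expiresAtMillis;
    }
}

/** Hypothetical cache that reloads its value once it has expired. */
class CredentialCache<T> {
    private final Supplier<ExpiringValue<T>> loader;
    private final LongSupplier clock;
    private final Object lock = new Object();
    private volatile ExpiringValue<T> current; // volatile so the unlocked read sees a fully built value

    CredentialCache(Supplier<ExpiringValue<T>> loader, LongSupplier clock) {
        this.loader = loader;
        this.clock = clock;
    }

    T get() {
        long now = clock.getAsLong();
        ExpiringValue<T> local = current;
        if (local == null || local.expiresAtMillis < now) {
            synchronized (lock) {
                // Second check: another thread may have refreshed while we waited for the lock.
                local = current;
                if (local == null || local.expiresAtMillis < now) {
                    local = loader.get();
                    current = local;
                }
            }
        }
        return local.value;
    }
}
```

Reading the field into a local variable once also avoids the value changing between the check and the return.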
④. Uploading images to a custom image library
Create the file heima-leadnews-common/src/main/java/com/heima/common/aliyun/util/CustomLibUploader.java:
/**
 * Uploads images to a custom image library
 */
public class CustomLibUploader {
public String uploadFile(String host, String uploadFolder, String ossAccessKeyId,
String policy, String signature,
String filepath) throws Exception {
LinkedHashMap<String, String> textMap = new LinkedHashMap<String, String>();
// key
String objectName = uploadFolder + "/imglib_" + UUID.randomUUID().toString() + ".jpg";
textMap.put("key", objectName);
// Content-Disposition
textMap.put("Content-Disposition", "attachment;filename="+filepath);
// OSSAccessKeyId
textMap.put("OSSAccessKeyId", ossAccessKeyId);
// policy
textMap.put("policy", policy);
// Signature
textMap.put("Signature", signature);
Map<String, String> fileMap = new HashMap<String, String>();
fileMap.put("file", filepath);
String ret = formUpload(host, textMap, fileMap);
System.out.println("[" + host + "] post_object:" + objectName);
System.out.println("post reponse:" + ret);
return objectName;
}
private static String formUpload(String urlStr, Map<String, String> textMap, Map<String, String> fileMap) throws Exception {
String res = "";
HttpURLConnection conn = null;
String BOUNDARY = "9431149156168";
try {
URL url = new URL(urlStr);
conn = (HttpURLConnection) url.openConnection();
conn.setConnectTimeout(5000);
conn.setReadTimeout(10000);
conn.setDoOutput(true);
conn.setDoInput(true);
conn.setRequestMethod("POST");
conn.setRequestProperty("User-Agent",
"Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.6)");
conn.setRequestProperty("Content-Type",
"multipart/form-data; boundary=" + BOUNDARY);
OutputStream out = new DataOutputStream(conn.getOutputStream());
// text
if (textMap != null) {
StringBuffer strBuf = new StringBuffer();
Iterator iter = textMap.entrySet().iterator();
int i = 0;
while (iter.hasNext()) {
Map.Entry entry = (Map.Entry) iter.next();
String inputName = (String) entry.getKey();
String inputValue = (String) entry.getValue();
if (inputValue == null) {
continue;
}
if (i == 0) {
strBuf.append("--").append(BOUNDARY).append(
"\r\n");
strBuf.append("Content-Disposition: form-data; name=\""
+ inputName + "\"\r\n\r\n");
strBuf.append(inputValue);
} else {
strBuf.append("\r\n").append("--").append(BOUNDARY).append(
"\r\n");
strBuf.append("Content-Disposition: form-data; name=\""
+ inputName + "\"\r\n\r\n");
strBuf.append(inputValue);
}
i++;
}
out.write(strBuf.toString().getBytes());
}
// file
if (fileMap != null) {
Iterator iter = fileMap.entrySet().iterator();
while (iter.hasNext()) {
Map.Entry entry = (Map.Entry) iter.next();
String inputName = (String) entry.getKey();
String inputValue = (String) entry.getValue();
if (inputValue == null) {
continue;
}
File file = new File(inputValue);
String filename = file.getName();
String contentType = new MimetypesFileTypeMap().getContentType(file);
if (contentType == null || contentType.equals("")) {
contentType = "application/octet-stream";
}
StringBuffer strBuf = new StringBuffer();
strBuf.append("\r\n").append("--").append(BOUNDARY).append(
"\r\n");
strBuf.append("Content-Disposition: form-data; name=\""
+ inputName + "\"; filename=\"" + filename
+ "\"\r\n");
strBuf.append("Content-Type: " + contentType + "\r\n\r\n");
out.write(strBuf.toString().getBytes());
DataInputStream in = new DataInputStream(new FileInputStream(file));
int bytes = 0;
byte[] bufferOut = new byte[1024];
while ((bytes = in.read(bufferOut)) != -1) {
out.write(bufferOut, 0, bytes);
}
in.close();
}
StringBuffer strBuf = new StringBuffer();
out.write(strBuf.toString().getBytes());
}
byte[] endData = ("\r\n--" + BOUNDARY + "--\r\n").getBytes();
out.write(endData);
out.flush();
out.close();
// 读取返回数据
StringBuffer strBuf = new StringBuffer();
BufferedReader reader = new BufferedReader(new InputStreamReader(
conn.getInputStream()));
String line = null;
while ((line = reader.readLine()) != null) {
strBuf.append(line).append("\n");
}
res = strBuf.toString();
reader.close();
reader = null;
} catch (Exception e) {
System.err.println("发送POST请求出错: " + urlStr);
throw e;
} finally {
if (conn != null) {
conn.disconnect();
conn = null;
}
}
return res;
}
}
⑤. Upload credentials
Create the file heima-leadnews-common/src/main/java/com/heima/common/aliyun/util/UploadCredentials.java:
public class UploadCredentials implements Serializable {
private String accessKeyId;
private String accessKeySecret;
private String securityToken;
private Long expiredTime;
private String ossEndpoint;
private String ossInternalEndpoint;
private String uploadBucket;
private String uploadFolder;
public UploadCredentials(String accessKeyId, String accessKeySecret, String securityToken, Long expiredTime, String ossEndpoint, String ossInternalEndpoint, String uploadBucket, String uploadFolder) {
this.accessKeyId = accessKeyId;
this.accessKeySecret = accessKeySecret;
this.securityToken = securityToken;
this.expiredTime = expiredTime;
this.ossEndpoint = ossEndpoint;
this.ossInternalEndpoint = ossInternalEndpoint;
this.uploadBucket = uploadBucket;
this.uploadFolder = uploadFolder;
}
public String getAccessKeyId() {
return accessKeyId;
}
public void setAccessKeyId(String accessKeyId) {
this.accessKeyId = accessKeyId;
}
public String getAccessKeySecret() {
return accessKeySecret;
}
public void setAccessKeySecret(String accessKeySecret) {
this.accessKeySecret = accessKeySecret;
}
public String getSecurityToken() {
return securityToken;
}
public void setSecurityToken(String securityToken) {
this.securityToken = securityToken;
}
public Long getExpiredTime() {
return expiredTime;
}
public void setExpiredTime(Long expiredTime) {
this.expiredTime = expiredTime;
}
public String getOssEndpoint() {
return ossEndpoint;
}
public void setOssEndpoint(String ossEndpoint) {
this.ossEndpoint = ossEndpoint;
}
public String getUploadBucket() {
return uploadBucket;
}
public void setUploadBucket(String uploadBucket) {
this.uploadBucket = uploadBucket;
}
public String getUploadFolder() {
return uploadFolder;
}
public void setUploadFolder(String uploadFolder) {
this.uploadFolder = uploadFolder;
}
public String getOssInternalEndpoint() {
return ossInternalEndpoint;
}
public void setOssInternalEndpoint(String ossInternalEndpoint) {
this.ossInternalEndpoint = ossInternalEndpoint;
}
}
⑵. Aliyun AccessKey
Edit the leadnews-wemedia configuration:
aliyun:
  accessKeyId: <your-access-key-id>  # AccessKey ID of your Aliyun account
  secret: <your-access-key-secret>  # AccessKey secret of your Aliyun account
  # aliyun.scenes=porn,terrorism,ad,qrcode,live,logo
  scenes: terrorism
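⑶. Auto-configuration
Edit the file heima-leadnews-common/src/main/resources/META-INF/spring.factories and append the two scan classes to the auto-configuration entries already listed there:
com.heima.common.aliyun.GreenImageScan,\
com.heima.common.aliyun.GreenTextScan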
⑷. Test class
Edit the file heima-leadnews-service/heima-leadnews-wemedia/src/test/java/com/heima/wemedia/test/AliyunTest.java:
@SpringBootTest(classes = WemediaApplication.class)
@RunWith(SpringRunner.class)
public class AliyunTest {
@Autowired
private GreenTextScan greenTextScan;
@Autowired
private GreenImageScan greenImageScan;
@Autowired
private FileStorageService fileStorageService;
/**
 * Test text review
 */
@Test
public void testScanText() throws Exception {
// Map map = greenTextScan.greeTextScan("我心中有团火, 可不能让他熄灭");
Map map = greenTextScan.greeTextScan("海洛因"); // text that should be rejected
System.out.println(map);
}
/**
 * Test image review
 */
@Test
public void testScanImage() throws Exception {
byte[] bytes = fileStorageService.downLoadFile("http://192.168.200.130:9000/leadnews/2023/01/07/82b9365799184731a5002d47560b2b23.png");
List<byte []> list = new ArrayList<>();
list.add(bytes);
Map map = greenImageScan.imageScan(list);
System.out.println(map);
}
}
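(The author's Aliyun account does not have content review enabled.)
II. Saving articles on the app side
1. Table structure
ap_article: article information
ap_article_config: article configuration
ap_article_content: article content
2. Distributed IDs
⑴. The problem
As the business grows, the article table may take up a lot of physical storage. To deal with that, the database will later be sharded: one database is split into several and accessed through database middleware. If the table uses auto-increment IDs, each shard counts on its own and duplicate IDs may appear, so IDs should come from a distributed ID generation strategy instead.
⑵. Choosing a technique
Option | Advantages | Disadvantages |
---|---|---|
Redis | INCR produces a globally continuous, increasing numeric primary key | Adds a dependency on an external component; if Redis is unavailable, no rows can be inserted |
UUID | Globally unique; MySQL also has a UUID implementation | 36 characters long, so it takes more space |
Snowflake algorithm | Globally unique, numeric, cheap to store | Cannot support more than 1024 machines |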
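⑶. Snowflake algorithm
Snowflake is the distributed ID generation algorithm open-sourced by Twitter. It produces a long (64-bit) ID.
The core idea: 41 bits hold a millisecond timestamp, 10 bits hold the machine ID (5 bits for the data center and 5 bits for the worker), and 12 bits hold a sequence number within the millisecond, so each node can generate 4096 IDs per millisecond. The remaining bit is the sign bit and is always 0.
All article-side tables use the snowflake algorithm for their IDs: ap_article, ap_article_config and ap_article_content.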
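To make the bit layout concrete, here is a minimal sketch of a snowflake-style generator with the 41/5/5/12 split described above. It is an illustration, not the generator MyBatis-Plus ships: the class name is hypothetical, the epoch constant is an assumption (any fixed instant in the past works), and the handling of a clock that moves backwards is deliberately simplistic.

```java
/** Illustrative snowflake-style ID generator (hypothetical class, not the MyBatis-Plus one). */
class SnowflakeSketch {
    // Assumption: a fixed custom epoch in milliseconds; IDs encode the time elapsed since it.
    private static final long EPOCH = 1288834974657L;
    private static final int WORKER_BITS = 5;
    private static final int DATACENTER_BITS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;         // 31
    private static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1; // 31
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;     // 4095
    private static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    private static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;       // 17
    private static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    private final long datacenterId;
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long datacenterId, long workerId) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER) {
            throw new IllegalArgumentException("datacenterId must be in 0-31");
        }
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId must be in 0-31");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 sequence numbers of this millisecond are used: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long datacenterOf(long id) { return (id >> DATACENTER_SHIFT) & MAX_DATACENTER; }
    static long workerOf(long id) { return (id >> WORKER_SHIFT) & MAX_WORKER; }
    static long sequenceOf(long id) { return id & MAX_SEQUENCE; }
}
```

Because the timestamp occupies the highest bits, IDs from one generator sort by creation time, and two generators with different data center or worker IDs cannot collide. The 1024-machine limit in the table above follows directly from the 10 machine bits.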
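⑷. Nacos configuration
Set the data center ID and the worker ID by editing the leadnews-article configuration:
mybatis-plus:
  mapper-locations: classpath*:mapper/*.xml
  # Package scanned for type aliases; classes in it are registered under their alias
  type-aliases-package: com.heima.model.article.pojos
  global-config:
    datacenter-id: 1
    workerId: 1
datacenter-id is the data center ID (range 0-31) and workerId is the worker ID (range 0-31).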
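3. Implementing the endpoint
⑴. Requirements
After an article passes review, its data must be added to the app's article database:
- Save the article information to ap_article
- Save the article configuration to ap_article_config
- Save the article content to ap_article_content
⑵. Feign interface
①. Interface description
Item | Value |
---|---|
Path | /api/v1/article/save |
Method | POST |
Parameter | ArticleDto |
Response | ResponseResult |
②. ArticleDto
Create the file heima-leadnews-model/src/main/java/com/heima/model/article/dtos/ArticleDto.java:
@Data
public class ArticleDto extends ApArticle {
    /**
     * Article content
     */
    private String content;
}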
③. ResponseResult
Success:
{
    "code": 200,
    "errorMessage": "操作成功",
    "data": "1302864436297442242"
}
Failure:
{
    "code": 501,
    "errorMessage": "参数失效"
}
{
    "code": 501,
    "errorMessage": "文章没有找到"
}
⑶. Adding the dependency
Edit the file heima-leadnews-feign-api/pom.xml:
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-openfeign</artifactId>
</dependency>
⑷. Defining the interface
Create the file heima-leadnews-feign-api/src/main/java/com/heima/apis/article/IArticleClient.java:
@FeignClient("leadnews-article")
public interface IArticleClient {
    @PostMapping("/api/v1/article/save")
    public ResponseResult saveArticle(@RequestBody ArticleDto dto);
}
⑸. Implementing the interface
The interface is implemented in the heima-leadnews-article service by a controller class, ArticleClient. Its listing appears in step ⑽, after the mapper and service layers it depends on.
⑹. Mapper
ApArticleServiceImpl injects an ApArticleConfigMapper and calls insert on it, so the article service needs a MyBatis-Plus mapper for ap_article_config along these lines:
@Mapper
public interface ApArticleConfigMapper extends BaseMapper<ApArticleConfig> {
}
⑺. Service
Edit the file heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/ApArticleService.java:
/**
 * Save an article for the app side
 * @param dto article data
 * @return the result, carrying the article ID
 */
public ResponseResult saveArticle(ArticleDto dto);
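⑻. ApArticleConfig
Create the file heima-leadnews-model/src/main/java/com/heima/model/article/pojos/ApArticleConfig.java. The listing shows only the constructor that takes the article ID; the field declarations are not shown:
@Data
@NoArgsConstructor
@TableName("ap_article_config")
public class ApArticleConfig implements Serializable {
    // Constructor taking the article ID; it also sets the default configuration
    public ApArticleConfig(Long articleId) {
        this.articleId = articleId;
        this.isDown = false;
        this.isDelete = false;
        this.isComment = true;
        this.isForward = true;
    }
    // ... field declarations (articleId, isDown, isDelete, isComment, isForward and the rest) go here
}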
⑼. ServiceImpl
Create the file heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/impl/ApArticleServiceImpl.java:
@Service
@Transactional
@Slf4j
public class ApArticleServiceImpl extends ServiceImpl<ApArticleMapper, ApArticle> implements ApArticleService {
@Autowired
private ApArticleMapper apArticleMapper;
@Autowired
private ApArticleConfigMapper apArticleConfigMapper;
@Autowired
private ApArticleContentMapper apArticleContentMapper;
// 单页最大加载的数字
private final static short Max_PAGE_SIZE = 50;
/**
* 加载文章列表
* @param dto
* @param type 1 加载更多 2 加载最新
* @return
*/
@Override
public ResponseResult load(ArticleHomeDto dto, Short type) {
// 1.参数校验
// 1.1 分页参数校验
Integer size = dto.getSize();
if( size == null || size == 0) {
size = 10;
}
// 分页的值不超过50
size = Math.min(size, Max_PAGE_SIZE);
dto.setSize(size);
// 1.2 类型参数校验(更多/最新)
if(!type.equals(ArticleConstants.LOADTYPE_LOAD_MORE) && !type.equals(ArticleConstants.LOADTYPE_LOAD_NEW)) {
type = ArticleConstants.LOADTYPE_LOAD_MORE;
}
// 1.3 频道参数校验
if(StringUtils.isBlank(dto.getTag())) {
dto.setTag(ArticleConstants.DEFAULT_TAG);
}
// 1.4 时间参数校验
if(dto.getMaxBehotTime() == null) dto.setMaxBehotTime(new Date());
if(dto.getMinBehotTime() == null) dto.setMinBehotTime(new Date());
// 2. 查询
List<ApArticle> articleList = apArticleMapper.loadArticleList(dto, type);
// 3. 结果返回
return ResponseResult.okResult(articleList);
}
    /**
     * Save an article for the app side
     * @param dto article data
     * @return the result, carrying the article ID
     */
    @Override
    public ResponseResult saveArticle(ArticleDto dto) {
        // 1. Validate the parameter
        if (dto == null) {
            return ResponseResult.errorResult(AppHttpCodeEnum.PARAM_INVALID);
        }
        // 2. Branch on whether an ID is present
        ApArticle apArticle = new ApArticle();
        BeanUtils.copyProperties(dto, apArticle);
        if (dto.getId() == null) {
            // 2.1 No ID: insert the article, its configuration and its content
            // Save the article
            save(apArticle);
            // Save the article configuration
            ApArticleConfig apArticleConfig = new ApArticleConfig(apArticle.getId());
            apArticleConfigMapper.insert(apArticleConfig);
            // Save the article content
            ApArticleContent apArticleContent = new ApArticleContent();
            apArticleContent.setArticleId(apArticle.getId());
            apArticleContent.setContent(dto.getContent());
            apArticleContentMapper.insert(apArticleContent);
        } else {
            // 2.2 ID present: update the article and its content
            // Update the article
            updateById(apArticle);
            // Update the article content
            ApArticleContent apArticleContent = apArticleContentMapper.selectOne(Wrappers.<ApArticleContent>lambdaQuery().eq(ApArticleContent::getArticleId, dto.getId()));
            apArticleContent.setContent(dto.getContent());
            apArticleContentMapper.updateById(apArticleContent);
        }
        // 3. Return the article ID
        return ResponseResult.okResult(apArticle.getId());
    }
}
⑽. Defining the endpoint
Create the class ArticleClient in the heima-leadnews-article service. It implements IArticleClient from heima-leadnews-feign-api and exposes the save endpoint:
@RestController
public class ArticleClient implements IArticleClient {
    @Autowired
    private ApArticleService apArticleService;

    @PostMapping("/api/v1/article/save")
    @Override
    public ResponseResult saveArticle(@RequestBody ArticleDto dto) {
        return apArticleService.saveArticle(dto);
    }
}
⑾. Postman
①. Saving an article
{
    "title": "新闻头条项目",
    "authorId": 1102,
    "layout": 1,
    "labels": "新闻头条",
    "publishTime": "2028-03-14T11:35:49.000Z",
    "images": "https://up.enterdesk.com/edpic/c7/13/aa/c713aaa432c4e86e79178bc8d72ba487.jpg",
    "content": "新闻头条项目背景,新闻头条项目背景,新闻头条项目背景,新闻头条项目背景,新闻头条项目背景"
}
②. Updating an article
{
    "id": 1390209114747047938,
    "title": "新闻头条项目",
    "authorId": 1102,
    "layout": 1,
    "labels": "新闻头条",
    "publishTime": "2028-03-14T11:35:49.000Z",
    "images": "https://up.enterdesk.com/edpic/c7/13/aa/c713aaa432c4e86e79178bc8d72ba487.jpg",
    "content": "新闻头条项目背景,新闻头条项目背景,新闻头条项目背景,新闻头条项目背景,新闻头条项目背景"
}
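⑿. Pitfalls
If the build reports an error for the JAR
apache-maven-3.6.1\repository_new\org\aspectj\aspectjweaver\1.9.6\aspectjweaver-1.9.6.jar
download a replacement from https://pan.baidu.com/s/1BgQg5dKv9IYw8w3791p4yA?pwd=abcd
and overwrite the broken JAR in the local repository with the downloaded one.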
III. Article review
1. Method definitions
⑴. Service
Create the file heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/WmNewsAutoScanService.java:
public interface WmNewsAutoScanService {
    /**
     * Review a we-media article
     * @param id article ID
     */
    public void autoScanWmNews(Integer id);
}
⑵. Feign remote call
Edit the bootstrap class heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/WemediaApplication.java and add the annotation that scans for Feign clients:
@EnableFeignClients(basePackages = "com.heima.apis")
⑶. ServiceImpl
Create the file heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/impl/WmNewsAutoScanServiceImpl.java:
@Service
@Slf4j
@Transactional
public class WmNewsAutoScanServiceImpl implements WmNewsAutoScanService {
    @Autowired(required = false)
    private WmNewsMapper wmNewsMapper;
    @Autowired(required = false)
    private GreenTextScan greenTextScan;
    @Autowired
    private FileStorageService fileStorageService;
    @Autowired
    private GreenImageScan greenImageScan;
    @Autowired
    private IArticleClient articleClient;
    @Autowired(required = false)
    private WmChannelMapper wmChannelMapper;
    @Autowired
    private WmUserMapper wmUserMapper;
    /**
     * Review a we-media article
     * @param id article ID
     */
    @Override
    public void autoScanWmNews(Integer id) {
        // 1. Load the article
        WmNews wmNews = wmNewsMapper.selectById(id);
        if (wmNews == null) throw new RuntimeException("WmNewsAutoScanServiceImpl - 文章不存在");
        // Only articles in the submitted state are reviewed
        if (wmNews.getStatus().equals(WmNews.Status.SUBMIT.getCode())) {
            // Extract the text and the images from the article
            Map<String, Object> textAndImages = handleTextAndImages(wmNews);
            // 2. Review the text
            Boolean isTextScan = handleTextScan((String) textAndImages.get("content"), wmNews);
            if (!isTextScan) return;
            // 3. Review the images
            boolean isImageScan = handleImageScan((List<String>) textAndImages.get("images"), wmNews);
            if (!isImageScan) return;
            // 4. Save the article on the app side
            ResponseResult responseResult = saveAppArticle(wmNews);
            if (!responseResult.getCode().equals(200)) {
                throw new RuntimeException("WmNewsAutoScanServiceImpl - 文章审核, 保存文章数据失败");
            }
            // Write the app-side article ID back to the we-media article
            wmNews.setArticleId((Long) responseResult.getData());
            updateWmNews(wmNews, (short) 9, "审核成功");
        }
    }
    /**
     * Save the article on the app side
     * @param wmNews the reviewed we-media article
     */
    private ResponseResult saveAppArticle(WmNews wmNews) {
        ArticleDto dto = new ArticleDto();
        // Copy the matching properties
        BeanUtils.copyProperties(wmNews, dto);
        // Layout
        dto.setLayout(wmNews.getType());
        // Channel
        WmChannel wmChannel = wmChannelMapper.selectById(wmNews.getChannelId());
        if (wmChannel != null) dto.setChannelName(wmChannel.getName());
        // Author
        dto.setAuthorId(wmNews.getUserId().longValue());
        WmUser wmUser = wmUserMapper.selectById(wmNews.getUserId());
        // Check the user that was just loaded, not wmNews
        if (wmUser != null) dto.setAuthorName(wmUser.getName());
        // Article ID, present when the article was saved before
        if (wmNews.getArticleId() != null) dto.setId(wmNews.getArticleId());
        // Creation time
        dto.setCreatedTime(new Date());
        ResponseResult responseResult = articleClient.saveArticle(dto);
        return responseResult;
    }
    /**
     * Review the images
     * @param images image URLs
     * @param wmNews the article under review
     * @return true if the images pass
     */
    private boolean handleImageScan(List<String> images, WmNews wmNews) {
        boolean flag = true;
        if (images == null || images.size() == 0) return flag;
        // 1. Download the images from MinIO
        // Remove duplicate URLs first
        images = images.stream().distinct().collect(Collectors.toList());
        List<byte[]> imageList = new ArrayList<>();
        for (String image : images) {
            byte[] bytes = fileStorageService.downLoadFile(image);
            imageList.add(bytes);
        }
        // 2. Review the images
        try {
            Map map = greenImageScan.imageScan(imageList);
            if (map != null) {
                // Rejected
                if (map.get("suggestion").equals("block")) {
                    flag = false;
                    updateWmNews(wmNews, (short) 2, "当前文件中,存在违规内容");
                }
                // Uncertain: hand over to manual review
                if (map.get("suggestion").equals("review")) {
                    flag = false;
                    updateWmNews(wmNews, (short) 3, "当前文件中,存在不确定内容");
                }
            }
        } catch (Exception e) {
            flag = false;
            e.printStackTrace();
        }
        return flag;
    }
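    /**
     * Moderate the plain-text content
     * @param content text extracted from the article body
     * @param wmNews the article being moderated
     * @return true if the text passes, false otherwise
     */
    private Boolean handleTextScan(String content, WmNews wmNews) {
        boolean flag = true;
        // Nothing to scan when both title and content are blank
        // (title + "-" + content always contains the "-", so its length is never 0)
        if (StringUtils.isBlank(wmNews.getTitle()) && StringUtils.isBlank(content)) return flag;
        try {
            Map map = greenTextScan.greeTextScan(wmNews.getTitle() + "-" + content);
            if (map != null) {
                // Rejected; reason text: "the file contains prohibited content"
                if ("block".equals(map.get("suggestion"))) {
                    flag = false;
                    updateWmNews(wmNews, (short) 2, "当前文件中,存在违规内容");
                }
                // Uncertain, needs manual review; reason text: "the file contains uncertain content"
                if ("review".equals(map.get("suggestion"))) {
                    flag = false;
                    updateWmNews(wmNews, (short) 3, "当前文件中,存在不确定内容");
                }
            }
        } catch (Exception e) {
            flag = false;
            e.printStackTrace();
        }
        return flag;
    }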
    /**
     * Update the article's status and reason
     * @param wmNews the article to update
     * @param status the new status code
     * @param reason the reason shown for that status
     */
    private void updateWmNews(WmNews wmNews, short status, String reason) {
        wmNews.setStatus(status);
        wmNews.setReason(reason);
        wmNewsMapper.updateById(wmNews);
    }
    /**
     * 1. Extract the text and images from the article content
     * 2. Extract the article's cover images
     * @param wmNews the article being moderated
     * @return a map with "content" (String) and "images" (List of URLs)
     */
    private Map<String, Object> handleTextAndImages(WmNews wmNews) {
        // Collects the plain text
        StringBuilder stringBuilder = new StringBuilder();
        // Collects the image URLs
        List<String> images = new ArrayList<>();
        // 1. Extract text and images from the article content
        if (StringUtils.isNotBlank(wmNews.getContent())) {
            List<Map> maps = JSONArray.parseArray(wmNews.getContent(), Map.class);
            for (Map map : maps) {
                if ("text".equals(map.get("type"))) {
                    stringBuilder.append(map.get("value"));
                }
                // Image blocks carry their URL in "value" as well
                if ("image".equals(map.get("type"))) {
                    images.add((String) map.get("value"));
                }
            }
        }
        // 2. Extract the cover images
        if (StringUtils.isNotBlank(wmNews.getImages())) {
            String[] split = wmNews.getImages().split(",");
            images.addAll(Arrays.asList(split));
        }
        // 3. Return the result
        Map<String, Object> resultMap = new HashMap<>();
        resultMap.put("content", stringBuilder.toString());
        resultMap.put("images", images);
        return resultMap;
    }
}
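The text and image checks above react to the same three outcomes, so the decision can be summarised in one small function. The following is a standalone sketch rather than project code: the class ScanDecisionSketch, the enum Outcome and the method outcomeOf are hypothetical names introduced only for illustration. It assumes, as the code above does, that the scan result is a map whose "suggestion" entry is "block", "review", or something else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only
class ScanDecisionSketch {

    // REJECT corresponds to status 2 above, MANUAL_REVIEW to status 3
    enum Outcome { PASS, REJECT, MANUAL_REVIEW }

    /** Maps a scan result to an outcome; a null map or missing key counts as a pass. */
    static Outcome outcomeOf(Map<String, Object> scanResult) {
        if (scanResult == null) return Outcome.PASS;
        Object suggestion = scanResult.get("suggestion");
        // Literal on the left: no NullPointerException when the key is absent
        if ("block".equals(suggestion)) return Outcome.REJECT;
        if ("review".equals(suggestion)) return Outcome.MANUAL_REVIEW;
        return Outcome.PASS;
    }

    public static void main(String[] args) {
        Map<String, Object> result = new HashMap<>();
        result.put("suggestion", "review");
        System.out.println(outcomeOf(result)); // prints "MANUAL_REVIEW"
    }
}
```

Only a pass on every check lets autoScanWmNews continue to the save step; any other outcome updates the article status and stops the flow.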
⑷. Test class
Create heima-leadnews-service/heima-leadnews-wemedia/src/test/java/com/heima/wemedia/service/WmNewsAutoScanServiceTest.java:
@SpringBootTest(classes = WemediaApplication.class)
@RunWith(SpringRunner.class)
public class WmNewsAutoScanServiceTest {
    @Autowired
    private WmNewsAutoScanService wmNewsAutoScanService;

    @Test
    public void autoScanWmNews() {
        // Use the id of an existing we-media article in the submitted state
        wmNewsAutoScanService.autoScanWmNews(6236);
    }
}
2. Feign fallback (service degradation)
⑴. Fallback implementation
Create heima-leadnews-feign-api/src/main/java/com/heima/apis/article/fallback/IArticleClientFallback.java:
@Component
public class IArticleClientFallback implements IArticleClient {
    @Override
    public ResponseResult saveArticle(ArticleDto dto) {
        // Returned when the remote call fails or times out; message: "failed to fetch data"
        return ResponseResult.errorResult(AppHttpCodeEnum.SERVER_ERROR, "获取数据失败");
    }
}
⑵. Point the remote interface at the fallback
Edit heima-leadnews-feign-api/src/main/java/com/heima/apis/article/IArticleClient.java:
@FeignClient(value = "leadnews-article", fallback = IArticleClientFallback.class)
public interface IArticleClient {
    @PostMapping("/api/v1/article/save")
    public ResponseResult saveArticle(@RequestBody ArticleDto dto);
}
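⑶. Component scan
The fallback class lives in the feign-api module, outside the wemedia service's default scan path, so it has to be scanned explicitly.
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/config/InitConfig.java:
@Configuration
@ComponentScan("com.heima.apis.article.fallback")
public class InitConfig {
}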
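⑷. Nacos configuration
Edit the leadnews-wemedia configuration:
feign:
  # Enable Feign's Hystrix support (circuit breaking and fallback)
  hystrix:
    enabled: true
  # Adjust the call timeouts (milliseconds)
  client:
    config:
      default:
        connectTimeout: 2000
        readTimeout: 2000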
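⑸. Testing
①. Add a delay
The delay has to sit on the provider side of the Feign call so that the response takes longer than the 2000 ms readTimeout; sleeping in the caller before the call is made would not trigger the fallback.
Edit heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/impl/ApArticleServiceImpl.java:
    /**
     * Save the app-side article
     * @param dto
     * @return
     */
    @Override
    public ResponseResult saveArticle(ArticleDto dto) {
        // Sleep 3 seconds to exceed the timeout (for testing only, remove afterwards)
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        // ... the rest of the method is unchanged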
②. Test class
Edit heima-leadnews-service/heima-leadnews-wemedia/src/test/java/com/heima/wemedia/service/WmNewsAutoScanServiceTest.java:
@SpringBootTest(classes = WemediaApplication.class)
@RunWith(SpringRunner.class)
public class WmNewsAutoScanServiceTest {
    @Autowired
    private WmNewsAutoScanService wmNewsAutoScanService;

    @Test
    public void autoScanWmNews() {
        wmNewsAutoScanService.autoScanWmNews(6235);
    }
}
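To make the expected behaviour of this test concrete, here is a framework-free sketch of "call with a timeout, otherwise return the fallback". It is not how Feign or Hystrix are implemented; the class FallbackSketch and the method callWithFallback are hypothetical, and the durations are shortened so the demo runs quickly.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical illustration of timeout-plus-fallback, not the Hystrix implementation
class FallbackSketch {

    /** Runs remoteCall; if it fails or exceeds timeoutMillis, returns fallback.get() instead. */
    static <T> T callWithFallback(Supplier<T> remoteCall, long timeoutMillis, Supplier<T> fallback) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            // Timeout, interruption or a failure inside the remote call
            future.cancel(true);
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a provider that responds more slowly than the timeout allows
        Supplier<String> slowProvider = () -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            return "saved";
        };
        System.out.println(callWithFallback(slowProvider, 100, () -> "fallback")); // prints "fallback"
    }
}
```

In the project the same roles are played by the provider's 3-second sleep, the 2000 ms readTimeout, and IArticleClientFallback.saveArticle, so with the delay in place the caller should receive the error result from the fallback instead of an article id.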
3. Asynchronous invocation
⑴. Method annotation
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/impl/WmNewsAutoScanServiceImpl.java:
    /**
     * Moderate a we-media article
     * @param id article id
     */
    @Override
    @Async // marks this method as asynchronous
    public void autoScanWmNews(Integer id) {
⑵. Invocation
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/impl/WmNewsServiceImpl.java:
        // 5. Save the relation between the cover images and the materials; if the current layout is automatic, the cover images have to be matched
        saveRelativeInfoForCover(dto, wmNews, materials);
        // Moderate the article
        wmNewsAutoScanService.autoScanWmNews(wmNews.getId());
        return ResponseResult.okResult(AppHttpCodeEnum.SUCCESS);
    }
⑶. Bootstrap class annotation
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/WemediaApplication.java:
@EnableAsync // enable asynchronous method execution
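The effect of @Async can be pictured with plain Java: the caller hands the moderation work to another thread and returns its response without waiting. The sketch below uses an ExecutorService as a stand-in for Spring's task executor; the class AsyncSubmitSketch and its method are hypothetical, and the CountDownLatch exists only to make the order of events in the demo deterministic.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of fire-and-forget submission, not Spring's implementation
class AsyncSubmitSketch {

    /** Returns the events in the order they happened. */
    static List<String> submitAndAudit() {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch responded = new CountDownLatch(1);

        // Stand-in for the @Async autoScanWmNews call: queued, runs on another thread
        pool.submit(() -> {
            try {
                responded.await(); // demo only: wait until the caller has "responded"
                events.add("audit finished");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stand-in for the publishing method returning its response immediately
        events.add("response returned");
        responded.countDown();

        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(submitAndAudit()); // prints "[response returned, audit finished]"
    }
}
```

One point worth keeping in mind: Spring applies @Async through a proxy, so the annotation only takes effect when the method is called through the injected bean, as WmNewsServiceImpl does here, and not through a direct call from inside the same class.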
4. End-to-end test
Start nginx, the article microservice, the wemedia microservice and the we-media gateway, then open the we-media site: http://localhost:8802/#/login
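四、Moderation extensions
1. Self-managed sensitive words
The platform maintains its own list of sensitive words, and during moderation each article is checked against that list.
⑴. Technology selection
Option | Notes |
---|---|
Database fuzzy query | Too slow |
String.indexOf("") lookup | Also slow once the data volume is large |
Full-text search | Tokenise, then match |
DFA algorithm | Deterministic finite automaton (a data structure) |
How the DFA approach works:
DFA stands for Deterministic Finite Automaton.
Storage: all sensitive words are loaded once into a set of nested maps, one level per character, so words that share a prefix share a path.
Example sensitive words: xx, xx, 大坏蛋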
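The nested-map storage can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the project's SensitiveWordUtil, whose source is not shown here: the class DfaSketch, its methods build and match, the end marker, and the second sample word are all hypothetical. Matching walks the maps from each position in the text and keeps the longest word found there.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical DFA-style matcher built from nested maps, for illustration only
class DfaSketch {
    // Marks a node where a complete word ends (assumes '\0' never occurs in a word)
    private static final Character END = '\0';

    /** Builds the dictionary: one map level per character. */
    @SuppressWarnings("unchecked")
    static Map<Character, Object> build(List<String> words) {
        Map<Character, Object> root = new HashMap<>();
        for (String word : words) {
            if (word == null || word.isEmpty()) continue;
            Map<Character, Object> node = root;
            for (char c : word.toCharArray()) {
                node = (Map<Character, Object>) node.computeIfAbsent(c, k -> new HashMap<Character, Object>());
            }
            node.put(END, Boolean.TRUE);
        }
        return root;
    }

    /** Returns each sensitive word found in the text with its number of occurrences. */
    @SuppressWarnings("unchecked")
    static Map<String, Integer> match(Map<Character, Object> root, String text) {
        Map<String, Integer> hits = new HashMap<>();
        int i = 0;
        while (i < text.length()) {
            Map<Character, Object> node = root;
            int matchedLength = 0;
            // Walk from position i, remembering the longest complete word seen
            for (int j = i; j < text.length(); j++) {
                Object next = node.get(text.charAt(j));
                if (!(next instanceof Map)) break;
                node = (Map<Character, Object>) next;
                if (node.containsKey(END)) matchedLength = j - i + 1;
            }
            if (matchedLength > 0) {
                hits.merge(text.substring(i, i + matchedLength), 1, Integer::sum);
                i += matchedLength;
            } else {
                i++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<Character, Object> dictionary = build(Arrays.asList("大坏蛋", "坏人"));
        // Finds 大坏蛋 twice and 坏人 once
        System.out.println(match(dictionary, "他是大坏蛋,也是坏人,大坏蛋"));
    }
}
```

Because each character of the text is followed through the maps at most once per starting position, the cost depends on the text length and the longest word, not on how many sensitive words the dictionary holds. That is the advantage over running indexOf once per word.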
⑵. Database table
SQL script link: https://pan.baidu.com/s/1zrWnneKDZ_zyA4HJAcA30Q?pwd=abcd
⑶. Entity class
Create heima-leadnews-model/src/main/java/com/heima/model/wemedia/pojos/WmSensitive.java:
/**
 * <p>
 * Sensitive word table
 * </p>
 *
 * @author itheima
 */
@Data
@TableName("wm_sensitive")
public class WmSensitive implements Serializable {
    private static final long serialVersionUID = 1L;
    /**
     * Primary key
     */
    @TableId(value = "id", type = IdType.AUTO)
    private Integer id;
    /**
     * Sensitive word
     */
    @TableField("sensitives")
    private String sensitives;
    /**
     * Creation time
     */
    @TableField("created_time")
    private Date createdTime;
}
⑷. Mapper
Create heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/mapper/WmSensitiveMapper.java:
@Mapper
public interface WmSensitiveMapper extends BaseMapper<WmSensitive> {
}
⑸. ServiceImpl
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/impl/WmNewsAutoScanServiceImpl.java:
    @Autowired
    private WmSensitiveMapper wmSensitiveMapper;

    /**
     * Moderate a we-media article
     * @param id article id
     */
    @Override
    @Async // marks this method as asynchronous
    public void autoScanWmNews(Integer id) {
        // 1. Load the article; message: "article does not exist"
        WmNews wmNews = wmNewsMapper.selectById(id);
        if (wmNews == null) throw new RuntimeException("WmNewsAutoScanServiceImpl - 文章不存在");
        // Only articles in the submitted state are moderated
        if (wmNews.getStatus().equals(WmNews.Status.SUBMIT.getCode())) {
            // Extract the text and images from the article
            Map<String, Object> textAndImages = handleTextAndImages(wmNews);
            // Filter against the self-managed sensitive words
            boolean isSensitive = handleSensitiveScan((String) textAndImages.get("content"), wmNews);
            if (!isSensitive) return;
            // 2. Moderate the text
            Boolean isTextScan = handleTextScan((String) textAndImages.get("content"), wmNews);
            if (!isTextScan) return;
            // 3. Moderate the images
            boolean isImageScan = handleImageScan((List<String>) textAndImages.get("images"), wmNews);
            if (!isImageScan) return;
            // 4. Save the app-side article
            ResponseResult responseResult = saveAppArticle(wmNews);
            if (!responseResult.getCode().equals(200)) {
                // Message: "article moderation, failed to save the article data"
                throw new RuntimeException("WmNewsAutoScanServiceImpl - 文章审核, 保存文章数据失败");
            }
            // Write the returned article_id back to the we-media article
            wmNews.setArticleId((Long) responseResult.getData());
            // Reason text: "moderation passed"
            updateWmNews(wmNews, (short) 9, "审核成功");
        }
    }
    /**
     * Filter against the self-managed sensitive words
     * @param content the text to check
     * @param wmNews the article being moderated
     * @return true if no sensitive word is found, false otherwise
     */
    private boolean handleSensitiveScan(String content, WmNews wmNews) {
        boolean flag = true;
        // 1. Load all sensitive words
        List<WmSensitive> wmSensitives = wmSensitiveMapper.selectList(Wrappers.<WmSensitive>lambdaQuery().select(WmSensitive::getSensitives));
        List<String> sensitiveList = wmSensitives.stream().map(WmSensitive::getSensitives).collect(Collectors.toList());
        // 2. Initialise the sensitive-word dictionary
        SensitiveWordUtil.initMap(sensitiveList);
        // 3. Check whether the text contains sensitive words
        Map<String, Integer> map = SensitiveWordUtil.matchWords(content);
        if (map.size() > 0) {
            flag = false;
            // Reason text: "the article contains prohibited content", followed by the matched words
            updateWmNews(wmNews, (short) 2, "当前文章存在违规内容" + map);
        }
        return flag;
    }
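2. Recognising text in images
⑴. OCR
OCR (Optical Character Recognition) is the process by which an electronic device such as a scanner or digital camera examines characters printed on paper, determines their shapes from the pattern of dark and light areas, and then translates those shapes into computer text using character recognition.
Option | Notes |
---|---|
Baidu OCR | Paid service |
Tesseract-OCR | Open-source OCR engine maintained by Google; callable from Java, Python and other languages |
Tess4J | Java wrapper around Tesseract-OCR |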
⑵. Getting-started example
①. Add a module
Create a module named tess4j-demo under heima-leadnews-test.
②. pom
Edit heima-leadnews-test/tess4j-demo/pom.xml:
<dependencies>
    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>4.1.1</version>
    </dependency>
</dependencies>
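③. Language data (tessdata)
Chinese language data download: https://pan.baidu.com/s/1VYe1RKuqxJJUsLkV5ZSDIw?pwd=abcd
Place it in the same directory as the project's nginx.
④. Application
Create heima-leadnews-test/tess4j-demo/src/main/java/com/heima/tess4j/Application.java:
package com.heima.tess4j;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

import java.io.File;

public class Application {
    public static void main(String[] args) throws TesseractException {
        // Create the engine instance
        ITesseract tesseract = new Tesseract();
        // Path of the tessdata directory; adjust to where the language data was placed
        tesseract.setDatapath("D:\\code\\hm\\leadnews\\config\\tessdata");
        // Language: Simplified Chinese
        tesseract.setLanguage("chi_sim");
        // An image that contains sensitive text
        File file = new File("D:\\code\\hm\\leadnews\\config\\img\\ocr.png");
        String result = tesseract.doOCR(file);
        // Output label: "recognition result"
        System.out.println("识别的结果为: " + result.replaceAll("\\r|\\n", "-"));
    }
}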
⑶. Image recognition utility class
①. pom configuration
Edit heima-leadnews-common/pom.xml:
<!--tess4j-->
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.1.1</version>
</dependency>
②. Utility class
Create heima-leadnews-common/src/main/java/com/heima/common/tess4j/Tess4jClient.java:
@Getter
@Setter
@Component
@ConfigurationProperties(prefix = "tess4j")
public class Tess4jClient {
    private String dataPath;
    private String language;

    public String doOCR(BufferedImage image) throws TesseractException {
        // Create the Tesseract instance
        ITesseract tesseract = new Tesseract();
        // Path of the tessdata directory
        tesseract.setDatapath(dataPath);
        // Recognition language (chi_sim for Simplified Chinese)
        tesseract.setLanguage(language);
        // Run OCR
        String result = tesseract.doOCR(image);
        // Replace carriage returns and line feeds with "-" and strip spaces so the result is a single line
        result = result.replaceAll("\\r|\\n", "-").replaceAll(" ", "");
        return result;
    }
}
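The clean-up step at the end of doOCR is easy to try in isolation. The sketch below applies the same two replaceAll calls to a hand-written string; the class OcrTextSketch and the sample input are hypothetical and no OCR engine is involved.

```java
// Hypothetical illustration of the post-processing applied to raw OCR output
class OcrTextSketch {

    /** Applies the same two replacements as Tess4jClient.doOCR. */
    static String normalize(String raw) {
        return raw.replaceAll("\\r|\\n", "-").replaceAll(" ", "");
    }

    public static void main(String[] args) {
        // "\r\n" becomes two dashes because \r and \n are replaced separately
        System.out.println(normalize("line one\r\nline two\n")); // prints "lineone--linetwo-"
    }
}
```

One consequence for the sensitive-word check that follows: a word that the OCR engine splits across two lines ends up with "-" in the middle, so it will not match the dictionary entry. Whether that matters depends on the images being moderated.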
③. Auto-configuration
Edit heima-leadnews-common/src/main/resources/META-INF/spring.factories and add the class to the auto-configuration list:
com.heima.common.tess4j.Tess4jClient
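④. tess4j configuration
Tess4jClient binds two properties under the tess4j prefix: data-path (the tessdata directory) and language. Add them to the configuration of the service that uses the client, for example leadnews-wemedia. The values below reuse the ones from the getting-started example and should be adjusted to your environment:
tess4j:
  data-path: D:\code\hm\leadnews\config\tessdata
  language: chi_sim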
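⑷. Implementation class
Edit heima-leadnews-service/heima-leadnews-wemedia/src/main/java/com/heima/wemedia/service/impl/WmNewsAutoScanServiceImpl.java:
    @Autowired
    private Tess4jClient tess4jClient;

    /**
     * Moderate images
     * @param images image URLs (content images plus cover images)
     * @param wmNews the article being moderated
     * @return true if the images pass, false otherwise
     */
    private boolean handleImageScan(List<String> images, WmNews wmNews) {
        boolean flag = true;
        if (images == null || images.size() == 0) return flag;
        // 1. Download the images from MinIO
        // De-duplicate the image URLs
        images = images.stream().distinct().collect(Collectors.toList());
        List<byte[]> imageList = new ArrayList<>();
        try {
            for (String image : images) {
                byte[] bytes = fileStorageService.downLoadFile(image);
                // OCR step
                // 1. byte[] -> BufferedImage
                ByteArrayInputStream in = new ByteArrayInputStream(bytes);
                BufferedImage bufferedImage = ImageIO.read(in);
                // 2. Recognise the text in the image
                String result = tess4jClient.doOCR(bufferedImage);
                // 3. Check the recognised text against the sensitive words
                boolean isSensitive = handleSensitiveScan(result, wmNews);
                if (!isSensitive) return isSensitive;
                imageList.add(bytes);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        // 2. Image moderation, unchanged from the earlier version of this method
        try {
            Map map = greenImageScan.imageScan(imageList);
            if (map != null) {
                if ("block".equals(map.get("suggestion"))) {
                    flag = false;
                    updateWmNews(wmNews, (short) 2, "当前文件中,存在违规内容");
                }
                if ("review".equals(map.get("suggestion"))) {
                    flag = false;
                    updateWmNews(wmNews, (short) 3, "当前文件中,存在不确定内容");
                }
            }
        } catch (Exception e) {
            flag = false;
            e.printStackTrace();
        }
        return flag;
    }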
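The byte[] to BufferedImage conversion used before the OCR call can be checked on its own with the JDK's ImageIO. The sketch below is a standalone illustration with hypothetical names (ImageBytesSketch, decode, encodePng); it encodes a small blank image to PNG bytes as a stand-in for a file downloaded from MinIO, then decodes it again.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration of decoding downloaded bytes into an image
class ImageBytesSketch {

    /** Decodes raw bytes; returns null when no registered reader recognises the format. */
    static BufferedImage decode(byte[] bytes) {
        try {
            return ImageIO.read(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Encodes an image as PNG, standing in for the bytes stored in MinIO. */
    static byte[] encodePng(BufferedImage image) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(image, "png", out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] png = encodePng(new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB));
        BufferedImage decoded = decode(png);
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight()); // prints "4x3"
        System.out.println(decode("not an image".getBytes()) == null);      // prints "true"
    }
}
```

Since ImageIO.read returns null for bytes it cannot decode, a null check before passing the image to the OCR client may be worth adding when materials can be uploaded in arbitrary formats.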
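五、Article details
1. Static file generation
⑴. Approach
When the article service creates the app-side article, it also generates a static detail page for the article and uploads it to MinIO.
⑵. Service
Edit heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/ArticleFreemarkerService.java:
public interface ArticleFreemarkerService {
    /**
     * Generate the static file and upload it to MinIO
     * @param apArticle the saved article
     * @param content the article content (JSON)
     */
    public void buildArticleToMinIO(ApArticle apArticle, String content);
}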
⑶. ServiceImpl
Edit heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/impl/ArticleFreemarkerServiceImpl.java:
@Service
@Slf4j
@Transactional
public class ArticleFreemarkerServiceImpl implements ArticleFreemarkerService {
    @Autowired
    private Configuration configuration;
    @Autowired
    private FileStorageService fileStorageService;
    @Autowired
    private ApArticleService apArticleService;

    /**
     * Generate the static file and upload it to MinIO
     * @param apArticle the saved article
     * @param content the article content (JSON)
     */
    @Override
    @Async
    public void buildArticleToMinIO(ApArticle apArticle, String content) {
        // 1. Only proceed when there is content
        if (StringUtils.isNotBlank(content)) {
            // 2. Render the content to HTML with FreeMarker
            Template template = null;
            StringWriter out = new StringWriter();
            try {
                template = configuration.getTemplate("article.ftl");
                // Data model
                Map<String, Object> contentDataModel = new HashMap<>();
                contentDataModel.put("content", JSONArray.parseArray(content));
                // Merge the template with the data model
                template.process(contentDataModel, out);
            } catch (Exception e) {
                e.printStackTrace();
            }
            // 3. Upload the HTML file to MinIO
            InputStream in = new ByteArrayInputStream(out.toString().getBytes());
            String path = fileStorageService.uploadHtmlFile("", apArticle.getId() + ".html", in);
            // 4. Update ap_article and store the path in the static_url column
            apArticleService.update(Wrappers.<ApArticle>lambdaUpdate().eq(ApArticle::getId, apArticle.getId())
                    .set(ApArticle::getStaticUrl, path));
        }
    }
}
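The data flow in steps 2 and 3 (content blocks, then an HTML string, then an InputStream for upload) can be followed without FreeMarker or MinIO. In the sketch below a plain loop stands in for template.process; it is not what article.ftl contains. The class StaticPageSketch, its methods, and the sample URL are hypothetical, and the blocks are assumed to carry "type" and "value" keys.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the template rendering step, for illustration only
class StaticPageSketch {

    /** Renders content blocks to a minimal HTML page. */
    static String render(List<Map<String, String>> blocks) {
        StringBuilder html = new StringBuilder("<html><body>");
        for (Map<String, String> block : blocks) {
            if ("text".equals(block.get("type"))) {
                html.append("<p>").append(block.get("value")).append("</p>");
            } else if ("image".equals(block.get("type"))) {
                html.append("<img src=\"").append(block.get("value")).append("\">");
            }
        }
        return html.append("</body></html>").toString();
    }

    /** Wraps the HTML in a stream, as needed for an upload call; UTF-8 is chosen explicitly. */
    static InputStream toStream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds one content block. */
    static Map<String, String> block(String type, String value) {
        Map<String, String> b = new HashMap<>();
        b.put("type", type);
        b.put("value", value);
        return b;
    }

    public static void main(String[] args) {
        List<Map<String, String>> blocks = new ArrayList<>();
        blocks.add(block("text", "hello"));
        blocks.add(block("image", "http://example.com/a.png"));
        // prints <html><body><p>hello</p><img src="http://example.com/a.png"></body></html>
        System.out.println(render(blocks));
    }
}
```

Passing an explicit charset to getBytes keeps the generated page independent of the platform default encoding, which can matter for Chinese text when the service runs on a machine whose default is not UTF-8.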
⑷. Upload to MinIO
Edit heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/service/impl/ApArticleServiceImpl.java:
    /**
     * Save the app-side article
     * @param dto
     * @return
     */
    @Override
    public ResponseResult saveArticle(ArticleDto dto) {
        // 1. Validate the parameter
        if (dto == null) {
            return ResponseResult.errorResult(AppHttpCodeEnum.PARAM_INVALID);
        }
        // 2. Check whether an id is present
        ApArticle apArticle = new ApArticle();
        BeanUtils.copyProperties(dto, apArticle);
        if (dto.getId() == null) {
            // 2.1 No id: insert the article, its configuration and its content
            // Save the article
            save(apArticle);
            // Save the article configuration
            ApArticleConfig apArticleConfig = new ApArticleConfig(apArticle.getId());
            apArticleConfigMapper.insert(apArticleConfig);
            // Save the article content
            ApArticleContent apArticleContent = new ApArticleContent();
            apArticleContent.setArticleId(apArticle.getId());
            apArticleContent.setContent(dto.getContent());
            apArticleContentMapper.insert(apArticleContent);
        } else {
            // 2.2 Id present: update the article and its content
            // Update the article
            updateById(apArticle);
            // Update the article content
            ApArticleContent apArticleContent = apArticleContentMapper.selectOne(Wrappers.<ApArticleContent>lambdaQuery().eq(ApArticleContent::getArticleId, dto.getId()));
            apArticleContent.setContent(dto.getContent());
            apArticleContentMapper.updateById(apArticleContent);
        }
        // Asynchronously generate the static file and upload it to MinIO
        articleFreemarkerService.buildArticleToMinIO(apArticle, dto.getContent());
        // 3. Return the article id
        return ResponseResult.okResult(apArticle.getId());
    }
⑸. Application
Edit heima-leadnews-service/heima-leadnews-article/src/main/java/com/heima/article/ArticleApplication.java:
@EnableAsync