Add the dependency:
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>javacv-platform</artifactId>
<version>1.5.5</version>
</dependency>
Add the Simplified Chinese trained data (chi_sim): download chi_sim.traineddata from the tesseract-ocr/tessdata repository and place it in the directory that is passed to the code as dataPath (the example's main method uses the working directory).
https://github.com/tesseract-ocr/tessdata
Code example:
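import java.nio.charset.StandardCharsets;
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.leptonica.global.lept;
import org.bytedeco.tesseract.TessBaseAPI;

public class JavaCVOcr {
    // lng: language code such as "chi_sim"; dataPath: directory holding <lng>.traineddata
    public static String OCR(String lng, String dataPath, String imagePath) {
        TessBaseAPI api = new TessBaseAPI();
        // Init returns 0 on success; stop here if the language data could not be loaded
        if (api.Init(dataPath, lng) != 0) {
            System.err.println("Could not initialize Tesseract for language: " + lng);
            api.End();
            return "";
        }
        // Load the image with Leptonica; pixRead returns null if the file cannot be read
        PIX image = lept.pixRead(imagePath);
        if (image == null) {
            api.End();
            return "";
        }
        api.SetImage(image);
        // GetUTF8Text returns UTF-8 bytes, so decode them explicitly as UTF-8
        BytePointer outText = api.GetUTF8Text();
        String result = new String(outText.getStringBytes(), StandardCharsets.UTF_8);
        // Release the native resources
        api.End();
        outText.deallocate();
        lept.pixDestroy(image);
        return result;
    }

    public static void main(String[] args) {
        // The working directory is used as dataPath, so chi_sim.traineddata must be placed there
        String property = System.getProperty("user.dir");
        String text = OCR("chi_sim", property, "C:\\Users\\Desktop\\1693147958548.png");
        System.out.println(text);
    }
}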
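
A common reason for Init to fail is that the trained data file is not where dataPath points. Below is a minimal sketch of a pre-flight check, assuming the Tesseract 4.x layout in which the .traineddata files sit directly inside dataPath. The class TessdataCheck and its method hasTrainedData are hypothetical helpers written for illustration; they are not part of JavaCV or Tesseract. The sketch also assumes that several languages may be combined with "+", as in "chi_sim+eng".

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TessdataCheck {
    // Hypothetical helper, not part of JavaCV: returns true only if every language
    // in lang (languages may be joined with '+') has a <language>.traineddata
    // file directly inside dataPath.
    public static boolean hasTrainedData(String dataPath, String lang) {
        for (String part : lang.split("\\+")) {
            Path file = Paths.get(dataPath, part + ".traineddata");
            if (!Files.isRegularFile(file)) {
                return false;
            }
        }
        return true;
    }
}
```

Calling TessdataCheck.hasTrainedData(property, "chi_sim") in main before OCR(...) makes it possible to print the expected file location when the data is missing, rather than only seeing the initialization error.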