
Main Interface Functions of Tesseract-ocr

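Tesseract is written in C++ and exposes a set of functions. Prebuilt executables can be downloaded from the official site for testing. The source code is also available, so it can be compiled on different platforms to produce an OCR tool for each platform it supports.
It is a cross-platform tool at the source level: to run it in a different environment, you only need to recompile the source in that environment. In this respect it is somewhat similar to Qt.
In other words, once we have downloaded the source, we can build it on Windows with VS2008 or build it on Linux.
On Linux, for example, installation is done with: ./configure; make; make install.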
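This generates the following under the install directory:
bin: tools and commands;
include: header files;
lib: library files;
share: configuration files and the required language packs.
When porting, copy the .so files in lib to the target machine.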

Main interface functions

Initialization interface functions

(1) int Init(const char* datapath, const char* language, char **configs, int configs_size, bool configs_global_only);
(2) int Init(const char* datapath, const char* language) { return Init(datapath, language, 0, 0, false); }
(3) int InitLangMod(const char* datapath, const char* language);
(4) int InitWithoutLangModel(const char* datapath, const char* language);
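
The Init overloads above all return an int status, so a caller usually checks it before doing any recognition. The following is a minimal sketch of that check, assuming the convention that Init returns 0 on success and a non-zero value (-1) on failure. `StubApi` and `init_engine` are hypothetical names introduced here so that the example compiles without the library, and the path "/usr/local/share/" used in the usage note is only an example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stub with the same Init(datapath, language) shape as the
// two-argument overload listed above. It is NOT the Tesseract class; it
// only pretends that "eng" is the sole installed language.
struct StubApi {
  int Init(const char* datapath, const char* language) {
    assert(datapath != nullptr && language != nullptr);
    return std::strcmp(language, "eng") == 0 ? 0 : -1;
  }
};

// Calls Init on any type that has that member function and converts the
// 0 / non-zero status (assumed convention) into a bool.
template <typename Api>
bool init_engine(Api& api, const std::string& datapath,
                 const std::string& language) {
  return api.Init(datapath.c_str(), language.c_str()) == 0;
}
```

With the stub, `init_engine(api, "/usr/local/share/", "eng")` succeeds and any other language code fails. With the real library, the same helper should accept a `tesseract::TessBaseAPI` object, provided the data path matches the layout expected by the installed version.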

Interface functions for processing the text in an image

(1) char* TesseractRect(const unsigned char* imagedata, int bytes_per_pixel, int bytes_per_line, int left, int top, int width, int height);
(2) void SetImage(const unsigned char* imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line); // supplies the image data and parameters used for text recognition
(3) char* GetUTF8Text();
(4) char* GetUNLVText();
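
The bytes_per_pixel and bytes_per_line parameters describe how the raw buffer is laid out: rows advance by bytes_per_line, which may be larger than width * bytes_per_pixel when rows are padded. The sketch below illustrates that layout only. `PixelBuffer`, `make_buffer` and `pixel_offset` are hypothetical helpers, not part of Tesseract, and the 4-byte row alignment is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical holder for the arguments that SetImage / TesseractRect take.
struct PixelBuffer {
  std::vector<unsigned char> data;
  int width;
  int height;
  int bytes_per_pixel;  // e.g. 1 for 8-bit grayscale, 3 for 24-bit RGB
  int bytes_per_line;   // row stride in bytes, padding included
};

// Allocates a white buffer whose rows are padded to a multiple of `align`.
PixelBuffer make_buffer(int width, int height, int bytes_per_pixel,
                        int align = 4) {
  assert(width > 0 && height > 0 && bytes_per_pixel > 0 && align > 0);
  const int unpadded = width * bytes_per_pixel;
  const int stride = (unpadded + align - 1) / align * align;
  PixelBuffer buf;
  buf.data.assign(static_cast<std::size_t>(stride) * height, 255);
  buf.width = width;
  buf.height = height;
  buf.bytes_per_pixel = bytes_per_pixel;
  buf.bytes_per_line = stride;
  return buf;
}

// Byte offset of pixel (x, y): rows advance by the stride, not the width.
std::size_t pixel_offset(const PixelBuffer& buf, int x, int y) {
  assert(x >= 0 && x < buf.width && y >= 0 && y < buf.height);
  return static_cast<std::size_t>(y) * buf.bytes_per_line +
         static_cast<std::size_t>(x) * buf.bytes_per_pixel;
}
```

A buffer built this way could then be handed over as SetImage(buf.data.data(), buf.width, buf.height, buf.bytes_per_pixel, buf.bytes_per_line). The text returned by GetUTF8Text is allocated by the library, so the caller is expected to release it with delete [].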

Interface functions for reading an image

(1) inT8 IMAGE::read_header(const char *name);
(2) inT32 check_legal_image_size(  // get rest of image
    inT32 x,               // x size required
    inT32 y,               // y size required
    inT8 bits_per_pixel    // bpp required
);
(3) inT8 read(inT32 buflines);

Image manipulation and parameter getters

(1) void invert_image( /* invert the image */ IMAGE *image /* image to invert */ );
(2) inT32 get_xsize() {
    return xsize;
}
(3) inT32 get_ysize() {
    return ysize;
}
(4) inT8 get_bpp() {
    return bpp;
}
(5) inT8 get_bps() {
    return bps;
}
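
To make the getters and the inversion step above more concrete, here is a small self-contained sketch. `GrayImage` and `invert` are hypothetical stand-ins written for this example and are not Tesseract's IMAGE class or its invert_image function. The sketch assumes an 8-bit grayscale image, so bits per pixel and bits per sample are both 8 and inverting a pixel means replacing its value v with 255 - v.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit grayscale image that mirrors the getter names above.
class GrayImage {
 public:
  GrayImage(std::int32_t xsize, std::int32_t ysize, std::uint8_t fill = 0)
      : xsize_(xsize), ysize_(ysize) {
    assert(xsize > 0 && ysize > 0);
    pixels_.assign(static_cast<std::size_t>(xsize) * ysize, fill);
  }

  std::int32_t get_xsize() const { return xsize_; }  // width in pixels
  std::int32_t get_ysize() const { return ysize_; }  // height in pixels
  std::int8_t get_bpp() const { return 8; }          // bits per pixel
  std::int8_t get_bps() const { return 8; }          // bits per sample

  // Mutable access to the pixel at column x, row y.
  std::uint8_t& at(std::int32_t x, std::int32_t y) {
    assert(x >= 0 && x < xsize_ && y >= 0 && y < ysize_);
    return pixels_[static_cast<std::size_t>(y) * xsize_ + x];
  }

 private:
  std::int32_t xsize_;
  std::int32_t ysize_;
  std::vector<std::uint8_t> pixels_;
};

// Inverts every pixel in place: dark becomes light and vice versa.
void invert(GrayImage* image) {
  assert(image != nullptr);
  for (std::int32_t y = 0; y < image->get_ysize(); ++y) {
    for (std::int32_t x = 0; x < image->get_xsize(); ++x) {
      std::uint8_t& p = image->at(x, y);
      p = static_cast<std::uint8_t>(255 - p);
    }
  }
}
```

Applying `invert` twice restores the original pixels, which is a quick sanity check for this kind of operation.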