Recent Releases

3.05.00 17 Feb 2017 11:05 minor feature: Fine-tuned the hOCR output. Added TSV as another optional output format. Fixed ABI break introduced in 3.04.00 with the AnalyseLayout() method. text2image tool: enable all OpenType ligatures available in a font (requires Pango 1.38 or newer). Training tools: replaced asserts with tprintf() and exit(1). Fixed Cygwin compatibility. Improved multipage TIFF processing. Improved the embedded PDF font (pdf.ttf). Enabled selection of OCR engine mode from the command line. Changed tesseract command line parameter '-psm' to '--psm'. Added new C API for orientation and script detection, removed the old one. Increased minimum autoconf version to 2.59. Removed dead code. Fixed many compiler warnings. Fixed memory and resource leaks. Fixed some issues with the 'Cube' OCR engine. Fixed some OpenCL issues. Added option to build Tesseract with the CMake build system. Implemented CPPAN support for easy Windows building.
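The command-line changes listed for 3.05.00 (the '--psm' spelling, engine mode selection, and TSV as an output format) can be illustrated with a small helper that assembles a tesseract invocation. This is only a sketch: the helper name `build_tesseract_cmd` and the file names are hypothetical, the `--oem` flag and the `tsv`/`hocr`/`pdf` config names come from the tesseract command line rather than from the notes above, and the command is built but not executed here.

```python
def build_tesseract_cmd(image, output_base, psm=None, oem=None, formats=()):
    """Build an argv list for the tesseract CLI using 3.05-style options.

    image, output_base: input image path and output base name (no extension).
    psm: page segmentation mode, passed as '--psm' (formerly '-psm').
    oem: OCR engine mode, passed as '--oem' (assumed flag name, see lead-in).
    formats: config names such as 'tsv', 'hocr' or 'pdf' that select outputs.
    """
    cmd = ["tesseract", str(image), str(output_base)]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    if oem is not None:
        cmd += ["--oem", str(oem)]
    # Config names go last; each one enables an additional output format.
    cmd += list(formats)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_tesseract_cmd("page.png", "page", psm=6, formats=["tsv"])
    print(" ".join(cmd))  # prints: tesseract page.png page --psm 6 tsv
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the OCR, assuming a tesseract 3.05 or newer binary is on the PATH.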
4.00.00alpha 16 Dec 2016 09:05 minor feature: Remove unneeded definition for NULL. Use different font list and exposures for "lat" language training. Add info for progress monitor, make it visible in doxygen doc; remove?. Add Junicode to neo-Latin fonts. Update CI scripts. Test release build on Windows. Update appveyor.yml. Training should work now. Update .travis.yml. Update appveyor.yml. Update CMakeLists.txt. Update .travis.yml. Merge branch 'master' of github.com:tesseract-ocr/tesseract. Update CMakeLists.txt. Update leptonica version. Update .travis.yml. Update appveyor.yml. Merge branch 'master' of github.com-egorpugin:egorpugin/tesseract. Update CMakeLists.txt. Improve leptonica search. Make box training work. Compatibility with Leptonica 1.73. Add more include directories. Merge branch 'master' of github.com:tesseract-ocr/tesseract. Update README.md. Replace pdf.ttf with sharp2.ttf, keep name the same. Document hocr_font_info in config. INCOMPATIBLE fix to hOCR line height information. Fix varsize array for Microsoft compiler. Only generate dir for hOCR when needed. Emit fewer "lang" attributes. Add LTR mixed direction test files. Update README.md. Fix compiler warning (signed/unsigned mismatch). Adds char GetHOCRTSVText(int) as placeholder. Copy of char GetHOCRT?. Adds TessHOcrTsvRenderer class for rendering hOCR info in TSV format. Calls TessHOcrTsvRenderer if tessedit_create_hocrtsv is true. Adds hocrtsv file to configs folder. Adds hocrtsv to tessdata/configs/Makefile.am. Adds BoolParam tessedit_create_hocrtsv in class Tesseract. Render output in TSV format. Avoids HTML escaping. Cleanup TSV renderer. Fix hocrtsv references in Makefile. Add inactivity timeout for ICU download on Windows. Move new delete histogramAllChannels inside the #ifdef USE_OPENCL; fi?. Update INSTALL.GIT.md. Improve tesseract.pc.in. Solve segfault for box.train. Update release notes. Don't display tesseract's banner when quiet.
3.04.01 17 Feb 2016 10:45 minor feature: Add check for OpenCL requirements. Rework OpenCL requirements (configure: error: conditional "AMDEP"?. Fix typo. Fix GRAPHICS_DISABLED build. strcasestr needed on Cygwin too. libicui18n is only called libicuin on MinGW, not Cygwin. Implement build without cube (-DNO_CUBE_BUILD). tessedit_create_txt 0 blocks box training. Fix memory leak based on (https://code.google.com/p/tesse?. Remove empty header file secname.h. Replace CubeUtils::UTF8ToUTF32 in pdfrenderer. Enable pdfrender with NO_CUBE_BUILD. Fix NO_CUBE_BUILD with reverting to ANDROID_BUILD in baseapi. Improve NO_CUBE_BUILD. Fix UTF-16BE conversion. Remove extraneous line feed. Fix for VC14 compiler. Enable OpenMP support. Turn off optimisation in Microsoft Visual Studio for TextlineProjecti?. Rename README to README.md. Remove info about VS 2008. Fixes to compile tesseract on Mac with clang. Fixes for OpenCL reported on Apple Mac. Still get -54 on Apple?. Fix VS2010 build. Fix OpenCL build on Mac. Fix configure.ac for OS X and -framework. Fix missing "allheaders.h" when compiling with --enable-opencl on OS X. Fix various clang compilation errors. Get OpenCL to compile on OS X. Fix configure.ac unconditionally enabling OpenCL. Add ULL to constants which overflow 32 bits. Simplify build and run of ScrollView. tesstrain.sh: Only fall back to default Latin fonts if none were prov?. tesstrain.sh: Only set FONTS if they weren't set on the command line. tesstrain.sh: Initialise fontconfig even if Arial isn't available. Remove --bin_dir option from tesstrain.sh (should use PATH instead). Add --exposures option to tesstrain.sh. Use mktemp to create workspace directory. COPYING: Fix typo found by codespell. api: Fix typos in comments (all found by codespell). ccmain: Fix typos in comments and strings. Fix typo. ccstruct: Fix typos in comments and strings. ccutil: Fix typos in comments and strings. classify: Fix typos in comments and strings. cube: Fix typos in comments. cutil: Fix typos in comments. dict: Fix typos in comments and strings. Doxyfile: Fix typo in comment. java: Fix typos in comments and strings. wordrec: ty
3.04.01dev 25 Aug 2015 03:45 minor feature: Added OpenCL support (experimental). Many fixes.
3.04.00 20 Aug 2015 08:26 minor feature: Tesseract development is now done with git and hosted at github.com (previously Subversion was used as the VCS and code.google.com for hosting). Tesseract now requires leptonica 1.71 or higher. Removed official support for VS 2008. Added support for many more scripts/languages. Major updates to the training system as a result of extensive testing on 100 languages. Improved performance with the PIC compilation option. Significant change to the invisible font system in PDF output to improve correctness and compatibility with external programs, particularly ghostscript. Improved font identification. Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc. Fixed problems with shifted baselines so recognition can recover from layout analysis errors. Major refactor to improve speed on difficult images, especially when running a heap checker. Moved params from global in page layout to tesseractclass. Improved single column layout analysis. Allow OCR output to multiple formats using the tesseract command line executable. Fixed issues with mixed eng+ara scripts. Improved script consistency in numbers. Major refactor of control.cpp to enable line recognition. Added tesstrain.sh, a master training script. Added ability for the text2image training tool to just list available fonts. Added ability for text2image to underline words. Improved efficiency of image processing for PDF output. Added a description for each parameter listed with the '--print-parameters' command line option. Added font info to hOCR output. Enabled streaming input and output of multi-page documents. Many bug fixes.