Tags: c++, c, ocr, library, cli
4.0.0-beta.3 24 Jun 2018 14:25 minor feature: Remove more header files from public API.
4.0.0-beta.2 20 Jun 2018 03:17 minor feature: Download the leptonica source from github... Add new line to a few error messages. filenames in comments. from pull of cleanups: clang tidied, reviewed, new, ... Added script-specific validation and normalization for virama-using s... build broken by previous commits that added use of string in lo... Deleted some dead LSTM code, making everything use the recoder. Removed changes from last commit that didn't belong. Move LSTM unicharset and recoder to traineddata with version string p... type of bit values. wrong data type in argument for sscanf. Remove extra semicolons. windows build. regression of. PangoFontInfo: Remove unused method is_fraktur. PangoFontInfo: Remove unused method is_monospace. PangoFontInfo: Remove unused method is_smallcaps. PangoFontInfo: Remove unused method is_bold. PangoFontInfo: Remove unused method is_italic. Use lept_free to free memory allocated by Leptonica. regression of again! BestPix to always return the highest resolution available, even... Removed unnecessary using statements and cleaned up google/non-google... Important to RTL languages saves last space on each line, which w... clang tidy on previous pull. Add googletest submodule. cmake: Add googletest. googletest: Add dummy test. Changed the way unicharsets are handled to allow support for the ch... Rewrote the recoder to use an encoding based on wubi instead of radic... Define std::max under VS2017 x64. Part 2 of separating out the unicharset from the LSTM model, ing c... Added ADAM optimizer, unless git screwed it up, cos there is no diff. Removed errors introduced by git merge. Added AVX2 and AVX512 detector. Added convert to int and directory listing to combine_tessdata.
4.0.0-beta.1 11 Mar 2018 19:05 minor feature: Remove unused method TessdataManager::OverwriteEntry. Remove unused method TessdataManager::LoadFileLater. crash if output file could not be opened. : cleanup. : inside main() use return rather than exit. Improve robustness of TessdataManager. automake: Enable all warnings and a warning. genericvector: Add overloaded LoadDataFromFile. Remove unneeded null pointer check. Replace Standard C library header files by C++ header files. Remove obsolete comments and unused code from ccutil/host.h. EquationDetect: Remove unneeded new / delete operations. and improve Dockerfile. opencl: Remove more unused code. README: Add Coverity badge. Update README.md. Reduce number of new / delete operations for class KDTreeSearch. Reduce number of new / delete operations for class LanguageModel. UNICHARSET: Add missing initialization. Optimize LSTM code for builds without OpenMP. use correct name for Mac OS X, correct link to training wiki;. Update documentation for installation. Reorganize Readme.md. Update Template. Add link to `the guidelines for this repository`. Add link to guidelines for this repository. Add badges for Doxygen and Wiki documentation. typo. Update readme for 3.05.01. StringRenderer::pen_color_: int 3 - double 3... Change Mac OS X - macOS. PangoFontInfo: Remove unused method is_fraktur. Remove strcasestr which is no longer needed. PangoFontInfo: Remove unused method is_monospace. PangoFontInfo: Remove unused method is_smallcaps. PangoFontInfo: Remove unused method is_bold. PangoFontInfo: Remove unused method is_italic. Make less verbose. opencl: Remove unused code. opencl: some compiler warnings. LSTMTrainer: Catch empty vectors. Update from Leptonica 1.74.1 to 1.74.2. Travis CI for Leptonica 1.74.2. Remove local implementation of
3.05.01 02 Jun 2017 06:39 major bugfix: Bugfix release for stable tesseract version
3.05.00 17 Feb 2017 11:05 minor feature: Made some fine tuning to the hOCR output. Added TSV as another optional output format. Fixed ABI break introduced in 3.04.00 with the AnalyseLayout() method. text2image tool - Enable all OpenType ligatures available in a font. This feature requires Pango 1.38 or newer. Training tools - Replaced asserts with tprintf() and exit(1). Fixed Cygwin compatibility. Improved multipage tiff processing. Improved the embedded pdf font (pdf.ttf). Enable selection of OCR engine mode from command line. Changed tesseract command line parameter '-psm' to '--psm'. Added new C API for orientation and script detection, removed the old one. Increased minimum autoconf version to 2.59. Removed dead code. Fixed many compiler warnings. Fixed memory and resource leaks. Fixed some issues with the 'Cube' OCR engine. Fixed some OpenCL issues. Added option to build Tesseract with CMake build system. Implemented CPPAN support for easy Windows building.
4.00.00alpha 16 Dec 2016 09:05 minor feature: Remove unneeded definition for NULL. Use different font list and exposures for "lat" language training. Add info for progress monitor, make it visible in doxygen doc; remove?. Add Junicode to neo-Latin fonts. Update ci scripts. Test release build on windows. Update appveyor.yml. Update appveyor.yml. Update appveyor.yml. Training should work now. Update .travis.yml. Update appveyor.yml. Update CMakeLists.txt. Update .travis.yml. Merge branch 'master' of github.com:tesseract-ocr/tesseract. Update CMakeLists.txt. Update leptonica version. Update .travis.yml. Update appveyor.yml. Merge branch 'master' of github.com-egorpugin:egorpugin/tesseract. Update CMakeLists.txt. Improve leptonica search. Make box training work. Compatibility with Leptonica 1.73. Add more include directories. Merge branch 'master' of github.com:tesseract-ocr/tesseract. Update README.md. Update README.md. Update README.md. Replace pdf.ttf with sharp2.ttf, keep name the same. Document hocr_font_info in config. INCOMPATIBLE to hOCR line height information -. varsize array for Microsoft compiler. Only generate dir for HOCR when needed -. Emit fewer "lang" attributes. Add LTR mixed direction test files. Update README.md. compiler warning (signed / unsigned mismatch). Adds char GetHOCRTSVText(int) as placeholder. Copy of char GetHOCRT?. Adds TessHOcrTsvRenderer class for rendering HOCR info in tsv format. Calls TessHOcrTsvRenderer if tessedit_create_hocrtsv is true. Adds hocrtsv file to configs folder. Adds hocrtsv to tessdata/configs/Makefile.am. Adds BoolParam tessedit_create_hocrtsv in class Tesseract. Render output in TSV format. Avoids HTML escaping. Cleanup TSV renderer. hocrtsv references in Makefile. Add inactivity timeout for icu download on windows. move new delete histogramAllChannels inside the #ifdef USE_OPENCL; fi?. Update INSTALL.GIT.md. improve tesseract.pc.in -. solve segfault for box.train;. update Release Notes. Don't display tesseract's banner when quiet
3.04.01 17 Feb 2016 10:45 minor feature: Add check for opencl requirements. Rework opencl requirements (configure: error: conditional "AMDEP"?. Typo. GRAPHICS_DISABLED build. Strcasestr needed on Cygwin too. Libicui18n is only called libicuin on mingw, not cygwin. Implement build without cube (-DNO_CUBE_BUILD). Tessedit_create_txt 0 blocks box training. Memory leak based on (https://code.google.com/p/tesse?. Remove empty header file secname.h. Replace CubeUtils::UTF8ToUTF32 in pdfrenderer. Enable pdfrender with NO_CUBE_BUILD. NO_CUBE_BUILD with reverting to ANDROID_BUILD in baseapi. Improve NO_CUBE_BUILD. in UTF-16BE conversion. Remove extraneous line feed. VC14 compiler. Enable OpenMP support. Turn off optimisation in Microsoft Visual Studio for TextlineProjecti?. Rename README to README.md -. Remove info about VS 2008. to compile tesseract on mac with clang. For OpenCL reported on Apple Mac. Still get -54 on Apple?. VS2010 build. OpenCL build on Mac. Configure.ac for OS X and -framework. Missing "allheaders.h" when compiling with --enable-opencl on OS X. Various clang compilation errors. Get OpenCL to compile on OS X. Configure.ac unconditionally enabling OpenCL. Add ULL to constants which overflow 32 bits. Simplify build and run of ScrollView. Tesstrain.sh: Only fall back to default Latin fonts if none were prov?. Tesstrain.sh: Only set FONTS if they weren't set on the command line. Tesstrain.sh: Initialise fontconfig even if Arial isn't available. Remove --bin_dir option from tesstrain.sh (should use PATH instead). Add --exposures option to tesstrain.sh. Use mktemp to create workspace directory. COPYING: typo found by codespell. Api: typos in comments (all found by codespell). Ccmain: typos in comments and strings. Typo. Ccstruct: typos in comments and strings. Ccutil: typos in comments and strings. Classify: typos in comments and strings. Cube: typos in comments. Cutil: typos in comments. Dict: typos in comments and strings. Doxyfile: typo in comment. Java: typos in comments and strings. Wordrec: ty
3.04.01dev 25 Aug 2015 03:45 minor feature: Added OpenCL support (experimental). Many.
3.04.00 20 Aug 2015 08:26 minor feature: Tesseract development is now done with git and hosted at github.com (previously we used Subversion as a VCS and code.google.com for hosting). Tesseract now requires leptonica 1.71 or a higher version. Removed official support for VS 2008. Added support for many more scripts/languages. Major updates to training system as a result of extensive testing on 100 languages. Improved performance with PIC compilation option. Significant change to invisible font system in pdf output to improve correctness and compatibility with external programs, particularly ghostscript. Improved font identification. Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc. Fixed problems with shifted baselines so recognition can recover from layout analysis errors. Major refactor to improve speed on difficult images, especially when running a heap checker. Moved params from global in page layout to tesseractclass. Improved single column layout analysis. Allow ocr output to multiple formats using tesseract command line executable. Fixed issues with mixed eng+ara scripts. Improved script consistency in numbers. Major refactor of control.cpp to enable line recognition. Added tesstrain.sh - a master training script. Added ability to text2image training tool to just list available fonts. Added ability to text2image to underline words. Improved efficiency of image processing for PDF output. Added parameter description for each parameter listed with 'print-parameters' command line option. Added font info to hocr output. Enabled streaming input and output of multi-page documents. Many bug fixes.