go-ocr 0.4.2

go-ocr is a tool for extracting plain text from scanned documents in PDF or DjVu format, and for postprocessing that text with user-defined rewriting rules to remove OCR artefacts and irregularities.
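
As a rough illustration of what rule-based postprocessing of OCR text can look like, here is a minimal Go sketch. It is not taken from go-ocr: the `Rule` type, the `ExampleRules` and `ApplyRules` functions, and the three regular-expression rules are all assumptions made for this example, and go-ocr's actual rule format may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule is a hypothetical rewriting rule: a regular expression plus the
// text that replaces each match. It is not go-ocr's own rule format.
type Rule struct {
	Pattern     *regexp.Regexp
	Replacement string
}

// ExampleRules returns a few illustrative rules for common OCR artefacts.
func ExampleRules() []Rule {
	return []Rule{
		// Rejoin words that were hyphenated across a line break.
		{regexp.MustCompile(`(\pL)-\n(\pL)`), "${1}${2}"},
		// Collapse runs of spaces and tabs into a single space.
		{regexp.MustCompile(`[ \t]{2,}`), " "},
		// Expand the "ﬁ" ligature into plain letters.
		{regexp.MustCompile(`ﬁ`), "fi"},
	}
}

// ApplyRules runs every rule over the text, in order, and returns the result.
func ApplyRules(text string, rules []Rule) string {
	for _, r := range rules {
		text = r.Pattern.ReplaceAllString(text, r.Replacement)
	}
	return text
}

func main() {
	raw := "The scan-\nned   page was ﬁne."
	fmt.Println(ApplyRules(raw, ExampleRules()))
}
```

Rules are applied in sequence, so their order matters: in this sketch the hyphenation rule runs before whitespace is collapsed.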

Tags ocr go
License BSDL
State beta

Recent Releases

0.4.2  03 Aug 2016 03:15  minor feature: in the call to os.OpenFile(). Error handling rationalised. Version increment.
0.4.1  25 Jul 2016 03:15  minor feature: Added check for older versions of pdfimages. Version increment.
0.4.0  18 Jul 2016 19:19  major feature: Added support for djvu files; project renamed to go-ocr.