Optical Character Recognition

Ben Haddej, Dhia elhak; O'Brien, Sean Alexander

Authors: Dhia elhak Ben Haddej
Sean Alexander O'Brien
Publication date: 24 April 2012
Publisher: Digital WPI

Abstract

Our project aimed to understand, utilize, and improve the open-source Optical Character Recognizer (OCR) software OCRopus so that it better handles some of the more complex recognition problems, such as language-specific alphabets and special characters like mathematical symbols. We extended OCRopus to work with any language by adding support for UTF-8 character encoding. We also created a character model and a language model for Hungarian. This allows other users of the software to perform character recognition on Hungarian input without having to train a completely new character model.

