72 research outputs found

    Iris : a solution for executing handwritten code

    This paper presents a novel approach to executing handwritten code, a solution named Iris. My research spans mobile app development, handwriting recognition, optical and intelligent character recognition (OCR and ICR), machine learning, and related areas of computer science such as domain-specific languages (DSLs). The paper details a system in which one can author code using only a writing utensil (such as a pen), scratch paper (such as a napkin), and a smartphone. Iris uses cloud services to process an image of handwritten code and return the result to the user. My results show that Iris was able to execute handwritten scripts, with varying levels of observed accuracy. Future work includes adding more layers of machine learning and further pre-processing of images prior to OCR.

    A comparative study of optical character recognition in health information system

    MHealth Institutes are transitioning from documents in physical format to documents in digital format. It is important to develop applications that help health professionals with this transition. An application to aid the digitization of documents was developed using a Python library. To help decide which library to use, a study was made of the precision and execution speed of PyOCR, PyTesseract and TesseOCR. Funding: FCT - Fundação para a Ciência e a Tecnologia.
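A comparison of this kind can be sketched as a small timing and accuracy harness. This is only an illustration under stated assumptions, not the study's method: the `benchmark` and `char_accuracy` helpers are hypothetical names, the accuracy score is a `difflib` similarity ratio standing in for whatever precision measure the study used, and the two engines are stand-in functions rather than real OCR library calls.

```python
import time
from difflib import SequenceMatcher


def char_accuracy(expected: str, actual: str) -> float:
    """Similarity ratio in [0, 1] between ground truth and OCR output.

    Assumption: a stand-in for the study's precision measure.
    """
    return SequenceMatcher(None, expected, actual).ratio()


def benchmark(engines, samples):
    """Run every engine over every sample and collect mean accuracy and speed.

    engines: dict mapping a name to a callable(image) -> str
    samples: non-empty list of (image, ground_truth_text) pairs
    """
    results = {}
    for name, recognise in engines.items():
        start = time.perf_counter()
        outputs = [recognise(image) for image, _ in samples]
        elapsed = time.perf_counter() - start
        scores = [
            char_accuracy(truth, out)
            for (_, truth), out in zip(samples, outputs)
        ]
        results[name] = {
            "mean_accuracy": sum(scores) / len(scores),
            "seconds_per_page": elapsed / len(samples),
        }
    return results


# Hypothetical stand-in engines: the "image" here is already a string,
# so no OCR library is needed to run the sketch.
def perfect_engine(image):
    return image


def noisy_engine(image):
    return image.replace("e", "c")  # simulate a systematic misread


samples = [
    ("patient record", "patient record"),
    ("blood pressure", "blood pressure"),
]
report = benchmark({"perfect": perfect_engine, "noisy": noisy_engine}, samples)
for name, stats in report.items():
    print(name, stats)
```

To compare real libraries, each stand-in would be replaced by a thin wrapper that takes an image and returns the recognised text, with the same set of scanned pages passed to every wrapper.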

    Assessment of OCR Quality and Font Identification in Historical Documents

    Mass digitization of historical documents is a challenging problem for optical character recognition (OCR) tools. Issues include noisy backgrounds and faded text due to aging, border/marginal noise, bleed-through, skewing, warping, as well as irregular fonts and page layouts. As a result, OCR tools often produce a large number of spurious bounding boxes (BBs) in addition to those that correspond to words in the document. To improve the OCR output, in this thesis we develop machine-learning methods to assess the quality of historical documents and label/tag documents (with the page problems) in the EEBO/ECCO collections, 45 million pages available through the Early Modern OCR Project at Texas A&M University. We present an iterative classification algorithm to automatically label BBs (i.e., as text or noise) based on their spatial distribution and geometry. The approach uses a rule-based classifier to generate initial text/noise labels for each BB, followed by an iterative classifier that refines the initial labels by incorporating information local to each BB: its spatial location, shape and size. When evaluated on a dataset containing over 72,000 manually labeled BBs from 159 historical documents, the algorithm can classify BBs with 0.95 precision and 0.96 recall. Further evaluation on a collection of 6,775 documents with ground-truth transcriptions shows that the algorithm can also be used to predict document quality (0.7 correlation) and improve OCR transcriptions in 85% of the cases. This thesis also aims at generating font metadata for historical documents. Knowledge of the font can help an OCR system produce very accurate text transcriptions, but getting font information for 45 million documents is a daunting task. We present an active-learning-based font identification system that can classify document images into fonts. In active learning, a learner queries the human for labels on the examples it finds most informative. We capture the characteristics of the fonts using word image features related to character width, angled strokes, and Zernike moments. To extract page-level features, we use a bag-of-word feature (BoF) model. A font classification model trained using BoF and active learning requires only 443 labeled instances to achieve 89.3% test accuracy.
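The two-stage idea of labeling bounding boxes, a rule-based first pass followed by iterative refinement from local context, can be sketched roughly as follows. This is a minimal illustration and not the thesis's algorithm: the `Box` class, the height and aspect-ratio thresholds, the neighbourhood radius, and the median-height refinement rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from math import hypot
from statistics import median


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float
    label: str = "noise"

    @property
    def centre(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def initial_label(box, min_h=8, max_h=60, max_aspect=15):
    """Rule-based first pass using geometry only (thresholds are assumed)."""
    aspect = box.w / box.h if box.h else float("inf")
    plausible = min_h <= box.h <= max_h and aspect <= max_aspect
    return "text" if plausible else "noise"


def text_neighbours(box, boxes, radius):
    """Boxes currently labeled text whose centres lie within `radius`."""
    cx, cy = box.centre
    return [
        b for b in boxes
        if b is not box and b.label == "text"
        and hypot(b.centre[0] - cx, b.centre[1] - cy) <= radius
    ]


def refine_once(boxes, radius):
    """One synchronous refinement pass; returns the new label list."""
    new_labels = []
    for box in boxes:
        near = text_neighbours(box, boxes, radius)
        if not near:
            new_labels.append("noise")  # isolated boxes are treated as noise
            continue
        ratio = box.h / median(n.h for n in near)
        # Keep boxes whose height resembles that of nearby text.
        new_labels.append("text" if 0.5 <= ratio <= 2.0 else "noise")
    return new_labels


def classify(boxes, radius=250, max_iters=10):
    for box in boxes:
        box.label = initial_label(box)
    for _ in range(max_iters):
        new_labels = refine_once(boxes, radius)
        if new_labels == [b.label for b in boxes]:
            break  # labels stopped changing
        for box, label in zip(boxes, new_labels):
            box.label = label
    return boxes


page = classify([
    Box(10, 100, 60, 20),    # ordinary word
    Box(80, 100, 50, 20),    # ordinary word
    Box(140, 100, 320, 20),  # very wide word: fails the aspect rule at first
    Box(90, 125, 4, 3),      # speck close to the text line
    Box(900, 800, 30, 20),   # word-sized blob far from any text
])
print([b.label for b in page])
```

In this toy page the wide box is rescued by its neighbours and the isolated blob is demoted, which is the kind of correction the refinement stage is meant to provide. A real system would need tuned thresholds and richer local features than height alone.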