    Feature recognition in OCR text

    This thesis investigates the recognition and extraction of special word sequences, representing concepts, from OCR text. Unlike general index terms, concepts can consist of one or more terms that, combined, have higher retrieval value than the terms alone (e.g. acronyms, proper nouns, phrases). An algorithm to recognize acronyms and their definitions is presented, along with an evaluation of the algorithm.