5 research outputs found

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. As document volumes continue to grow, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to the OCR of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other aspects of the web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.

    Computer Vision and Image Processing Techniques for Mobile Applications

    Camera phones have penetrated every corner of society and have become a focal point for communications. In our research we extend the traditional use of such devices to help bridge the gap between physical and digital worlds. Their combined image acquisition, processing, storage, and communication capabilities in a compact, portable device make them an ideal platform for embedding computer vision and image processing capabilities in the pursuit of new mobile applications. This dissertation is presented as a series of computer vision and image processing techniques together with their applications on the mobile device. We have developed a set of techniques for ego-motion estimation, enhancement, feature extraction, perspective correction, object detection, and document retrieval that serve as a basis for such applications. Our applications include a dynamic video barcode that can transfer significant amounts of information visually, a document retrieval system that can retrieve documents from low resolution snapshots, and a series of applications for users with visual disabilities, such as a currency reader. Solutions for mobile devices require a fundamentally different approach than traditional vision techniques that run on traditional computers, so we consider user-device interaction and the fact that these algorithms must execute in a resource constrained environment. For each problem we perform both theoretical and empirical analysis in an attempt to optimize performance and usability. The thesis makes contributions related to efficient implementation of image processing and computer vision techniques, analysis of information theory, feature extraction and analysis of low quality images, and device usability.

    Automatic Signature Verification: The State of the Art


    Searching Heterogeneous Document Image Collections

    A decrease in data storage costs and widespread use of scanning devices have led to massive quantities of scanned digital documents in corporations, organizations, and governments around the world. Automatically processing these large heterogeneous collections can be difficult due to considerable variation in resolution, quality, font, layout, noise, and content. In order to make this data available to a wide audience, methods for efficient retrieval and analysis from large collections of document images remain an open and important area of research. In this proposal, we present research in three areas that augment the current state of the art in the retrieval and analysis of large heterogeneous document image collections. First, we explore an efficient approach to document image retrieval, which allows users to perform retrieval against large image collections in a query-by-example manner. Our approach is compared to text retrieval on OCR output for a collection of 7 million document images collected from lawsuits against tobacco companies. Next, we present research in document verification and change detection, where one may want to quickly determine if two document images contain any differences (document verification) and if so, to determine precisely what and where changes have occurred (change detection). A motivating example is legal contracts, where scanned images are often e-mailed back and forth and small changes can have severe ramifications. Finally, approaches useful for exploiting the biometric properties of handwriting in order to perform writer identification and retrieval in document images are examined.

    Document image retrieval using signatures as queries

    In searching a repository of business documents, a task of interest is that of using a query signature image to retrieve other signatures matching the query from a database. The signature retrieval task involves a two-step process of extracting all the signatures from the documents and then performing a match on these signatures. This paper presents a novel signature retrieval strategy, which includes a technique for noise and printed text removal from signature images previously extracted from business documents. Signature matching is based on a normalized correlation similarity measure using global shape-based binary feature vectors. In a retrieval task involving a database of 447 signatures, on average 4.43 out of the top 5 choices were signatures belonging to the writer of the queried signature. Considering the top 10 ranks, an F-measure value of 76.3 was obtained, and the precision and recall values at this F-measure were 74.5% and 78.28%, respectively.
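As a rough illustration of the two quantities this abstract mentions, the sketch below computes the balanced F-measure from a precision and recall pair, and a correlation score between two binary feature vectors. The abstract does not give the exact form of its normalized correlation measure, so the Pearson (phi) correlation used here is an assumption, and the function names `f_measure` and `binary_correlation` are hypothetical.

```python
import math


def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_correlation(a, b):
    """Pearson (phi) correlation between two equal-length 0/1 vectors.

    Assumption: this stands in for the paper's normalized correlation
    measure, whose exact definition is not stated in the abstract.
    Returns a value in [-1, 1]; 0.0 if either vector is constant.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    # Count the four possible bit agreements and disagreements.
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


if __name__ == "__main__":
    # Precision and recall as reported in the abstract, in percent.
    print(round(f_measure(74.5, 78.28), 1))
    # Toy feature vectors, invented for illustration only.
    print(binary_correlation([1, 0, 1, 0], [1, 0, 1, 0]))
```

With the reported precision of 74.5% and recall of 78.28%, the harmonic mean rounds to 76.3, which agrees with the F-measure stated in the abstract.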