152 research outputs found

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    Get PDF
    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (\eg, signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance

    Advances in Character Recognition

    Get PDF
    This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject

    Academic E-Books: Publishers, Librarians, and Users

    Get PDF
    Academic E-Books: Publishers, Librarians, and Users provides readers with a view of the changing and emerging roles of electronic books in higher education. The three main sections contain contributions by experts in the publisher/vendor arena, as well as by librarians who report on both the challenges of offering and managing e-books and on the issues surrounding patron use of e-books. The case study section offers perspectives from seven different sizes and types of libraries whose librarians describe innovative and thought-provoking projects involving e-books. Read about perspectives on e-books from organizations as diverse as a commercial publisher and an association press. Learn about the viewpoint of a jobber. Find out about the e-book challenges facing librarians, such as the quest to control costs in the patron-driven acquisitions (PDA) model, how to solve the dilemma of resource sharing with e-books, and how to manage PDA in the consortial environment. See what patron use of e-books reveals about reading habits and disciplinary differences. Finally, in the case study section, discover how to promote scholarly e-books, how to manage an e-reader checkout program, and how one library replaced most of its print collection with e-books. These and other examples illustrate how innovative librarians use e-books to enhance users’ experiences with scholarly works.https://docs.lib.purdue.edu/purduepress_ebooks/1036/thumbnail.jp

    Text-detection and -recognition from natural images

    Get PDF
    Text detection and recognition from images could have numerous functional applications for document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text from natural images. Machine learning and deep learning were used to accomplish this task.In this research, we conducted an in-depth literature review on the current detection and recognition methods used by researchers to identify the existing challenges, wherein the differences in text resulting from disparity in alignment, style, size, and orientation combined with low image contrast and a complex background make automatic text extraction a considerably challenging and problematic task. Therefore, the state-of-the-art suggested approaches obtain low detection rates (often less than 80%) and recognition rates (often less than 60%). This has led to the development of new approaches. The aim of the study was to develop a robust text detection and recognition method from natural images with high accuracy and recall, which would be used as the target of the experiments. This method could detect all the text in the scene images, despite certain specific features associated with the text pattern. Furthermore, we aimed to find a solution to the two main problems concerning arbitrarily shaped text (horizontal, multi-oriented, and curved text) detection and recognition in a low-resolution scene and with various scales and of different sizes.In this research, we propose a methodology to handle the problem of text detection by using novel combination and selection features to deal with the classification algorithms of the text/non-text regions. The text-region candidates were extracted from the grey-scale images by using the MSER technique. A machine learning-based method was then applied to refine and validate the initial detection. The effectiveness of the features based on the aspect ratio, GLCM, LBP, and HOG descriptors was investigated. The text-region classifiers of MLP, SVM, and RF were trained using selections of these features and their combinations. The publicly available datasets ICDAR 2003 and ICDAR 2011 were used to evaluate the proposed method. This method achieved the state-of-the-art performance by using machine learning methodologies on both databases, and the improvements were significant in terms of Precision, Recall, and F-measure. The F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that the use of a suitable feature combination and selection approach could significantly increase the accuracy of the algorithms.A new dataset has been proposed to fill the gap of character-level annotation and the availability of text in different orientations and of curved text. The proposed dataset was created particularly for deep learning methods which require a massive completed and varying range of training data. The proposed dataset includes 2,100 images annotated at the character and word levels to obtain 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool has been proposed to support the proposed dataset. The missing of object detection augmentation tool encroach to proposed tool which has the ability to update the position of bounding boxes after applying transformations on images. This technique helps to increase the number of samples in the dataset and reduce the time of annotations where no annotation is required. The final part of the thesis presents a novel approach for text spotting, which is a new framework for an end-to-end character detection and recognition system designed using an improved SSD convolutional neural network, wherein layers are added to the SSD networks and the aspect ratio of the characters is considered because it is different from that of the other objects. Compared with the other methods considered, the proposed method could detect and recognise characters by training the end-to-end model completely. The performance of the proposed method was better on the proposed dataset; it was 90.34. Furthermore, the F-measure of the method’s accuracy on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively. On ICDAR13, the method achieved the second-best accuracy. The proposed method could spot text in arbitrarily shaped (horizontal, oriented, and curved) scene text.</div

    Academic E-Books

    Get PDF
    Academic E-Books: Publishers, Librarians, and Users provides readers with a view of the changing and emerging roles of electronic books in higher education. The three main sections contain contributions by experts in the publisher/vendor arena, as well as by librarians who report on both the challenges of offering and managing e-books and on the issues surrounding patron use of e-books. The case study section offers perspectives from seven different sizes and types of libraries whose librarians describe innovative and thought-provoking projects involving e-books

    Non-Intrusive Continuous User Authentication for Mobile Devices

    Get PDF
    The modern mobile device has become an everyday tool for users and business. Technological advancements in the device itself and the networks that connect them have enabled a range of services and data access which have introduced a subsequent increased security risk. Given the latter, the security requirements need to be re-evaluated and authentication is a key countermeasure in this regard. However, it has traditionally been poorly served and would benefit from research to better understand how authentication can be provided to establish sufficient trust. This thesis investigates the security requirements of mobile devices through literature as well as acquiring the user’s perspectives. Given the findings it proposes biometric authentication as a means to establish a more trustworthy approach to user authentication and considers the applicability and topology considerations. Given the different risk and requirements, an authentication framework that offers transparent and continuous is developed. A thorough end-user evaluation of the model demonstrates many positive aspects of transparent authentication. The technical evaluation however, does raise a number of operational challenges that are difficult to achieve in a practical deployment. The research continues to model and simulate the operation of the framework in an controlled environment seeking to identify and correlate the key attributes of the system. Based upon these results and a number of novel adaptations are proposed to overcome the operational challenges and improve upon the impostor detection rate. The new approach to the framework simplifies the approach significantly and improves upon the security of the system, whilst maintaining an acceptable level of usability

    A large vocabulary online handwriting recognition system for Turkish

    Get PDF
    Handwriting recognition in general and online handwriting recognition in particular has been an active research area for several decades. Most of the research have been focused on English and recently on other scripts like Arabic and Chinese. There is a lack of research on recognition in Turkish text and this work primarily fills that gap with a state-of-the-art recognizer for the first time. It contains design and implementation details of a complete recognition system for recognition of Turkish isolated words. Based on the Hidden Markov Models, the system comprises pre-processing, feature extraction, optical modeling and language modeling modules. It considers the recognition of unconstrained handwriting with a limited vocabulary size first and then evolves to a large vocabulary system. Turkish script has many similarities with other Latin scripts, like English, which makes it possible to adapt strategies that work for them. However, there are some other issues which are particular to Turkish that should be taken into consideration separately. Two of the challenging issues in recognition of Turkish text are determined as delayed strokes which introduce an extra source of variation in the sequence order of the handwritten input and high Out-of-Vocabulary (OOV) rate of Turkish when words are used as vocabulary units in the decoding process. This work examines the problems and alternative solutions at depth and proposes suitable solutions for Turkish script particularly. In delayed stroke handling, first a clear definition of the delayed strokes is developed and then using that definition some alternative handling methods are evaluated extensively on the UNIPEN and Turkish datasets. The best results are obtained by removing all delayed strokes, with up to 2.13% and 2.03% points recognition accuracy increases, over the respective baselines of English and Turkish. The overall system performances are assessed as 86.1% with a 1,000-word lexicon and 83.0% with a 3,500-word lexicon on the UNIPEN dataset and 91.7% on the Turkish dataset. Alternative decoding vocabularies are designed with grammatical sub-lexical units in order to solve the problem of high OOV rate. Additionally, statistical bi-gram and tri-gram language models are applied during the decoding process. The best performance, 67.9% is obtained by the large stem-ending vocabulary that is expanded with a bi-gram model on the Turkish dataset. This result is superior to the accuracy of the word-based vocabulary (63.8%) with the same coverage of 95% on the BOUN Web Corpus

    Punishing Sexual Fantasy

    Full text link
    The Internet has created unprecedented opportunities for adults and teenagers to explore their sexual identities, but it has also created new ways for the law to monitor and punish a diverse range of taboo sexual communication. A young mother loses custody of her two children due to sexually explicit Facebook conversations. A teenager is prosecuted for child pornography crimes after sending a naked selfie to her teenage boyfriend. An NYPD officer is convicted for conspiracy to kidnap several women based on conversations he had on a “dark fetish” fantasy website. In each of these cases, online sexual exploration and fantasy easily convert into damning evidence admissible in court. This Article reveals a widespread and overlooked pattern of harshly punishing individuals for exploring their sexual fantasies on the Internet. It shows, for the first time, that judges and juries have repeatedly conflated sexual fantasy with harmful criminal conduct, have largely been dismissive of fantasy-based defenses, and have relaxed evidentiary standards to prejudice individuals whose desires provoke disapproval or disgust. Even as celebrated decisions by the United States Supreme Court provide broader constitutional protection to sexual minorities, this Article shows that actual venues for exploring sexuality remain on the social and legal margins. Drawing from recent criminal law, family law, and First Amendment cases, this Article shows that courts have struggled to adapt free speech, privacy, and due process principles to the uncomfortable realities of the digital environment
    • …
    corecore