9,401 research outputs found

    Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition

    Get PDF
    Online handwritten Chinese text recognition (OHCTR) is a challenging problem as it involves a large-scale character set, ambiguous segmentation, and variable-length input sequences. In this paper, we exploit the outstanding capability of path signature to translate online pen-tip trajectories into informative signature feature maps using a sliding window-based method, successfully capturing the analytic and geometric properties of pen strokes with strong local invariance and robustness. A multi-spatial-context fully convolutional recurrent network (MCFCRN) is proposed to exploit the multiple spatial contexts from the signature feature maps and generate a prediction sequence while completely avoiding the difficult segmentation problem. Furthermore, an implicit language model is developed to make predictions based on semantic context within a predicting feature sequence, providing a new perspective for incorporating lexicon constraints and prior knowledge about a certain language in the recognition procedure. Experiments on two standard benchmarks, Dataset-CASIA and Dataset-ICDAR, yielded outstanding results, with correct rates of 97.10% and 97.15%, respectively, which are significantly better than the best result reported thus far in the literature.Comment: 14 pages, 9 figure

    Automatic Structural Scene Digitalization

    Get PDF
    In this paper, we present an automatic system for the analysis and labeling of structural scenes, floor plan drawings in Computer-aided Design (CAD) format. The proposed system applies a fusion strategy to detect and recognize various components of CAD floor plans, such as walls, doors, windows and other ambiguous assets. Technically, a general rule-based filter parsing method is fist adopted to extract effective information from the original floor plan. Then, an image-processing based recovery method is employed to correct information extracted in the first step. Our proposed method is fully automatic and real-time. Such analysis system provides high accuracy and is also evaluated on a public website that, on average, archives more than ten thousands effective uses per day and reaches a relatively high satisfaction rate.Comment: paper submitted to PloS On

    Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition

    Full text link
    Handwritten mathematical expression recognition is a challenging problem due to the complicated two-dimensional structures, ambiguous handwriting input and variant scales of handwritten math symbols. To settle this problem, we utilize the attention based encoder-decoder model that recognizes mathematical expression images from two-dimensional layouts to one-dimensional LaTeX strings. We improve the encoder by employing densely connected convolutional networks as they can strengthen feature extraction and facilitate gradient propagation especially on a small training set. We also present a novel multi-scale attention model which is employed to deal with the recognition of math symbols in different scales and save the fine-grained details that will be dropped by pooling operations. Validated on the CROHME competition task, the proposed method significantly outperforms the state-of-the-art methods with an expression recognition accuracy of 52.8% on CROHME 2014 and 50.1% on CROHME 2016, by only using the official training dataset

    Unsupervised Extraction of Representative Concepts from Scientific Literature

    Full text link
    This paper studies the automated categorization and extraction of scientific concepts from titles of scientific articles, in order to gain a deeper understanding of their key contributions and facilitate the construction of a generic academic knowledgebase. Towards this goal, we propose an unsupervised, domain-independent, and scalable two-phase algorithm to type and extract key concept mentions into aspects of interest (e.g., Techniques, Applications, etc.). In the first phase of our algorithm we propose PhraseType, a probabilistic generative model which exploits textual features and limited POS tags to broadly segment text snippets into aspect-typed phrases. We extend this model to simultaneously learn aspect-specific features and identify academic domains in multi-domain corpora, since the two tasks mutually enhance each other. In the second phase, we propose an approach based on adaptor grammars to extract fine grained concept mentions from the aspect-typed phrases without the need for any external resources or human effort, in a purely data-driven manner. We apply our technique to study literature from diverse scientific domains and show significant gains over state-of-the-art concept extraction techniques. We also present a qualitative analysis of the results obtained.Comment: Published as a conference paper at CIKM 201

    SymbolDesign: A User-centered Method to Design Pen-based Interfaces and Extend the Functionality of Pointer Input Devices

    Full text link
    A method called "SymbolDesign" is proposed that can be used to design user-centered interfaces for pen-based input devices. It can also extend the functionality of pointer input devices such as the traditional computer mouse or the Camera Mouse, a camera-based computer interface. Users can create their own interfaces by choosing single-stroke movement patterns that are convenient to draw with the selected input device and by mapping them to a desired set of commands. A pattern could be the trace of a moving finger detected with the Camera Mouse or a symbol drawn with an optical pen. The core of the SymbolDesign system is a dynamically created classifier, in the current implementation an artificial neural network. The architecture of the neural network automatically adjusts according to the complexity of the classification task. In experiments, subjects used the SymbolDesign method to design and test the interfaces they created, for example, to browse the web. The experiments demonstrated good recognition accuracy and responsiveness of the user interfaces. The method provided an easily-designed and easily-used computer input mechanism for people without physical limitations, and, with some modifications, has the potential to become a computer access tool for people with severe paralysis.National Science Foundation (IIS-0093367, IIS-0308213, IIS-0329009, EIA-0202067
    • …
    corecore