
    CNN training with graph-based sample preselection: application to handwritten character recognition

    In this paper, we present a study on sample preselection in large training data sets for CNN-based classification. To do so, we structure the input data set in a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, using two data sets of up to several hundred thousand images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a shallow, non-pretrained CNN model.
    Comment: 10 pages. Minor spelling corrections relative to v2. Accepted as an oral paper at the 13th IAPR International Workshop on Document Analysis Systems (DAS 2018).

    True to Character: Honoring the Intellectual Foundations of the Character Evidence Rule in Domestic Violence Prosecutions

    This article calls for a new character evidence rule allowing the admission of prior acts of abuse within the context of a current domestic violence prosecution. Section II discusses the history of domestic violence in America and explores the three ways that the law has condoned domestic violence, including implicit sanction through the effect of the character evidence rule. Section III examines the intellectual background of the character evidence ban. This section also explores the conflict between the character evidence rule and the law's recognition of domestic violence. Further, Section III demonstrates how the character evidence ban violates its underlying principles in the domestic violence context. Finally, Section III articulates a rationale for a new character evidence rule in the domestic violence context, a rule consistent with the rule's original intellectual underpinnings.

    Context Perception Parallel Decoder for Scene Text Recognition

    Scene text recognition (STR) methods have struggled to attain both high accuracy and fast inference speed. Autoregressive (AR) STR models use the previously recognized characters to decode the next character iteratively. This gives them superior accuracy, but the same iteration makes inference slow. Alternatively, parallel decoding (PD) STR models infer all the characters in a single decoding pass. They are faster at inference but less accurate, as it is difficult to build a robust recognition context in such a pass. In this paper, we first present an empirical study of AR decoding in STR. In addition to constructing a new AR model with the top accuracy, we find that the success of the AR decoder also lies in providing guidance on visual context perception, rather than in language modeling as claimed in existing studies. As a consequence, we propose the Context Perception Parallel Decoder (CPPD) to decode the character sequence in a single PD pass. CPPD devises a character counting module and a character ordering module. Given a text instance, the former infers the occurrence count of each character, while the latter deduces the character reading order and placeholders. Together with the character prediction task, they construct a context that robustly tells what the character sequence is and where the characters appear, closely mimicking the context conveyed by AR decoding. Experiments on both English and Chinese benchmarks demonstrate that CPPD models achieve highly competitive accuracy. Moreover, they run approximately 7x faster than their AR counterparts and are among the fastest recognizers. The code will be released soon.

    Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations

    Dialog act (DA) recognition is a task that has been widely explored over the years. Recently, most approaches to the task have explored different DNN architectures to combine the representations of the words in a segment and generate a segment representation that provides cues for intention. In this study, we explore means to generate more informative segment representations, not only by exploring different network architectures, but also by considering different token representations at the word, character, and functional levels. At the word level, in addition to the commonly used uncontextualized embeddings, we explore the use of contextualized representations, which provide information concerning word sense and segment structure. Character-level tokenization is important to capture intention-related morphological aspects that cannot be captured at the word level. Finally, the functional level provides an abstraction from words, which shifts the focus to the structure of the segment. We also explore approaches to enrich the segment representation with context information from the history of the dialog, both in terms of the classifications of the surrounding segments and the turn-taking history. This kind of information has already proved important for the disambiguation of DAs in previous studies. Nevertheless, we are able to capture additional information by considering a summary of the dialog history and a wider turn-taking context. By combining the best approaches at each step, we achieve results that surpass the previous state of the art on generic DA recognition on both SwDA and MRDA, two of the most widely explored corpora for the task. Furthermore, by considering both past and future context, simulating an annotation scenario, our approach achieves a performance similar to that of a human annotator on SwDA and surpasses it on MRDA.
    Comment: 38 pages, 7 figures, 9 tables, submitted to JAI

    Context sensitive optical character recognition using neural networks and hidden Markov models

    This thesis investigates a method for using contextual information in text recognition. It is based on the premise that, while reading, humans recognize words with missing or garbled characters by examining the surrounding characters and then selecting the appropriate character. The correct character is chosen based on an inherent knowledge of the language and spelling techniques, which can be modeled statistically. The approach taken by this thesis is to combine feature extraction techniques, neural networks and hidden Markov modeling. This method of character recognition involves a three-step process: pixel image preprocessing, neural network classification and context interpretation. Pixel image preprocessing applies a feature extraction algorithm to the original bitmapped images, producing a feature vector for each image that is input into a neural network. The neural network performs the initial classification of the characters by producing ten weights, one for each character. The magnitude of each weight is translated into the confidence the network has in that choice. The greater the magnitude and separation, the more confident the neural network is of a given choice. The output of the neural network is the input for a context interpreter. The context interpreter uses hidden Markov modeling (HMM) techniques to determine the most probable classification for all characters, based on the characters that precede each character and on character pair statistics. The HMMs are built using a priori knowledge of the language: a statistical description of the probabilities of digrams. Experimentation and verification of this method combine the development and use of a preprocessor program, a Cascade Correlation neural network and an HMM context interpreter program. Results from these experiments show that the neural network successfully classified 88.2 percent of the characters. Expanding this to the word level, 63 percent of the words were correctly identified. Adding the hidden Markov modeling improved the word recognition to 82.9 percent.