CNN training with graph-based sample preselection: application to handwritten character recognition
In this paper, we present a study on sample preselection in large training data sets for CNN-based classification. To do so, we structure the input data set in a network representation, namely the Relative Neighbourhood Graph, and then extract some vectors of interest. The proposed preselection method is evaluated in the context of handwritten character recognition, using two data sets containing up to several hundred thousand images. It is shown that the graph-based preselection can reduce the training data set without degrading the recognition accuracy of a shallow, non-pretrained CNN model.
Comment: Paper of 10 pages. Minor spelling corrections relative to v2. Accepted as an oral paper at the 13th IAPR International Workshop on Document Analysis Systems (DAS 2018).
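The Relative Neighbourhood Graph the abstract relies on has a simple definition: two samples are linked iff no third sample is closer to both of them than they are to each other. A minimal brute-force sketch follows; the border-edge criterion in `preselect` (keeping samples whose RNG neighbours carry a different label) is an assumption about which "vectors of interest" are extracted, not the paper's exact rule:

```python
import numpy as np

def rng_edges(X):
    """Brute-force Relative Neighbourhood Graph: samples p and q are
    linked iff no third sample r satisfies max(d(p,r), d(q,r)) < d(p,q)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    edges = []
    for p in range(n):
        for q in range(p + 1, n):
            others = (r for r in range(n) if r != p and r != q)
            if all(max(D[p, r], D[q, r]) >= D[p, q] for r in others):
                edges.append((p, q))
    return edges

def preselect(X, y):
    """Assumed preselection criterion: keep only samples that sit on a
    class border, i.e. have at least one RNG edge to another class."""
    keep = set()
    for p, q in rng_edges(X):
        if y[p] != y[q]:
            keep.update((p, q))
    return sorted(keep)
```

On four collinear points split into two classes, only the two samples flanking the class boundary survive, which is the intuition behind pruning "easy" interior samples before CNN training. The O(n^3) loop is for illustration only; a real run on hundreds of thousands of images would need a smarter construction.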
True to Character: Honoring the Intellectual Foundations of the Character Evidence Rule in Domestic Violence Prosecutions
This article calls for a new character evidence rule allowing the admission of prior acts of abuse within the context of a current domestic violence prosecution. Section II discusses the history of domestic violence in America and explores the three ways that the law has condoned domestic violence, including implicit sanction through the effect of the character evidence rule. Section III examines the intellectual background of the character evidence ban. This section also explores the conflict between the character evidence rule and the law's recognition of domestic violence. Further, Section III demonstrates how the character evidence ban violates its underlying principles in the domestic violence context. Finally, Section III articulates a rationale for a new character evidence rule in the domestic violence context -- a rule consistent with the rule's original intellectual underpinnings.
Context Perception Parallel Decoder for Scene Text Recognition
Scene text recognition (STR) methods have struggled to attain both high accuracy and fast inference speed. Autoregressive (AR)-based STR models use the previously recognized characters to decode the next character iteratively. They show superiority in terms of accuracy, but the inference speed is slow precisely because of this iteration. Alternatively, parallel decoding (PD)-based STR models infer all the characters in a single decoding pass. They have an advantage in inference speed but worse accuracy, as it is difficult to build a robust recognition context in such a pass. In this paper, we first present an empirical study of AR decoding in STR. In addition to constructing a new AR model with top accuracy, we find that the success of the AR decoder also lies in providing guidance on visual context perception, rather than in language modeling as claimed in existing studies. Consequently, we propose the Context Perception Parallel Decoder (CPPD) to decode the character sequence in a single PD pass. CPPD devises a character counting module and a character ordering module. Given a text instance, the former infers the occurrence count of each character, while the latter deduces the character reading order and placeholders. Together with the character prediction task, they construct a context that robustly tells what the character sequence is and where the characters appear, closely mimicking the context conveyed by AR decoding. Experiments on both English and Chinese benchmarks demonstrate that CPPD models achieve highly competitive accuracy. Moreover, they run approximately 7x faster than their AR counterparts and are among the fastest recognizers. The code will be released soon.
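The counting-and-ordering idea can be illustrated with a toy single-pass decoder. Everything here (the three head outputs, the thresholding, the way the count bounds the length) is an assumed simplification for intuition, not the CPPD architecture:

```python
import numpy as np

def cppd_style_decode(char_logits, count_logits, order_logits):
    """Toy single-pass decode (assumed simplification of CPPD's idea):
    - order_logits: per-position scores marking character slots vs.
      padding placeholders (the 'where');
    - count_logits: per-class occurrence counts, used here only to
      bound the decoded length (the 'how many');
    - char_logits: per-position class scores, decoded all at once
      with no autoregressive iteration (the 'what')."""
    valid = order_logits > 0                    # slots that hold characters
    length = min(int(valid.sum()), int(round(count_logits.sum())))
    chars = char_logits.argmax(axis=-1)         # one parallel pass
    return chars[valid][:length].tolist()
```

The point of the sketch is that the counting and ordering signals supply the sequence-level context that an AR decoder would otherwise build step by step, so all character positions can be resolved in parallel.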
Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations
Dialog act (DA) recognition is a task that has been widely explored over the
years. Recently, most approaches to the task explored different DNN
architectures to combine the representations of the words in a segment and
generate a segment representation that provides cues for intention. In this
study, we explore means to generate more informative segment representations, both by exploring different network architectures and by considering different token representations at the word, character, and functional levels. At the word level, in addition to the commonly
used uncontextualized embeddings, we explore the use of contextualized
representations, which provide information concerning word sense and segment
structure. Character-level tokenization is important to capture
intention-related morphological aspects that cannot be captured at the word
level. Finally, the functional level provides an abstraction from words, which
shifts the focus to the structure of the segment. We also explore approaches to
enrich the segment representation with context information from the history of
the dialog, both in terms of the classifications of the surrounding segments
and the turn-taking history. This kind of information has been proven important for the disambiguation of DAs in previous studies. Nevertheless, we
are able to capture additional information by considering a summary of the
dialog history and a wider turn-taking context. By combining the best
approaches at each step, we achieve results that surpass the previous
state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the
most widely explored corpora for the task. Furthermore, by considering both
past and future context, simulating an annotation scenario, our approach achieves
a performance similar to that of a human annotator on SwDA and surpasses it on
MRDA.
Comment: 38 pages, 7 figures, 9 tables, submitted to JAI
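The three token levels the abstract describes can be sketched as one concatenated segment vector. The pooling choices, the hashed character trigrams, and the use of POS tags as the "functional" abstraction are all assumptions made for illustration, not the paper's models:

```python
import numpy as np

def segment_representation(tokens, word_vecs, pos_tags, pos_vocab, n_char_bins=64):
    """Toy segment representation combining three token levels:
    - word level: mean of (pre-trained) word vectors;
    - character level: hashed bag of character trigrams, to capture
      morphological cues that word vectors miss;
    - functional level: bag of POS tags, abstracting away from the
      words to the structure of the segment (assumed proxy)."""
    word_part = np.mean([word_vecs[t] for t in tokens], axis=0)
    char_part = np.zeros(n_char_bins)
    for t in tokens:
        for i in range(len(t) - 2):
            char_part[hash(t[i:i + 3]) % n_char_bins] += 1
    func_part = np.zeros(len(pos_vocab))
    for tag in pos_tags:
        func_part[pos_vocab.index(tag)] += 1
    return np.concatenate([word_part, char_part, func_part])
```

In the study's setting this fixed vector would be produced by a learned network rather than by pooling, and it would then be enriched with dialog-history features (surrounding segment labels, turn-taking context) before classification.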
Context sensitive optical character recognition using neural networks and hidden Markov models
This thesis investigates a method for using contextual information in text recognition. It is based on the premise that, while reading, humans recognize words with missing or garbled characters by examining the surrounding characters and then selecting the appropriate character. The correct character is chosen based on an inherent knowledge of the language and spelling conventions, which can be modeled statistically. The approach taken by this thesis is to combine feature extraction techniques, neural networks, and hidden Markov modeling.

This method of character recognition involves a three-step process: pixel image preprocessing, neural network classification, and context interpretation. Pixel image preprocessing applies a feature extraction algorithm to the original bit-mapped images, producing a feature vector for each image, which is input into a neural network. The neural network performs the initial classification of the characters by producing ten weights, one for each character. The magnitude of a weight is translated into the confidence the network has in that choice: the greater the magnitude and separation, the more confident the neural network is of a given choice. The output of the neural network is the input for a context interpreter. The context interpreter uses hidden Markov modeling (HMM) techniques to determine the most probable classification for all characters, based on the characters that precede each character and character-pair statistics. The HMMs are built using a priori knowledge of the language: a statistical description of the probabilities of digrams.

Experimentation and verification of this method combine the development and use of a preprocessor program, a Cascade-Correlation neural network, and an HMM context interpreter program. Results from these experiments show that the neural network successfully classified 88.2 percent of the characters. Expanding this to the word level, 63 percent of the words were correctly identified. Adding the hidden Markov modeling improved the word recognition rate to 82.9 percent.
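The context-interpretation step described above, choosing the most probable character sequence from the network's confidences plus digram statistics, is naturally expressed as Viterbi decoding over an HMM. The abstract does not name the exact algorithm, so the following is an assumed form, written generically over K character classes:

```python
import numpy as np

def viterbi_context(obs_probs, digram, prior):
    """Assumed HMM context interpreter: pick the most probable
    character sequence given the neural network's per-character
    confidences (obs_probs, shape T x K), digram transition
    probabilities (K x K), and a prior over the first character."""
    T, K = obs_probs.shape
    delta = np.log(prior) + np.log(obs_probs[0])   # best log-score per state
    back = np.zeros((T, K), dtype=int)             # backpointers
    for t in range(1, T):
        scores = delta[:, None] + np.log(digram)   # K x K: prev -> current
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(obs_probs[t])
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

With self-favoring digram probabilities, a single low-confidence observation in the middle of a sequence is overridden by its context, which is exactly the word-level correction behavior the reported 63 to 82.9 percent improvement reflects.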