25,642 research outputs found
An Efficient Hidden Markov Model for Offline Handwritten Numeral Recognition
Traditionally, the performance of ocr algorithms and systems is based on the
recognition of isolated characters. When a system classifies an individual
character, its output is typically a character label or a reject marker that
corresponds to an unrecognized character. By comparing output labels with the
correct labels, the number of correct recognition, substitution errors
misrecognized characters, and rejects unrecognized characters are determined.
Nowadays, although recognition of printed isolated characters is performed with
high accuracy, recognition of handwritten characters still remains an open
problem in the research arena. The ability to identify machine printed
characters in an automated or a semi automated manner has obvious applications
in numerous fields. Since creating an algorithm with a one hundred percent
correct recognition rate is quite probably impossible in our world of noise and
different font styles, it is important to design character recognition
algorithms with these failures in mind so that when mistakes are inevitably
made, they will at least be understandable and predictable to the person
working with theComment: 6pages, 5 figure
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
Although fully generative models have been successfully used to model the
contents of text documents, they are often awkward to apply to combinations of
text data and document metadata. In this paper we propose a
Dirichlet-multinomial regression (DMR) topic model that includes a log-linear
prior on document-topic distributions that is a function of observed features
of the document, such as author, publication venue, references, and dates. We
show that by selecting appropriate features, DMR topic models can meet or
exceed the performance of several previously published topic models designed
for specific data.Comment: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty
in Artificial Intelligence (UAI2008
Dialogue Act Recognition via CRF-Attentive Structured Network
Dialogue Act Recognition (DAR) is a challenging problem in dialogue
interpretation, which aims to attach semantic labels to utterances and
characterize the speaker's intention. Currently, many existing approaches
formulate the DAR problem ranging from multi-classification to structured
prediction, which suffer from handcrafted feature extensions and attentive
contextual structural dependencies. In this paper, we consider the problem of
DAR from the viewpoint of extending richer Conditional Random Field (CRF)
structural dependencies without abandoning end-to-end training. We incorporate
hierarchical semantic inference with memory mechanism on the utterance
modeling. We then extend structured attention network to the linear-chain
conditional random field layer which takes into account both contextual
utterances and corresponding dialogue acts. The extensive experiments on two
major benchmark datasets Switchboard Dialogue Act (SWDA) and Meeting Recorder
Dialogue Act (MRDA) datasets show that our method achieves better performance
than other state-of-the-art solutions to the problem. It is a remarkable fact
that our method is nearly close to the human annotator's performance on SWDA
within 2% gap.Comment: 10 pages, 4figure
- …