Word shape analysis for a hybrid recognition system
This paper describes two wholistic recognizers developed for use in a hybrid recognition system. The recognizers use information about the word shape. This information is strongly related to word zoning. One of the recognizers is explicitly limited by the accuracy of the zoning information extraction. The other recognizer is designed to avoid this limitation. The recognizers use very simple sets of features and fuzzy set based pattern matching techniques. This is intended to increase their robustness, but it also causes problems with disambiguation of the results. A verification mechanism, using letter alternatives as compound features, is introduced. Letter alternatives are obtained from a segmentation based recognizer coexisting in the hybrid system. Despite some remaining disambiguation problems, wholistic recognizers are found capable of outperforming the segmentation based recognizer. When the recognizers work together in a hybrid system, the results are significantly higher than those of the individual recognizers. Recognition results are reported and compared.
Determining the Unithood of Word Sequences using Mutual Information and Independence Measure
Most works related to unithood were conducted as part of a larger effort for
the determination of termhood. Consequently, the number of independent studies
that examine the notion of unithood and produce dedicated techniques for
measuring unithood is extremely small. We propose a new approach, independent
of any influences of termhood, that provides dedicated measures to gather
linguistic evidence from parsed text and statistical evidence from the Google
search engine for the measurement of unithood. Our evaluations revealed
precision and recall of 98.68% and 91.82%, respectively, with an accuracy of
95.42% in measuring the unithood of 1005 test cases.
Comment: More information is available at
http://explorer.csse.uwa.edu.au/reference
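As a rough illustration of how statistical evidence from search-engine counts can feed a unithood score, the sketch below computes generic pointwise mutual information (PMI) for a two-word sequence. This is not the authors' measure, which the abstract does not spell out. The function name `pmi`, its parameters, and the idea of dividing page counts by an assumed index size `total` are all hypothetical.

```python
import math


def pmi(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Pointwise mutual information (in bits) of two words.

    count_xy: hits for the two words as a phrase (assumed input)
    count_x, count_y: hits for each word alone (assumed input)
    total: assumed size of the indexed collection, used to turn counts
           into probability estimates
    """
    if min(count_xy, count_x, count_y) <= 0 or total <= 0:
        raise ValueError("all counts must be positive")
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    # Positive values mean the words co-occur more often than independence
    # would predict; zero means they look independent.
    return math.log2(p_xy / (p_x * p_y))
```

A score well above zero suggests the sequence behaves as a unit; where to set the threshold is a design choice the abstract leaves open.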
Neural Reranking for Named Entity Recognition
We propose a neural reranking system for named entity recognition (NER). The
basic idea is to leverage recurrent neural network models to learn
sentence-level patterns that involve named entity mentions. In particular,
given an output sentence produced by a baseline NER model, we replace all
entity mentions, such as "Barack Obama", with their entity types, such
as "PER". The resulting sentence patterns contain direct output
information, yet are less sparse without specific named entities. For example,
"PER was born in LOC" can be such a pattern. LSTM and CNN structures are
utilised for learning deep representations of such sentences for reranking.
Results show that our system can significantly improve the NER accuracies over
two different baselines, giving the best reported results on a standard
benchmark.
Comment: Accepted as regular paper by RANLP 201
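The pattern-building step described in the abstract above, replacing each entity mention with its type, can be sketched as follows. This covers only the preprocessing, not the LSTM/CNN reranker. The function name `to_pattern`, the `(start, end, label)` span format, and the sample sentence (including the token "Hawaii") are illustrative assumptions.

```python
from typing import List, Tuple


def to_pattern(tokens: List[str], spans: List[Tuple[int, int, str]]) -> str:
    """Collapse each entity span [start, end) into its type label.

    tokens: the sentence as a list of words
    spans: non-overlapping (start, end, label) triples from a baseline
           NER model (hypothetical format)
    """
    by_start = {start: (end, label) for start, end, label in spans}
    out = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            end, label = by_start[i]
            out.append(label)  # one label stands in for the whole mention
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii"]
spans = [(0, 2, "PER"), (5, 6, "LOC")]
print(to_pattern(tokens, spans))  # PER was born in LOC
```

Each candidate output of the baseline model would produce one such pattern, and a reranker would then score the patterns against each other.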
Multimedia information technology and the annotation of video
The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.
A Computer-Based Method to Improve the Spelling of Children with Dyslexia
In this paper we present a method which aims to improve the spelling of
children with dyslexia through playful and targeted exercises. In contrast to
previous approaches, our method does not use correct words or positive examples
to follow, but presents the child a misspelled word as an exercise to solve. We
created these training exercises on the basis of the linguistic knowledge
extracted from the errors found in texts written by children with dyslexia. To
test the effectiveness of this method in Spanish, we integrated the exercises
in a game for iPad, DysEggxia (Piruletras in Spanish), and carried out a
within-subject experiment. Over eight weeks, 48 children played either
DysEggxia or Word Search, which is another word game. We conducted tests and
questionnaires at the beginning of the study, after four weeks when the games
were switched, and at the end of the study. The children who played DysEggxia
for four weeks in a row made significantly fewer writing errors in the tests than
after playing Word Search for the same time. This provides evidence that
error-based exercises presented on a tablet help children with dyslexia improve
their spelling skills.
Comment: 8 pages, ASSETS'14, October 20-22, 2014, Rochester, NY, US