Word shape analysis for a hybrid recognition system

Authors: Powalka RK, Sherkat N, Whitrow RJ
Publication venue: Elsevier (not including Cell Press)
Publication date: 05/04/2003

This paper describes two wholistic recognizers developed for use in a hybrid recognition system. The recognizers use information about the word shape, which is strongly related to word zoning. One of the recognizers is explicitly limited by the accuracy of the zoning information extraction; the other is designed to avoid this limitation. The recognizers use very simple sets of features and fuzzy set based pattern matching techniques. This aims to increase their robustness, but it also causes problems with disambiguation of the results. A verification mechanism, using letter alternatives as compound features, is introduced. Letter alternatives are obtained from a segmentation based recognizer coexisting in the hybrid system. Despite some remaining disambiguation problems, the wholistic recognizers are found capable of outperforming the segmentation based recognizer. When the recognizers work together in a hybrid system, the results are significantly higher than those of the individual recognizers. Recognition results are reported and compared.

Nottingham Trent Institutional Repository (IRep)

Determining the Unithood of Word Sequences using Mutual Information and Independence Measure

Authors: Bennamoun Mohammed, Liu Wei, Wong Wilson
Publication date: 07/02/2008

Most works related to unithood were conducted as part of a larger effort for the determination of termhood. Consequently, the number of independent studies that examine the notion of unithood and produce dedicated techniques for measuring it is extremely small. We propose a new approach, independent of any influences of termhood, that provides dedicated measures to gather linguistic evidence from parsed text and statistical evidence from the Google search engine for the measurement of unithood. Our evaluations revealed a precision and recall of 98.68% and 91.82% respectively, with an accuracy of 95.42%, in measuring the unithood of 1005 test cases.

Comment: More information is available at http://explorer.csse.uwa.edu.au/reference

arXiv.org e-Print Archive

CiteSeerX

Neural Reranking for Named Entity Recognition

Authors: Dong Fei, Yang Jie, Zhang Yue
Publication date: 17/07/2017

We propose a neural reranking system for named entity recognition (NER). The basic idea is to leverage recurrent neural network models to learn sentence-level patterns that involve named entity mentions. In particular, given an output sentence produced by a baseline NER model, we replace all entity mentions, such as "Barack Obama", with their entity types, such as "PER". The resulting sentence patterns contain direct output information, yet are less sparse without specific named entities. For example, "PER was born in LOC" can be such a pattern. LSTM and CNN structures are utilised for learning deep representations of such sentences for reranking. Results show that our system can significantly improve the NER accuracies over two different baselines, giving the best reported results on a standard benchmark.

Comment: Accepted as regular paper by RANLP 201

arXiv.org e-Print Archive

Crossref

Multimedia information technology and the annotation of video

Authors: Jong F.M.G. de, Smeulders A., Worring M.
Publication venue: Stichting Archiefpublicaties
Publication date: 01/01/2006

The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, an overload of data will cause a lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.

University of Twente Research Information

A Computer-Based Method to Improve the Spelling of Children with Dyslexia

Authors: Centre Creix, Clara Bayarri, Cookie Cloud, Luz Rello, Martin Pielot, Yolanda Otal
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2014

In this paper we present a method which aims to improve the spelling of children with dyslexia through playful and targeted exercises. In contrast to previous approaches, our method does not use correct words or positive examples to follow, but presents the child with a misspelled word as an exercise to solve. We created these training exercises on the basis of the linguistic knowledge extracted from the errors found in texts written by children with dyslexia. To test the effectiveness of this method in Spanish, we integrated the exercises in a game for iPad, DysEggxia (Piruletras in Spanish), and carried out a within-subject experiment. Over eight weeks, 48 children played either DysEggxia or Word Search, which is another word game. We conducted tests and questionnaires at the beginning of the study, after four weeks when the games were switched, and at the end of the study. The children who played DysEggxia for four weeks in a row made significantly fewer writing errors in the tests than after playing Word Search for the same time. This provides evidence that error-based exercises presented on a tablet help children with dyslexia improve their spelling skills.

Comment: 8 pages, ASSETS'14, October 20-22, 2014, Rochester, NY, US

arXiv.org e-Print Archive

CiteSeerX

Crossref