Search CORE

515,864 research outputs found

Automatic recognition of fingerspelled words in British Sign Language

Author: Everingham M.
Liwicki S.
Publication venue
Publication date: 01/01/2009
Field of study

We investigate the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet. This is a challenging task since the BSL alphabet involves both hands occluding each other, and contains signs which are ambiguous from the observer’s viewpoint. The main contributions of our work include: (i) recognition based on hand shape alone, not requiring motion cues; (ii) robust visual features for hand shape recognition; (iii) scalability to large lexicon recognition with no re-training. We report results on a dataset of 1,000 low quality webcam videos of 100 words. The proposed method achieves a word recognition accuracy of 98.9%

CiteSeerX

Crossref

White Rose Research Online

Recommended from our members

Word shape analysis for a hybrid recognition system

Author: Powalka RK
Sherkat N
Whitrow RJ
Publication venue: Elsevier (not including Cell Press)
Publication date: 05/04/2003
Field of study

This paper describes two wholistic recognizers developed for use in a hybrid recognition system. The recognizers use information about the word shape. This information is strongly related to word zoning. One of the recognizers is explicitly limited by the accuracy of the zoning information extraction. The other recognizer is designed so as to avoid this limitation. The recognizers use very simple sets of features and fuzzy set based pattern matching techniques. This not only aims to increase their robustness, but also causes problems with disambiguation of the results. A verification mechanism, using letter alternatives as compound features, is introduced. Letter alternatives are obtained from a segmentation based recognizer coexisting in the hybrid system. Despite some remaining disambiguation problems, wholistic recognizers are found capable of outperforming the segmentation based recognizer. When working together in a hybrid system, the results are significantly higher than that of the individual recognizers. Recognition results are reported and compared

Nottingham Trent Institutional Repository (IRep)

Object Recognition Based on Shape and Function: Inspired by Children\u27s Word Acquisition

Author: Eguchi Akihiro
Publication venue: ScholarWorks@UARK
Publication date: 01/01/2012
Field of study

This paper explores a new approach to computational object recognition by borrowing an idea from child language acquisition studies in developmental psychology. Whereas previous image recognition research used shape to recognize and label a target object, the model proposed in this study added the function of the object resulting in a more accurate recognition. This study makes use of new gaming technology, Microsoft’s Kinect, in implementing the proposed new object recognition model. A demonstration of the model developed in this project properly infers different names for similarly shaped objects and the same name for differently shaped objects

ScholarWorks@UARK

Object Recognition Based on Shape and Function: Inspired by Children\u27s Word Acquisition

Author: Eguchi Akihiro
Publication venue: ScholarWorks@UARK
Publication date: 01/01/2012
Field of study

UARK (University of Arkansas )

A Computational Theory of Contextual Knowledge in Machine Reading

Author: Hanlon Stephen James
Publication venue: University of Leeds
Publication date: 01/09/1994
Field of study

Machine recognition of off–line handwriting can be achieved by either recognising words as individual symbols (word level recognition) or by segmenting a word into parts, usually letters, and classifying those parts (letter level recognition). Whichever method is used, current handwriting recognition systems cannot overcome the inherent ambiguity in writingwithout recourse to contextual information. This thesis presents a set of experiments that use Hidden Markov Models of language to resolve ambiguity in the classification process. It goes on to describe an algorithm designed to recognise a document written by a single–author and to improve recognition by adaptingto the writing style and learning new words. Learning and adaptation is achieved by reading the document over several iterations. The algorithm is designed to incorporate contextual processing, adaptation to modify the shape of known words and learning of new words within a constrained dictionary. Adaptation occurs when a word that has previously been trained in the classifier is recognised at either the word or letter level and the word image is used to modify the classifier. Learning occurs when a new word that has not been in the training set is recognised at the letter level and is subsequently added to the classifier. Words and letters are recognised using a nearest neighbour classifier and used features based on the two–dimensional Fourier transform. By incorporating a measure of confidence based on the distribution of training points around an exemplar, adaptation and learning is constrained to only occur when a word is confidently classified. The algorithm was implemented and tested with a dictionary of 1000 words. Results show that adaptation of the letter classifier improved recognition on average by 3.9% with only 1.6% at the whole word level. Two experiments were carried out to evaluate the learning in the system. It was found that learning accounted for little improvement in the classification results and also that learning new words was prone to misclassifications being propagated

White Rose E-theses Online

Exploiting multimedia content : a machine learning based approach

Author: Hassa Ehtesham
Publication venue: 'Universitat Autonoma de Barcelona'
Publication date: 01/01/2014
Field of study

Advisors: Prof. M Gopal, Prof. Santanu Chaudhury. Date and location of PhD thesis defense: 10 September 2013, Indian Institute of Technology DelhiThis thesis explores use of machine learning for multimedia content management involving single/multiple features, modalities and concepts. We introduce shape based feature for binary patterns and apply it for recognition and retrieval application in single and multiple feature based architecture. The multiple feature based recognition and retrieval frameworks are based on the theory of multiple kernel learning (MKL). A binary pattern recognition framework is presented by combining the binary MKL classifiers using a decision directed acyclic graph. The evaluation is shown for Indian script character recognition, and MPEG7 shape symbol recognition. A word image based document indexing framework is presented using the distance based hashing (DBH) defined on learned pivot centres. We use a new multi-kernel learning scheme using a Genetic Algorithm for developing a kernel DBH based document image retrieval system. The experimental evaluation is presented on document collections of Devanagari, Bengali and English scripts. Next, methods for document retrieval using multi-modal information fusion are presented. Text/Graphics segmentation framework is presented for documents having a complex layout. We present a novel multi-modal document retrieval framework using the segmented regions. The approach is evaluated on English magazine pages. A document script identification framework is presented using decision level aggregation of page, paragraph and word level prediction. Latent Dirichlet Allocation based topic modelling with modified edit distance is introduced for the retrieval of documents having recognition inaccuracies. A multi-modal indexing framework for such documents is presented by a learning based combination of text and image based properties. Experimental results are shown on Devanagari script documents. Finally, we have investigated concept based approaches for multimedia analysis. A multi-modal document retrieval framework is presented by combining the generative and discriminative modelling for exploiting the cross-modal correlation between modalities. The combination is also explored for semantic concept recognition using multi-modal components of the same document, and different documents over a collection. An experimental evaluation of the framework is shown for semantic event detection in sport videos, and semantic labelling of components of multi-modal document images

Crossref

Directory of Open Access Journals

Electronic Letters on Computer Vision and Image Analysis (ELCVIA - Universitat Autònoma de Barcelona)

Diposit Digital de Documents de la UAB