960 research outputs found

Linguistically-driven framework for computationally efficient and scalable sign recognition

Author: Dilsizian Mark
Metaxas Dimitris N.
Neidle Carol
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the-art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use CRFs to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL).

Boston University Institutional Repository (OpenBU)

A system for learning statistical motion patterns

Author: Fu Z.
Hu W.
Maybank Stephen J.
Tan T.
Xiao X.
Xie D.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2006

Analysis of motion patterns is an effective approach for anomaly detection and behavior prediction. Current approaches for the analysis of motion patterns depend on known scenes, where objects move in predefined ways. It is highly desirable to automatically construct object motion patterns which reflect the knowledge of the scene. In this paper, we present a system for automatically learning motion patterns for anomaly detection and behavior prediction based on a proposed algorithm for robustly tracking multiple objects. In the tracking algorithm, foreground pixels are clustered using a fast, accurate fuzzy k-means algorithm. Growing and prediction of the cluster centroids of foreground pixels ensure that each cluster centroid is associated with a moving object in the scene. In the algorithm for learning motion patterns, trajectories are clustered hierarchically using spatial and temporal information, and then each motion pattern is represented with a chain of Gaussian distributions. Based on the learned statistical motion patterns, statistical methods are used to detect anomalies and predict behaviors. Our system is tested using image sequences acquired, respectively, from a crowded real traffic scene and a model traffic scene. Experimental results show the robustness of the tracking algorithm, the efficiency of the algorithm for learning motion patterns, and the encouraging performance of algorithms for anomaly detection and behavior prediction.
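
The clustering step mentioned in the abstract can be illustrated with a minimal sketch of the standard fuzzy c-means update (often called fuzzy k-means). This is not the paper's "fast accurate" variant, whose details are not given in this listing. The function names, the fuzzifier m = 2, the fixed iteration count, and the toy pixel coordinates are all assumptions made for illustration.

```python
import math

def memberships(points, centers, m=2.0):
    """Return u[i][k]: the soft membership of point i in cluster k (each row sums to 1)."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        # Clamp distances so a point lying exactly on a center does not divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        u.append([1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists])
    return u

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and centroid updates, starting from the given centers."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            # Each centroid is the membership-weighted mean of all points.
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    # Recompute memberships so they match the final centers.
    return centers, memberships(points, centers, m)

# Hypothetical foreground pixel coordinates forming two well-separated blobs.
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fuzzy_c_means(pixels, centers=[(0.0, 0.0), (10.0, 10.0)])
```

Unlike hard k-means, every pixel contributes to every centroid, weighted by its membership, which is one reason soft assignments are sometimes preferred for noisy foreground masks.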

CiteSeerX

Crossref

Birkbeck Institutional Research Online

A system for learning statistical motion patterns

Author: Fu Z.
Hu W.
Maybank Stephen J.
Tan T.
Xiao X.
Xie D.
Publication venue: IEEE Computer Society
Publication date: 01/01/2006

CiteSeerX

Crossref

Southampton (e-Prints Soton)

Birkbeck Institutional Research Online

Detection of major ASL sign types in continuous signing for ASL recognition

Author: Metaxas Dimitris
Neidle Carol
Yanovich Polina
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2016

In American Sign Language (ASL) as well as other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper we present a multiple instance learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion. The system does not require a hand tracker.

Boston University Institutional Repository (OpenBU)

Continuous Action Recognition Based on Sequence Alignment

Author: Cech Jan
Evangelidis Georgios
Horaud Radu
Kulkarni Kaustubh
Publication venue: Springer Science and Business Media LLC
Publication date: 02/06/2014

Continuous action recognition is more challenging than isolated recognition because classification and segmentation must be carried out simultaneously. We build on the well-known dynamic time warping (DTW) framework and devise a novel visual alignment technique, namely dynamic frame warping (DFW), which performs isolated recognition based on a per-frame representation of videos and on aligning a test sequence with a model sequence. Moreover, we propose two extensions that enable recognition to be performed concomitantly with segmentation, namely one-pass DFW and two-pass DFW. These two methods have their roots in the domain of continuous speech recognition and, to the best of our knowledge, their extension to continuous visual action recognition has been overlooked. We test and illustrate the proposed techniques with a recently released dataset (RAVEL) and with two public-domain datasets widely used in action recognition (Hollywood-1 and Hollywood-2). We also compare the performance of the proposed isolated and continuous recognition algorithms with that of several recently published methods.
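
For readers unfamiliar with the DTW framework that the abstract builds on, here is a minimal sketch of classic dynamic time warping between two sequences of per-frame feature vectors. It is not the paper's DFW or its one-pass and two-pass extensions. The function name, the Euclidean frame distance, and the toy sequences are assumptions made for illustration.

```python
import math

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame-to-frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeats against the same frame of b
                cost[i][j - 1],      # frame of b repeats against the same frame of a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]

# Hypothetical 1-D "frames": the second sequence is a time-stretched copy of the first.
model = [(0.0,), (1.0,), (2.0,)]
test = [(0.0,), (1.0,), (1.0,), (2.0,)]
stretched_cost = dtw_distance(model, test)
```

In isolated recognition, a test sequence would typically be assigned the label of the model sequence with the lowest alignment cost; the table costs O(n·m) time and memory per pair.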

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Scalable ASL sign recognition using model-based machine learning and linguistically annotated corpora

Author: Dilsizian Mark
Metaxas Dimitris
Neidle Carol
Publication venue: European Language Resources Association (ELRA)
Publication date: 12/05/2018

We report on the high success rates of our new, scalable, computational approach for sign recognition from monocular video, exploiting linguistically annotated ASL datasets with multiple signers. We recognize signs using a hybrid framework combining state-of-the-art learning methods with features based on what is known about the linguistic composition of lexical signs. We model and recognize the sub-components of sign production, with attention to hand shape, orientation, location, motion trajectories, plus non-manual features, and we combine these within a CRF framework. The effect is to make the sign recognition problem robust, scalable, and feasible with smaller datasets than are required for purely data-driven methods. From a 350-sign vocabulary of isolated, citation-form lexical signs from the American Sign Language Lexicon Video Dataset (ASLLVD), including both 1- and 2-handed signs, we achieve a top-1 accuracy of 93.3% and a top-5 accuracy of 97.9%. The high probability with which we can produce 5 sign candidates that contain the correct result opens the door to potential applications, as it is reasonable to provide a sign lookup functionality that offers the user 5 possible signs, in decreasing order of likelihood, with the user then asked to select the desired sign.
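
The top-1 and top-5 figures above follow the usual top-k accuracy definition: a prediction counts as correct when the true sign appears among the k most likely candidates. A minimal sketch follows; the function name and the toy glosses are hypothetical and are not taken from ASLLVD.

```python
def top_k_accuracy(ranked_candidates, true_labels, k):
    """Fraction of examples whose true label is among the first k ranked candidates."""
    hits = sum(
        1
        for candidates, truth in zip(ranked_candidates, true_labels)
        if truth in candidates[:k]
    )
    return hits / len(true_labels)

# Hypothetical recognizer output: candidates per example, most likely first.
ranked = [
    ["BOOK", "READ", "PAPER"],
    ["CAT", "DOG", "COW"],
    ["EAT", "FOOD", "DRINK"],
]
truth = ["BOOK", "DOG", "DRINK"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
```

A sign lookup tool of the kind the abstract describes would show `candidates[:5]` to the user, so top-5 accuracy estimates how often the desired sign is on the list offered.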

Boston University Institutional Repository (OpenBU)

Sign Language Tutoring Tool

Author: Akarun Lale
Aran Oya
Ari Ismail
Benoit Alexandre
Campr Pavel
Caplier Alice
Carrillo Ana Huerta
Fanard François-Xavier
Rombaut Michele
Sankur Bulent
Publication venue
Publication date: 01/01/2007

In this project, we have developed a sign language tutor that lets users learn isolated signs by watching recorded videos and by trying the same signs. The system records the user's video and analyses it. If the sign is recognized, both verbal and animated feedback is given to the user. The system is able to recognize complex signs that involve both hand gestures and head movements and expressions. Our performance tests yield a 99% recognition rate on signs involving only manual gestures and an 85% recognition rate on signs that involve both manual and non-manual components, such as head movement and facial expressions. Comment: eNTERFACE'06 Summer Workshop on Multimodal Interfaces, Dubrovnik, Croatia (2007).

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes