
5 research outputs found

Hand Posture Classification and Recognition using the Modified Census Transform

Authors: Just Agnès, Marcel Sébastien, Rodriguez Yann
Publication date: 10/03/2006

Developing new techniques for human-computer interaction is challenging. Vision-based techniques have the advantage of being unobtrusive, and hands are a natural device for more intuitive interfaces. To use hands for interaction, however, they must first be recognized in images. In this paper, we apply to hand posture classification and recognition an approach that has been used successfully for face detection [Froba04]. The features are based on the Modified Census Transform and are illumination invariant. For both classification and recognition, a simple linear classifier is trained using a set of feature lookup tables. The experiments use a benchmark database in the field of posture recognition, for which two protocols have been defined. We report results under both protocols for the classification and recognition tasks. The results are very encouraging.

Infoscience - École polytechnique fédérale de Lausanne
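
The abstract above names the Modified Census Transform as the feature but does not spell it out. As a rough illustration only, the sketch below uses the commonly described form of the transform: each pixel of a 3x3 neighbourhood is compared with the neighbourhood mean, giving a 9-bit index. The function name, the toy image and the bit ordering are assumptions made for this example and are not taken from the paper; the lookup-table classifier is not shown.

```python
def modified_census_transform(image):
    """Return the 9-bit MCT index for every interior pixel of a 2D grayscale image.

    `image` is a list of rows of integer intensities (hypothetical input format).
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            # 3x3 neighbourhood, read row by row, centre pixel included
            patch = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            total = sum(patch)
            index = 0
            for value in patch:
                # value > mean  <=>  9 * value > total (avoids floating point)
                index = (index << 1) | (1 if 9 * value > total else 0)
            row.append(index)
        result.append(row)
    return result


# Toy 4x4 image: a bright 2x2 square on a dark background (made-up data)
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
codes = modified_census_transform(image)

# A gain-and-offset change in brightness leaves the codes unchanged
brighter = [[2 * v + 10 for v in row] for row in image]
assert modified_census_transform(brighter) == codes
```

Because every bit depends only on whether a pixel lies above its local mean, scaling the intensities by a positive gain and adding an offset does not alter the index, which is one way to read the "illumination invariant" claim.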


Multimodal biometrics score level fusion using non-confidence information

Author: Chaw Poh C
Publication date: 01/01/2011

Multimodal biometrics refers to automatic authentication methods that depend on multiple modalities of measurable physical characteristics. It alleviates most of the restrictions of single-modality biometrics. Three categories of fusion approaches are available for combining multimodal biometric scores: rule-based, classification-based and density-based. When choosing an approach, one has to consider not only fusion performance but also system requirements and other circumstances. In the context of verification, classification errors arise from samples in the overlapping region (the non-confidence region) between genuine users and impostors. In score space, further separating the samples outside the non-confidence region does not improve verification further. Therefore, information contained in the non-confidence region might be useful for improving the fusion process. To date, no attempt to enhance the fusion process using this additional information has been reported in the literature. In this work, the use of this information is explored in the rule-based and density-based approaches.

Nottingham Trent Institutional Repository (IRep)
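
To make the terms in the abstract above concrete, here is a minimal sketch of rule-based score fusion (the mean, or sum, rule over min-max normalized scores) together with one simple way to delimit a non-confidence region: the interval between the lowest genuine score and the highest impostor score. All function names and score values are hypothetical, and the thesis may define the region and exploit it quite differently.

```python
def min_max_normalize(score, low, high):
    """Map a raw matcher score into [0, 1], given that matcher's score range."""
    return (score - low) / (high - low)


def sum_rule(scores):
    """Rule-based fusion: average the normalized scores from each modality."""
    return sum(scores) / len(scores)


def non_confidence_region(genuine_scores, impostor_scores):
    """Return (low, high) of the overlap between the two score sets, or None.

    Assumes higher scores mean "more likely genuine" (an assumption of this sketch).
    """
    low = min(genuine_scores)
    high = max(impostor_scores)
    return (low, high) if low <= high else None


# Made-up fused scores for known genuine users and impostors
genuine = [0.55, 0.70, 0.90]
impostor = [0.10, 0.40, 0.60]
region = non_confidence_region(genuine, impostor)

# Fuse a hypothetical face score (range 0..100) and fingerprint score (range 0..1)
fused = sum_rule([min_max_normalize(62, 0, 100), 0.54])
in_region = region is not None and region[0] <= fused <= region[1]
```

A fused score that falls inside the region is one that a single threshold cannot classify reliably, which is the kind of sample the abstract singles out for extra attention.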

Visual classification of co-verbal gestures for gesture understanding

Author: Campbell Lee Winston
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2001

Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2001. Includes bibliographical references (leaves 86-92).

A person's communicative intent can be better understood by either a human or a machine if the person's gestures are understood. This thesis project demonstrates an expansion of both the range of co-verbal gestures a machine can identify and the range of communicative intents the machine can infer. We develop an automatic system that takes real-time video as sensory input and then segments, classifies, and responds to co-verbal gestures made by users in real time as they converse with a synthetic character known as REA, which is being developed in parallel by Justine Cassell and her students at the MIT Media Lab. A set of 670 natural gestures, videotaped and visually tracked in the course of conversational interviews and then hand-segmented and annotated according to a widely used gesture classification scheme, is used in an offline process that trains Hidden Markov Model classifiers. A number of feature sets are extracted and tested in the offline training process, and the best performer is employed in an online HMM segmenter and classifier that requires no encumbering attachments to the user. Modifications made to the REA system enable REA to respond to the user's beat and deictic gestures as well as to turn-taking requests the user may convey in gesture. The recognition results obtained are far above chance, but too low for use in a production recognition system. The results provide a measure of validity for the gesture categories chosen, and they provide positive evidence for an appealing but difficult-to-prove proposition: to the extent that a machine can recognize and use these categories of gestures to infer information not present in the words spoken, there is exploitable complementary information in the gesture stream.

DSpace@MIT
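
The abstract above mentions Hidden Markov Model classifiers without giving details. A generic way to classify with HMMs is to keep one model per class and pick the class whose model assigns the observed sequence the highest likelihood, computed with the forward algorithm. The sketch below does this for toy discrete HMMs; the two models, their parameters and the symbol encoding (0 = hand still, 1 = hand moving) are invented for illustration and are not the thesis's features or trained models.

```python
def forward_likelihood(observations, start, transition, emission):
    """P(observations | model) for a discrete HMM, via the forward algorithm.

    No scaling is applied, so this is only suitable for short sequences.
    """
    n_states = len(start)
    alpha = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            emission[s][symbol]
            * sum(alpha[p] * transition[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(observations, models):
    """Return the label of the model with the highest sequence likelihood."""
    return max(
        models,
        key=lambda label: forward_likelihood(observations, *models[label]),
    )


# Shared emission table: state 0 mostly emits "still", state 1 mostly "moving"
emission = [[0.9, 0.1], [0.1, 0.9]]
models = {
    # Hypothetical "beat" model: states tend to alternate
    "beat": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], emission),
    # Hypothetical "deictic" model: states tend to persist
    "deictic": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emission),
}

alternating = classify([0, 1, 0, 1], models)
sustained = classify([1, 1, 1, 1], models)
```

In this toy setup the alternating sequence is assigned to the "beat" model and the sustained one to the "deictic" model. A practical system would typically work with log-likelihoods or scaled forward variables to avoid underflow on long sequences.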

Face and Hand Gesture Recognition Using Hybrid Classifiers

Authors: Harry Wechsler, Ibrahim F. Imam, Jeffrey Huang, Srinivas Gutta
Publication venue: IEEE Press

This paper advances the methodology of hybrid classification architectures for face and hand gesture recognition tasks and shows their feasibility through experimental studies using the FERET database and gesture images. The hybrid architecture, consisting of an ensemble of connectionist networks - radial basis functions (RBF) - and inductive decision trees (DT), combines the merits of 'holistic' template matching with those of 'abstractive' matching using discrete features and subject to both positive and negative learning. The hybrid architecture, quite general as it applies to both face and hand gesture recognition, derives its robustness from (i) consensus using ensembles of RBF networks, and (ii) flexible matching using categorical classification via decision trees. The experimental results, proving the feasibility of our approach, yield (i) 93% accuracy, using cross validation, for contents-based image retrieval (CBIR) subject to correct ID matching tasks, such as 'find Joe Smi..

CiteSeerX

Computergestützte Inhaltsanalyse von digitalen Videoarchiven (Computer-Assisted Content Analysis of Digital Video Archives)

Author: Kopf Stephan
Publication venue: Universität Mannheim
Publication date: 01/01/2006

The transition from analog to digital video has brought major changes to film archives in recent years. Digitizing the films in particular opens up new possibilities for the archives. Wear and aging of the film reels are ruled out, so the quality is preserved unchanged. In addition, network-based and therefore much simpler access to the videos in the archives becomes possible. Additional services are available to archivists and users that provide extended search capabilities and ease navigation during playback. Searching within video archives relies on metadata that provide further information about the videos. A large part of the metadata is entered manually by archivists, which is time-consuming and costly. Computer-assisted analysis of a digital video makes it possible to reduce the effort of generating metadata for video archives. The first part of this dissertation presents new methods for recognizing important semantic content in videos. In particular, newly developed algorithms are presented for cut detection, camera motion analysis, object segmentation and classification, text recognition, and face recognition. The automatically extracted semantic information is very valuable because it eases work with digital video archives. The information not only supports search in the archives but also leads to new applications, which are presented in the second part of the dissertation. For example, computer-generated summaries of videos can be produced, or videos can be adapted automatically to the characteristics of a playback device. A further focus of this dissertation is the analysis of historical films. Four European film archives provided a large number of historical video documentaries that were shot in the early to middle part of the last century and digitized in recent years. Owing to decades of storage and wear of the film reels, many videos are very noisy and contain clearly visible image defects. The image quality of the historical black-and-white films differs significantly from that of current videos, so reliable analysis with existing methods is often not possible. This dissertation presents new algorithms that enable reliable recognition of semantic content in historical videos as well.

MAnnheim DOCument Server