Complexity-Aware Assignment of Latent Values in Discriminative Models for Accurate Gesture Recognition
Many of the state-of-the-art algorithms for gesture recognition are based on
Conditional Random Fields (CRFs). Successful approaches, such as the
Latent-Dynamic CRFs, extend the CRF by incorporating latent variables, whose
values are mapped to the values of the labels. In this paper we propose a novel
methodology to set the latent values according to the gesture complexity. We
use a heuristic that iterates through the samples associated with each label
value, estimating their complexity. We then use it to assign the latent values
to the label values. We evaluate our method on the task of recognizing human
gestures from video streams. The experiments were performed on binary datasets,
generated by grouping different labels. Our results demonstrate that our
approach outperforms the arbitrary one in many cases, increasing the accuracy
by up to 10%.
Comment: Conference paper published at the 2016 29th SIBGRAPI Conference on
Graphics, Patterns and Images (SIBGRAPI). 8 pages, 7 figures
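The idea of giving more latent values to more complex gestures can be illustrated with a minimal sketch. The abstract does not specify the heuristic, so everything here is an assumption made for illustration: the complexity proxy (mean per-feature variance), the proportional split, and the names `complexity_score` and `assign_latent_values` are hypothetical and not the paper's method. The only property borrowed from Latent-Dynamic CRFs is that each label owns a disjoint set of latent values.

```python
import numpy as np

def complexity_score(samples):
    """Hypothetical complexity proxy: mean per-feature variance over all frames.

    `samples` is a list of arrays shaped (n_frames, n_features).
    """
    frames = np.concatenate(samples, axis=0)
    return float(frames.var(axis=0).mean())

def assign_latent_values(samples_by_label, total_latent):
    """Split `total_latent` latent values across labels in proportion to complexity.

    Every label keeps at least one latent value. Returns {label: [latent ids]},
    with disjoint id ranges per label.
    """
    labels = sorted(samples_by_label)
    if total_latent < len(labels):
        raise ValueError("need at least one latent value per label")

    scores = np.array([complexity_score(samples_by_label[l]) for l in labels])
    extra = total_latent - len(labels)  # values left after one per label
    if scores.sum() > 0:
        shares = scores / scores.sum()
    else:
        shares = np.full(len(labels), 1.0 / len(labels))

    # Largest-remainder rounding so the counts sum exactly to `extra`.
    raw = shares * extra
    counts = np.floor(raw).astype(int)
    leftover = int(extra - counts.sum())
    order = np.argsort(-(raw - counts), kind="stable")
    counts[order[:leftover]] += 1
    counts += 1  # the guaranteed value for each label

    mapping, start = {}, 0
    for label, count in zip(labels, counts):
        mapping[label] = list(range(start, start + int(count)))
        start += int(count)
    return mapping

# Toy binary dataset: label "a" is constant, label "b" varies between frames.
toy = {
    "a": [np.zeros((5, 2))],
    "b": [np.array([[0.0, 0.0], [2.0, 2.0]])],
}
print(assign_latent_values(toy, total_latent=4))
```

In this toy case the constant label keeps a single latent value and the varying label receives the remaining three; an arbitrary assignment would instead split the four values evenly.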
Discriminative methods for classification of asynchronous imaginary motor tasks from EEG data
In this work, two methods based on statistical models that take into account the temporal changes in the electroencephalographic (EEG) signal are proposed for asynchronous brain-computer interfaces (BCI) based on imaginary motor tasks. Unlike the current approaches to asynchronous BCI systems that make use of windowed versions of the EEG data combined with static classifiers, the methods proposed here are based on discriminative models that allow sequential labeling of data. In particular, the two methods we propose for asynchronous BCI are based on conditional random fields (CRFs) and latent dynamic CRFs (LDCRFs), respectively. We describe how the asynchronous BCI problem can be posed as a classification problem based on CRFs or LDCRFs, by defining appropriate random variables and their relationships. The CRF allows modeling the extrinsic dynamics of data, making it possible to model the transitions between classes, which in this context correspond to distinct tasks in an asynchronous BCI system. The LDCRF, on the other hand, goes beyond this approach by incorporating latent variables that permit modeling the intrinsic structure of each class while still modeling the extrinsic dynamics. We apply our proposed methods to the publicly available BCI competition III dataset V as well as to a data set recorded in our laboratory. The results obtained are compared to the top algorithm in the BCI competition as well as to methods based on hierarchical hidden Markov models (HHMMs), hierarchical hidden CRF (HHCRF), neural networks based on particle swarm optimization (IPSONN) and to a recently proposed approach based on neural networks and fuzzy theory, the S-dFasArt. Our experimental analysis demonstrates the improvements provided by our proposed methods in terms of classification accuracy.
Review on Classification Methods used in Image based Sign Language Recognition System
Sign language is the means of communication among deaf and mute people, who express themselves through signs. This paper presents a review of sign language recognition systems, which aim to provide a means of communication for deaf and mute people. The review focuses on image-based sign language recognition systems. Signs take the form of hand gestures, and these gestures are identified from images as well as videos. Gestures are identified and classified according to the features of the gesture image, such as shape, rotation, angle, pixels, and hand movement. Features are found by various feature extraction methods and classified by various machine learning methods. The main purpose of this paper is to review the classification methods of similar systems used in image-based hand gesture recognition. The paper also compares various systems on the basis of classification method and accuracy rate.
Hand gesture spotting and recognition using HMMs and CRFs in color image sequences
Magdeburg, Univ., Faculty of Electrical Engineering and Information Technology, dissertation, 2010, by Mahmoud Othman Selim Mahmoud Elmezai
Communication error detection using facial expressions
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 129-135). Automatic detection of communication errors in conversational systems typically relies only on acoustic cues. However, perceptual studies have indicated that speakers do exhibit visual communication error cues passively during the system's conversational turn. In this thesis, we introduce novel algorithms for face and body gesture recognition and present the first automatic system for detecting communication errors using facial expressions during the system's turn. This is useful as it detects communication problems before the user speaks a reply. To detect communication problems accurately and efficiently we develop novel extensions to hidden-state discriminative methods. We also present results that show that when human subjects become aware that the conversational system is capable of receiving visual input, they become more communicative visually yet naturally. By Sy Bor Wang. Ph.D.
Learning Temporal Alignment Uncertainty for Efficient Event Detection
In this paper we tackle the problem of efficient video event detection. We
argue that linear detection functions should be preferred in this regard due to
their scalability and efficiency during estimation and evaluation. A popular
approach in this regard is to represent a sequence using a bag of words (BOW)
representation due to (i) its fixed dimensionality irrespective of the
sequence length, and (ii) its ability to compactly model the statistics in the
sequence. A drawback to the BOW representation, however, is the intrinsic
destruction of the temporal ordering information. In this paper we propose a
new representation that leverages the uncertainty in relative temporal
alignments between pairs of sequences while not destroying temporal ordering.
Our representation, like BOW, is of a fixed dimensionality making it easily
integrated with a linear detection function. Extensive experiments on CK+,
6DMG, and UvA-NEMO databases show significant performance improvements across
both isolated and continuous event detection tasks.
Comment: Appeared in DICTA 2015, 8 pages
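The trade-off the abstract attributes to the BOW representation (fixed dimensionality, but no temporal ordering) can be seen in a minimal sketch. It illustrates only the plain BOW baseline, not the representation the paper proposes. The token names and the `bag_of_words` helper are hypothetical; in practice the "words" would be quantized frame descriptors.

```python
from collections import Counter

def bag_of_words(sequence, vocabulary):
    """Return a fixed-length count vector, one entry per vocabulary word."""
    counts = Counter(sequence)
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary of quantized frame labels for a facial event.
vocabulary = ["neutral", "onset", "apex", "offset"]

smile = ["neutral", "onset", "apex", "offset"]
reversed_smile = list(reversed(smile))
long_smile = ["neutral", "onset", "onset", "apex", "apex", "apex", "offset"]

# Same dimensionality regardless of sequence length.
print(bag_of_words(smile, vocabulary))
print(bag_of_words(long_smile, vocabulary))

# Ordering is lost: a sequence and its reversal map to the same vector.
print(bag_of_words(smile, vocabulary) == bag_of_words(reversed_smile, vocabulary))
```

A linear detection function applied to these vectors cannot separate a sequence from its time-reversed copy, which is the loss of ordering information the abstract refers to.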