Computational Models for the Automatic Learning and Recognition of Irish Sign Language
This thesis presents a framework for the automatic recognition of Sign Language
sentences. In previous sign language recognition work, the issues of
user-independent recognition, movement epenthesis modeling, and automatic
or weakly supervised training have not been fully addressed within a single recognition
framework. This work presents three main contributions that
address these issues.
The first contribution is a technique for user-independent hand posture
recognition. We present a novel eigenspace Size Function feature that
performs user-independent recognition of sign language hand
postures.
The second contribution is a framework for the classification and spotting
of spatiotemporal gestures which appear in sign language. We propose a
Gesture Threshold Hidden Markov Model (GT-HMM) to classify gestures
and to identify movement epenthesis without the need for explicit epenthesis
training.
The third contribution is a framework to train the hand posture and spatiotemporal
models using only the weak supervision of sign language videos
and their corresponding text translations. This is achieved through our proposed
Multiple Instance Learning Density Matrix algorithm which automatically
extracts isolated signs from full sentences using the weak and noisy
supervision of text translations. The automatically extracted isolated samples
are then utilised to train our spatiotemporal gesture and hand posture
classifiers.
The work presented in this thesis contributes
to the area of natural sign language recognition by proposing a
robust framework for training a recognition system without the need for
manual labeling.
Gesture Recognition and Control for Semi-Autonomous Robotic Assistant Surgeons
The next stage for robotics development is to introduce autonomy and cooperation with human agents in tasks that require high levels of precision and/or that exert considerable physical strain. To guarantee the highest possible safety standards, the best approach is to devise a deterministic automaton that performs identically for each operation. Clearly, such an approach inevitably fails to adapt to changing environments or different human companions. In a surgical scenario, the greatest variability lies in the timing of different actions performed within the same phases. This thesis explores the solutions adopted in pursuing automation in robotic minimally-invasive surgeries (R-MIS) and presents a novel cognitive control architecture. The architecture uses a multi-modal neural network, trained on a cooperative task performed by human surgeons, to produce an action segmentation that provides the required timing for actions. Full phase execution control is maintained by a deterministic Supervisory Controller, and full execution safety by a velocity-constrained Model-Predictive Controller.
Optimized Biosignals Processing Algorithms for New Designs of Human Machine Interfaces on Parallel Ultra-Low Power Architectures
The aim of this dissertation is to explore Human Machine Interfaces (HMIs) in a variety of biomedical scenarios. The research addresses typical challenges in wearable and implantable devices for diagnostic, monitoring, and prosthetic purposes, suggesting a methodology for tailoring such applications to cutting-edge embedded architectures.
The main challenge is the enhancement of high-level applications, including those that introduce Machine Learning (ML) algorithms, using parallel programming and specialized hardware to improve performance.
The majority of these algorithms are computationally intensive, posing significant challenges for deployment on embedded devices, which have several limitations in terms of memory size, maximum operating frequency, and battery duration.
The proposed solutions take advantage of a Parallel Ultra-Low Power (PULP) architecture, improving processing on specific target architectures and heavily optimizing execution by exploiting software and hardware resources.
The thesis starts by describing a methodology that can be considered a guideline to efficiently implement algorithms on embedded architectures.
This is followed by several case studies in the biomedical field. The first is an analysis of a Hand Gesture Recognition application based on the Hyperdimensional Computing algorithm, which allows fast on-chip re-training, together with a comparison against the state-of-the-art Support Vector Machine (SVM). The second is a Brain-Computer Interface (BCI) that detects the response of the brain to a visual stimulus. A seizure detection application is also presented, exploring different solutions for the dimensionality reduction of the input signals. The last part is dedicated to an exploration of typical modules for the development of optimized ECG-based applications.
Vision-based Portuguese sign language recognition system
Vision-based hand gesture recognition is an area of active research in computer vision and machine learning. Because gestures are a natural form of human interaction, many researchers are working in this area with the goal of making human computer interaction (HCI) easier and more natural, without the need for any extra devices. The primary goal of gesture recognition research is therefore to create systems that can identify specific human gestures and use them, for example, to convey information. To that end, vision-based hand gesture interfaces require fast and extremely robust hand detection, and gesture recognition in real time. Hand gestures are a powerful human communication modality with many potential applications, among them sign language recognition, sign language being the communication method of deaf people. Sign languages are not standard and universal, and their grammars differ from country to country. In this paper, a real-time system able to interpret the Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real time, with an accuracy of 99.4% with one dataset of features and an accuracy of 99.6% with a second dataset of features. Although the implemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, making it a solid foundation for the development of any vision-based sign language recognition user interface system.
Towards Developing an Effective Hand Gesture Recognition System for Human Computer Interaction: A Literature Survey
Gesture recognition is the mathematical analysis of the movement of body parts (hand / face) carried out with the help of a computing device. It helps computers to understand human body language and builds a more powerful link between humans and machines. Much research has been carried out in the field of hand gesture recognition. Each work has achieved a different recognition accuracy on a different hand gesture dataset; however, most firms lack sufficient insight to achieve comparable results on real-time datasets. Under such circumstances, it is essential to have a complete knowledge of hand gesture recognition methods, their strengths and weaknesses, and the development criteria as well. Many reports claim that their work is better, but a complete comparative analysis is lacking in these works. In this paper, we provide a study of representative techniques and recognition methods for hand gesture recognition, and also present a brief introduction to hand gesture recognition. The main objective of this work is to highlight the position of various recognition techniques, which can indirectly help in developing new techniques for solving the issues in hand gesture recognition systems. Moreover, we present a concise description of hand gesture recognition systems, their recognition methods, and directions for future research.
A robust and efficient video representation for action recognition
This paper introduces a state-of-the-art video representation and applies it
to efficient action recognition and detection. We first propose to improve the
popular dense trajectory features by explicit camera motion estimation. More
specifically, we extract feature point matches between frames using SURF
descriptors and dense optical flow. The matches are used to estimate a
homography with RANSAC. To improve the robustness of homography estimation, a
human detector is employed to remove outlier matches from the human body as
human motion is not constrained by the camera. Trajectories consistent with the
homography are considered as due to camera motion, and thus removed. We also
use the homography to cancel out camera motion from the optical flow. This
results in significant improvement on motion-based HOF and MBH descriptors. We
further explore the recent Fisher vector as an alternative feature encoding
approach to the standard bag-of-words histogram, and consider different ways to
include spatial layout information in these encodings. We present a large and
varied set of evaluations, considering (i) classification of short basic
actions on six datasets, (ii) localization of such actions in feature-length
movies, and (iii) large-scale recognition of complex events. We find that our
improved trajectory features significantly outperform previous dense
trajectories, and that Fisher vectors are superior to bag-of-words encodings
for video recognition tasks. In all three tasks, we show substantial
improvements over state-of-the-art results.
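The camera-motion step described in this abstract (removing trajectories that are consistent with the estimated homography) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the homography has already been estimated, reduces each trajectory to a single start and end point across two frames, and uses a hypothetical pixel threshold. The function names `apply_homography`, `residual_motion`, and `keep_foreground` are invented for this illustration.

```python
import math

def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def residual_motion(H, start, end):
    """Displacement left over once the homography-predicted position is subtracted."""
    px, py = apply_homography(H, start)
    return (end[0] - px, end[1] - py)

def keep_foreground(H, tracks, threshold=1.0):
    """Drop tracks whose motion the homography explains (assumed camera motion).

    The threshold (in pixels) is an assumption made for this sketch.
    """
    kept = []
    for start, end in tracks:
        dx, dy = residual_motion(H, start, end)
        if math.hypot(dx, dy) > threshold:
            kept.append((start, end))
    return kept

# Hypothetical example: the camera pans 5 px to the right between two frames.
H_pan = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tracks = [
    ((10.0, 10.0), (15.0, 10.0)),  # moves exactly with the camera
    ((20.0, 30.0), (25.2, 30.1)),  # camera motion plus small noise
    ((40.0, 40.0), (45.0, 48.0)),  # 8 px of extra vertical motion
]
foreground = keep_foreground(H_pan, tracks)
```

Only the third track survives the filter, because its residual after compensating for the pan is well above the threshold. The same residual is what remains when the homography is used to cancel camera motion from a flow vector.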
Communication error detection using facial expressions
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 129-135). Automatic detection of communication errors in conversational systems typically relies only on acoustic cues. However, perceptual studies have indicated that speakers do exhibit visual communication error cues passively during the system's conversational turn. In this thesis, we introduce novel algorithms for face and body gesture recognition and present the first automatic system for detecting communication errors using facial expressions during the system's turn. This is useful as it detects communication problems before the user speaks a reply. To detect communication problems accurately and efficiently, we develop novel extensions to hidden-state discriminative methods. We also present results showing that when human subjects become aware that the conversational system is capable of receiving visual input, they become more communicative visually yet naturally. By Sy Bor Wang. Ph.D.
Generic system for human-computer gesture interaction: applications on sign language recognition and robotic soccer refereeing
Hand gestures are a powerful means of human communication, with many potential applications in the area of human computer interaction. Vision-based hand gesture recognition techniques have many proven advantages compared with traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture based on computer vision and machine learning, able to be used with any interface for human-computer interaction. The proposed solution is mainly composed of three modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems could be the same for all applications and thus facilitate the implementation. For hand posture recognition, an SVM (Support Vector Machine) model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM (Hidden Markov Model) was trained for each gesture that the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic enough, with the trained models able to work in real time, allowing its application in a wide range of human-machine applications. To validate the proposed framework, two applications were implemented. The first one is a real-time system able to interpret the Portuguese Sign Language. The second one is an online system able to help a robotic soccer game referee judge a game in real time.
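The one-HMM-per-gesture scheme mentioned in this abstract is commonly used by scoring an observation sequence against every gesture model and picking the highest likelihood. The sketch below shows that classification step only, under stated assumptions: discrete observation symbols, hand-written toy parameters instead of trained ones, and hypothetical gesture names (`swipe_left`, `swipe_right`). It is not the system described in the abstract.

```python
import math

def log_likelihood(model, observations):
    """Scaled forward algorithm: log P(observations | model) for a discrete HMM."""
    if not observations:
        return 0.0
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [start[i] * emit[i][observations[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            # Induction: propagate through transitions, then emit symbol t.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][observations[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob

def classify(models, observations):
    """Return the name of the gesture model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(models[name], observations))

# Toy two-state models over two symbols (0 = leftward motion, 1 = rightward motion).
# All numbers are illustrative assumptions, not trained values.
models = {
    "swipe_left": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.9, 0.1], [0.8, 0.2]],
    },
    "swipe_right": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.1, 0.9], [0.2, 0.8]],
    },
}
label = classify(models, [1, 1, 1, 0])
```

Here `label` is `"swipe_right"`, since three of the four observed symbols are ones that model emits with high probability. In a real system the parameters would come from training (for example with Baum-Welch) on feature sequences extracted from video.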