Generic system for human-computer gesture interaction: applications on sign language recognition and robotic soccer refereeing
Hand gestures are a powerful means of human communication, with many potential applications in human-computer interaction. Vision-based hand gesture recognition techniques have proven advantages over traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture, based on computer vision and machine learning, that can be used with any interface for human-computer interaction. The proposed solution comprises three main modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems can be the same for all applications and thus facilitate implementation. For hand posture recognition, an SVM (Support Vector Machine) model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM (Hidden Markov Model) was trained for each gesture the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic, with trained models able to work in real time, allowing its application in a wide range of human-machine applications. To validate the proposed framework, two applications were implemented. The first is a real-time system able to interpret the Portuguese Sign Language. The second is an online system able to help a robotic soccer game referee judge a game in real time.
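The SVM posture classifier described in the abstract could be trained along these lines. This is a minimal sketch, assuming precomputed per-frame feature vectors; the synthetic data, feature dimension and SVC hyperparameters below are stand-ins, not the authors' actual setup:

```python
# Sketch: training an SVM hand-posture classifier on feature vectors.
# The 20-D descriptors and class clusters are synthetic stand-ins for
# features extracted from segmented hand images.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 5, 40, 20
centers = rng.normal(scale=4.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(scale=0.5, size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)  # held-out posture recognition accuracy
```

In practice the reported 99.4% would come from real hand-shape features; the pipeline shape (extract features, train one multi-class SVM, score held-out frames) is the same.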
Generic system for human-computer gesture interaction
Hand gestures are a powerful means of human communication, with many potential applications in human-computer interaction. Vision-based hand gesture recognition techniques have proven advantages over traditional devices, giving users a simpler and more natural way to communicate with electronic devices. This work proposes a generic system architecture, based on computer vision and machine learning, that can be used with any interface for human-computer interaction. The proposed solution comprises three main modules: a pre-processing and hand segmentation module, a static gesture interface module and a dynamic gesture interface module. The experiments showed that the core of vision-based interaction systems can be the same for all applications and thus facilitate implementation. In order to test the proposed solutions, three prototypes were implemented. For hand posture recognition, an SVM model was trained and used, achieving a final accuracy of 99.4%. For dynamic gestures, an HMM model was trained for each gesture the system could recognize, with a final average accuracy of 93.7%. The proposed solution has the advantage of being generic, with trained models able to work in real time, allowing its application in a wide range of human-machine applications.
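The one-HMM-per-gesture scheme (train a model for each dynamic gesture, then classify a sequence by whichever model assigns it the highest likelihood) can be sketched with a scaled forward algorithm. The two toy models and their three-symbol motion alphabet below are invented for illustration, not taken from the paper:

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM with
    initial probabilities pi, transition matrix A and emission matrix B."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for symbol in obs[1:]:
        alpha = (alpha @ A) * B[:, symbol]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik

# Two toy 2-state gesture models over 3 quantized motion symbols (invented).
pi = np.array([0.5, 0.5])
A = np.array([[0.7, 0.3], [0.3, 0.7]])
models = {
    "sweep_left":  (pi, A, np.array([[0.8, 0.1, 0.1], [0.8, 0.1, 0.1]])),
    "sweep_right": (pi, A, np.array([[0.1, 0.1, 0.8], [0.1, 0.1, 0.8]])),
}

def classify_hmm(obs):
    # Pick the gesture whose HMM gives the observation sequence
    # the highest log-likelihood.
    return max(models, key=lambda g: forward_loglik(obs, *models[g]))

best = classify_hmm([0, 0, 1, 0, 0])  # mostly "left" motion symbols
```

Real systems would train the per-gesture parameters with Baum-Welch on labeled sequences; the maximum-likelihood decision rule at recognition time is the same.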
Hand gesture recognition system based in computer vision and machine learning: Applications on human-machine interaction
PhD thesis in Electronics and Computer Engineering (Tese de Doutoramento em Engenharia de Eletrónica e de Computadores).

Hand gesture recognition is a natural way of human computer interaction and an area
of very active research in computer vision and machine learning. This is an area with
many different possible applications, giving users a simpler and more natural way to
communicate with robot and system interfaces, without the need for extra devices. Thus, the primary goal of gesture recognition research applied to Human-Computer Interaction (HCI) is to create systems that can identify specific human gestures and use them to convey information or to control devices. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection, and gesture recognition in real time.
Nowadays, vision-based gesture recognition systems typically rely on specific solutions, built to solve one particular problem and configured to work in a particular manner. This research project studied and implemented solutions generic enough, with the help of machine learning algorithms, to be applied in a wide range of human-computer interfaces for real-time gesture recognition.
The proposed solution, Gesture Learning Module Architecture (GeLMA), allows a set of commands, based on static and dynamic gestures, to be defined in a simple way and easily integrated and configured for use in a number of applications. It is easy to train and use, and since it is mainly built with open-source libraries it is also an inexpensive solution. Experiments carried out showed that the system achieved an accuracy of 99.2% in terms of hand posture recognition and an average accuracy of 93.72% in terms of dynamic gesture recognition. To validate the proposed framework, two systems were implemented.
The first one is an online system able to help a robotic soccer game referee judge a
game in real time. The proposed solution combines a vision-based hand gesture
recognition system with a formal language definition, the Referee CommLang, into
what is called the Referee Command Language Interface System (ReCLIS). The
system builds a command based on system-interpreted static and dynamic referee
gestures, and is able to send it to a computer interface which can then transmit the
proper commands to the robots. The second one is an online system able to interpret
the Portuguese Sign Language. The experiments showed that the system was able to reliably recognize the vowels in real-time. Although the implemented solution was
only trained to recognize the five vowels, it is easily extended to recognize the rest of
the alphabet. These experiments also showed that the core of vision-based interaction
systems can be the same for all applications and thus facilitate its implementation.
The proposed framework has the advantage of being generic enough and a solid
foundation for the development of hand gesture recognition systems that can be
integrated in any human-computer interface application. The interface language can be redefined, and the system easily configured and trained with a different set of gestures, which can then be integrated into the final solution.
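The final step, turning recognized static and dynamic gestures into a referee command defined by a formal language, could in miniature look like the table-driven lookup below. The gesture names and command strings are invented for illustration; they are not the actual Referee CommLang, which is defined in the thesis:

```python
# Hypothetical sketch of the command-building step: a sequence of recognized
# gesture labels is matched against a small command table before being sent
# to the robots. Both the gestures and the commands are invented examples.
COMMANDS = {
    ("arm_up",): "STOP",
    ("point_left", "wave"): "FREE_KICK LEFT",
    ("point_right", "wave"): "FREE_KICK RIGHT",
}

def build_command(gesture_sequence):
    """Map a recognized gesture sequence to a command string,
    or None when the sequence matches no rule (referee repeats)."""
    return COMMANDS.get(tuple(gesture_sequence))
```

A real formal language would allow composition (command plus parameters) rather than a flat lookup, but the interface contract is the same: gesture labels in, validated command out.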
A real-time human-robot interaction system based on gestures for assistive scenarios
Natural and intuitive human interaction with robotic systems is a key point in developing robots that assist people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding, which are recognized using a Dynamic Time Warping approach based on gesture-specific features computed from depth maps. A static gesture, consisting of pointing at an object, is also recognized. The pointed location is then estimated in order to detect candidate objects the user may be referring to. When the pointed object is unclear to the robot, a disambiguation procedure is performed by means of either a verbal or gestural dialogue. This skill allows the robot to pick up an object on behalf of a user who may have difficulty doing so themselves. The overall system, composed of a NAO robot, a Wifibot robot, a Kinect v2 sensor and two laptops, is first evaluated in a structured lab setup. Then, a broad set of user tests was completed, allowing correct performance to be assessed in terms of recognition rates, ease of use and response times.
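The Dynamic Time Warping approach mentioned here can be sketched as follows. The 1-D feature and the gesture templates are hypothetical stand-ins for the paper's depth-map features; real systems would warp multi-dimensional feature sequences:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D feature sequences,
    computed with the standard O(n*m) dynamic program."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical per-gesture templates over a single feature (e.g. hand height).
templates = {"wave": [0, 1, 0, 1, 0], "nod": [1, 1, 1, 1]}

def classify_dtw(query):
    # Nearest-template rule: smallest warped distance wins.
    return min(templates, key=lambda g: dtw_distance(query, templates[g]))
```

DTW tolerates speed variation between users, which is why a query performed slightly faster or slower than the template still matches.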
Real time hand gesture recognition including hand segmentation and tracking
In this paper we present a system that performs automatic gesture recognition. The system consists of two main components: (i) a unified technique for segmentation and tracking of face and hands, using a skin detection algorithm along with handling of occlusion between skin objects to keep track of the status of the occluded parts, realized by combining three useful features, namely color, motion and position; (ii) a static and dynamic gesture recognition system. Static gesture recognition is achieved using a robust hand shape classification, based on PCA subspaces, that is invariant to scale along with small translation and rotation transformations. Combining hand shape classification with position information and using DHMMs (discrete Hidden Markov Models) allows us to accomplish dynamic gesture recognition.
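A PCA-subspace classifier of the kind described can be sketched as a nearest-subspace rule: fit one PCA subspace per hand-shape class and assign a query to the class with the smallest reconstruction error. The 4-D synthetic data below is a stand-in, not real hand-shape features:

```python
import numpy as np

def fit_subspace(X, k):
    """Mean and top-k principal directions of one class (rows of X)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def reconstruction_error(x, mu, V):
    z = (x - mu) @ V.T   # project onto the class subspace
    x_hat = mu + z @ V   # back-project to the original space
    return np.linalg.norm(x - x_hat)

def classify_posture(x, subspaces):
    # Nearest-subspace rule: smallest reconstruction error wins.
    return min(subspaces, key=lambda c: reconstruction_error(x, *subspaces[c]))

# Synthetic stand-ins: two posture classes lying near different 1-D subspaces.
rng = np.random.default_rng(1)
open_hand = rng.normal(size=(30, 1)) * np.array([[1.0, 0, 0, 0]]) \
            + 0.01 * rng.normal(size=(30, 4))
fist = rng.normal(size=(30, 1)) * np.array([[0, 0, 1.0, 0]]) \
       + 0.01 * rng.normal(size=(30, 4))
subspaces = {"open_hand": fit_subspace(open_hand, 1),
             "fist": fit_subspace(fist, 1)}
label = classify_posture(np.array([2.0, 0, 0, 0]), subspaces)
```

The scale/translation invariance the abstract mentions would come from normalizing the hand images before projection; the subspace comparison itself is as above.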
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.