Gesture and Action Recognition by Evolved Dynamic Subgestures

Author: Baró Xavier
Escalante Hugo Jair
Escalera Sergio
Ponce-López Víctor
Publication venue: British Machine Vision Conference 2015
Publication date: 01/01/2015

This paper introduces a framework for gesture and action recognition based on the evolution of temporal gesture primitives, or subgestures. Our work is inspired by the principle of producing genetic variations within a population of gesture subsequences, with the goal of obtaining a set of gesture units that enhance the generalization capability of standard gesture recognition approaches. In our context, gesture primitives are evolved over time using dynamic programming and generative models in order to recognize complex actions. Within a few generations, the proposed subgesture-based representation of actions and gestures outperforms state-of-the-art results on the MSRDaily3D and MSRAction3D datasets.
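The abstract above mentions dynamic programming without naming a specific algorithm. As a purely illustrative sketch, and not the authors' method, the following shows dynamic time warping (DTW), one common dynamic-programming technique for comparing gesture sequences of different lengths. The function name `dtw_distance` and the use of one-dimensional features are assumptions made for this example.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two 1-D sequences.

    Illustrative only: the paper does not state which dynamic-programming
    formulation it uses; DTW is assumed here as a familiar example.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Under this sketch, a sequence and a time-stretched copy of it align at zero cost, which is why DTW-style alignment is often used when the same gesture is performed at different speeds.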

UCL Discovery

Towards an Architecture Model for Emotion Recognition in Interactive Systems: Application to a Ballet Dance Show

Author: Clay Alexis
Couture Nadine
Nigay Laurence
Publication venue: HAL CCSD
Publication date: 01/01/2009

In the context of the very dynamic and challenging domain of affective computing, we adopt a software engineering point of view on emotion recognition in interactive systems. Our goal is threefold: first, developing an architecture model for emotion recognition. This architecture model emphasizes multimodality and reusability. Second, developing a prototype based on this architecture model. For this prototype we focus on gesture-based emotion recognition. And third, using this prototype for augmenting a ballet dance show. We hence describe an overview of our work so far, from the design of a flexible and multimodal emotion recognition architecture model, to a presentation of a gesture-based emotion recognition prototype based on this model, to a prototype that augments a ballet stage, taking emotions as inputs.

Hal - Université Grenoble Alpes

The ILGDB database of realistic pen-based gestural commands

Author: Anquetil Eric
Delaye Adrien
Li Peiyu
Renau-Ferrer Ney
Publication venue: HAL CCSD
Publication date: 01/01/2012

In this paper, we introduce the Intuidoc-Loustic Gestures DataBase (ILGDB), a new publicly available database of realistic pen-based gestures for evaluation of recognition systems in pen-enabled interfaces. ILGDB was collected in a real-world context and in an immersive environment. As it contains a large number of unconstrained user-defined gestures, ILGDB offers a unique diversity of content that is likely to serve as a precious tool for benchmarking gesture recognition systems. We report first baseline experimental results on the task of writer-dependent gesture recognition.

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-CIRAD

HAL-Rennes 1

Towards gestural understanding for intelligent robots

Author: Fritsch Jan Nikolaus
Publication venue: Universität Bielefeld
Publication date: 01/01/2012

Fritsch JN. Towards gestural understanding for intelligent robots. Bielefeld: Universität Bielefeld; 2012.

A strong driving force of scientific progress in the technical sciences is the quest for systems that assist humans in their daily lives and make those lives easier and more enjoyable. Nowadays smartphones are probably the most typical instances of such systems. Another class of systems that is getting increasing attention is intelligent robots. Instead of offering a smartphone touch screen to select actions, these systems are intended to offer a more natural human-machine interface to their users. Out of the large range of actions performed by humans, gestures performed with the hands play a very important role, especially when humans interact with their immediate surroundings, e.g., by pointing to an object or manipulating it. Consequently, a robot has to understand such gestures to offer an intuitive interface. Gestural understanding is, therefore, a key capability on the way to intelligent robots. This book deals with vision-based approaches for gestural understanding. Over the past two decades, this has been an intensive field of research which has resulted in a variety of algorithms to analyze human hand motions. Following a categorization of different gesture types and a review of other sensing techniques, the design of vision systems that achieve hand gesture understanding for intelligent robots is analyzed. For each of the individual algorithmic steps – hand detection, hand tracking, and trajectory-based gesture recognition – a separate chapter introduces common techniques and algorithms and provides example methods. The resulting recognition algorithms consider gestures in isolation and are often not sufficient for interacting with a robot, which can only understand such gestures when incorporating context, e.g., what object was pointed at or manipulated.
Going beyond purely trajectory-based gesture recognition by incorporating context is an important prerequisite for achieving gesture understanding and is addressed explicitly in a separate chapter of this book. Two types of context, user-provided context and situational context, are reviewed, along with existing approaches that incorporate context for gestural understanding. Example approaches for both context types provide a deeper algorithmic insight into this field of research. An overview of recent robots capable of gesture recognition and understanding summarizes the currently realized human-robot interaction quality. The approaches for gesture understanding covered in this book are manually designed, whereas humans learn to recognize gestures automatically while growing up. Promising research targeted at analyzing developmental learning in children in order to mimic this capability in technical systems is highlighted in the last chapter, completing this book, as this research direction may be highly influential for creating future gesture understanding systems.

Publications at Bielefeld University

End-to-End Multiview Gesture Recognition for Autonomous Car Parking System

Author: Ben Amara Hassene
Publication venue: University of Waterloo
Publication date: 10/05/2019

The use of hand gestures can be the most intuitive human-machine interaction medium. The early approaches for hand gesture recognition used device-based methods. These methods use mechanical or optical sensors attached to a glove or markers, which hinders natural human-machine communication. On the other hand, vision-based methods are not restrictive and allow for more spontaneous communication without the need for an intermediary between human and machine. Therefore, vision-based gesture recognition has been a popular area of research for the past thirty years. Hand gesture recognition finds its application in many areas, particularly the automotive industry, where advanced automotive human-machine interface (HMI) designers are using gesture recognition to improve driver and vehicle safety. However, technology advances go beyond active/passive safety and into convenience and comfort. In this context, one of America’s big three automakers has partnered with the Centre of Pattern Analysis and Machine Intelligence (CPAMI) at the University of Waterloo to investigate expanding their product segment through machine learning to provide increased driver convenience and comfort, with the particular application of hand gesture recognition for autonomous car parking. In this thesis, we leverage state-of-the-art deep learning and optimization techniques to develop a vision-based multiview dynamic hand gesture recognizer for a self-parking system. We propose a 3DCNN gesture model architecture that we train on a publicly available hand gesture database. We apply transfer learning methods to fine-tune the pre-trained gesture model on custom-made data, which significantly improved the proposed system's performance in a real-world environment. We adapt the architecture of the end-to-end solution to expand the state-of-the-art video classifier from a single image as input (fed by a monocular camera) to a multiview 360 feed, offered by a six-camera module.
Finally, we optimize the proposed solution to work on a resource-limited embedded platform (Nvidia Jetson TX2) that is used by automakers for vehicle-based features, without sacrificing the accuracy, robustness, and real-time functionality of the system.

University of Waterloo's Institutional Repository

Robotic gesture recognition

Author: Malsburg Christoph von der
Triesch Jochen
Publication date: 07/11/2005

Robots of the future should communicate with humans in a natural way. We are especially interested in vision-based gesture interfaces. In the context of robotics, several constraints exist that make the task of gesture recognition particularly challenging. We discuss these constraints and report on progress being made in our lab in the development of techniques for building robust gesture interfaces that can handle these constraints. In an example application, the techniques are shown to be easily combined to build a gesture interface for a real robot grasping objects on a table in front of it.

Hochschulschriftenserver - Universität Frankfurt am Main

GestureGPT: Zero-shot Interactive Gesture Understanding and Grounding with Large Language Model Agents

Author: Chen Yiqiang
Wang Xiaoyu
Yu Chun
Zeng Xin
Zhang Tengxiang
Zhao Shengdong
Publication date: 29/10/2023

Current gesture recognition systems primarily focus on identifying gestures within a predefined set, leaving a gap in connecting these gestures to interactive GUI elements or system functions (e.g., linking a 'thumb-up' gesture to a 'like' button). We introduce GestureGPT, a novel zero-shot gesture understanding and grounding framework leveraging large language models (LLMs). Gesture descriptions are formulated based on hand landmark coordinates from gesture videos and fed into our dual-agent dialogue system. A gesture agent deciphers these descriptions and queries about the interaction context (e.g., interface, history, gaze data), which a context agent organizes and provides. Following iterative exchanges, the gesture agent discerns user intent, grounding it to an interactive function. We validated the gesture description module using public first-view and third-view gesture datasets and tested the whole system in two real-world settings: video streaming and smart home IoT control. The highest zero-shot Top-5 grounding accuracies are 80.11% for video streaming and 90.78% for smart home tasks, showing the potential of the new gesture understanding paradigm.
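For readers unfamiliar with the Top-5 metric reported above, the sketch below shows how a Top-k accuracy is typically computed: a sample counts as correct when the true label appears among the k highest-ranked candidates. The function name `top_k_accuracy` and the example labels are hypothetical; the abstract does not describe the evaluation code that was actually used.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of samples whose target is among the first k ranked candidates.

    ranked_predictions[i] lists candidate labels for sample i, best first.
    """
    if len(ranked_predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)


# Hypothetical example: candidate interface functions ranked for two gestures.
example_predictions = [
    ["like", "share", "pause", "mute", "next"],
    ["pause", "mute", "next", "share", "back"],
]
example_targets = ["pause", "like"]
example_score = top_k_accuracy(example_predictions, example_targets, k=5)
```

In this made-up example the first target appears in its candidate list and the second does not, so the score is 0.5.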

arXiv.org e-Print Archive

Vision-based Portuguese sign language recognition system

Author: A. Blum
A. Mannini
B.M. Faria
D. Kelly
D. Zhang
D.E. King
F.-S. Chen
G.R.S. Murthy
H. Cooper
H. Kauppinen
J.P. Wachs
K. Oka
L. Montesano
M. Oshita
N.E. Gillian
P.K. Vijay
R. Almeida
R.A. Braga
S. Bourennane
Z. Zafrulla
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Vision-based hand gesture recognition is an area of active current research in computer vision and machine learning. Being a natural way of human interaction, it is an area in which many researchers are working, with the goal of making human-computer interaction (HCI) easier and more natural, without the need for any extra devices. So, the primary goal of gesture recognition research is to create systems that can identify specific human gestures and use them, for example, to convey information. For that, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. Hand gestures are a powerful human communication modality with many potential applications, and in this context we have sign language recognition, sign language being the communication method of deaf people. Sign languages are not standard and universal, and the grammars differ from country to country. In this paper, a real-time system able to interpret the Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real time, with an accuracy of 99.4% with one dataset of features and an accuracy of 99.6% with a second dataset of features. Although the implemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, being a solid foundation for the development of any vision-based sign language recognition user interface system.

Universidade do Minho: RepositoriUM

Crossref