175 research outputs found

    Real-time Immersive human-computer interaction based on tracking and recognition of dynamic hand gestures

    With the fast-developing and ever-growing use of computer-based technologies, human-computer interaction (HCI) plays an increasingly pivotal role. In virtual reality (VR), HCI technologies provide not only a better understanding of three-dimensional shapes and spaces, but also sensory immersion and physical interaction. Hand-based HCI is a key modality for object manipulation and gesture-based communication, but providing users with a natural, intuitive, effortless, precise, and real-time method for HCI based on dynamic hand gestures is challenging, due to the complexity of hand postures formed by multiple joints with high degrees of freedom, the speed of hand movements with highly variable trajectories and rapid direction changes, and the precision required for interaction between hands and objects in the virtual world. Presented in this thesis is the design and development of a novel real-time HCI system based on a unique combination of a pair of data gloves based on fibre-optic curvature sensors to acquire finger joint angles, a hybrid tracking system based on inertia and ultrasound to capture hand position and orientation, and a stereoscopic display system to provide immersive visual feedback. The potential and effectiveness of the proposed system are demonstrated through a number of applications, namely hand-gesture-based virtual object manipulation and visualisation, hand-gesture-based direct sign writing, and hand-gesture-based finger spelling. For virtual object manipulation and visualisation, the system is shown to allow a user to select, translate, rotate, scale, release and visualise virtual objects (presented using graphics and volume data) in three-dimensional space using natural hand gestures in real time.
For direct sign writing, the system is shown to display immediately the SignWriting symbols corresponding to what a user signs, using three different signing sequences and a range of complex hand gestures. These gestures consist of various combinations of hand postures (with each finger open, half-bent, closed, in adduction or in abduction), eight hand orientations in the horizontal/vertical planes, three palm-facing directions, and various hand movements (which can have eight directions in the horizontal/vertical planes, and can be repetitive, straight/curved, clockwise/anti-clockwise). The development includes a special visual interface that gives not only a stereoscopic view of hand gestures and movements, but also structured visual feedback for each stage of the signing sequence. This forms an excellent basis for developing a full HCI based on all human gestures by integrating the proposed system with facial expression and body posture recognition methods. Furthermore, for finger spelling, the system is shown to recognise five vowels signed by two hands using British Sign Language in real time.

    A Tangible Solution for Hand Motion Tracking in Clinical Applications

    Objective real-time assessment of hand motion is crucial in many clinical applications, including technically assisted physical rehabilitation of the upper extremity. We propose an inertial-sensor-based hand motion tracking system and a set of dual-quaternion-based methods for estimating finger segment orientations and fingertip positions. The proposed system addresses the specific requirements of clinical applications in two ways: (1) In contrast to glove-based approaches, the proposed solution maintains the sense of touch. (2) In contrast to previous work, the proposed methods avoid complex calibration procedures, which makes them suitable for patients with severe motor impairment of the hand. Because validation in lab environments with homogeneous magnetic fields has limited significance, we validate the proposed system using functional hand motions in the presence of severe magnetic disturbances as they appear in realistic clinical settings. We show that standard sensor fusion methods that rely on magnetometer readings may perform well in perfect laboratory environments but can lead to a root-mean-square error of more than 15 cm for the fingertip distances in realistic environments, while our advanced method yields root-mean-square errors below 2 cm for all performed motions.
    DFG, 414044773, Open Access Publizieren 2019 - 2020 / Technische Universität Berlin
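    The root-mean-square error reported for fingertip distances in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names `fingertip_distance` and `rmse` and all numeric values are hypothetical, and it assumes that the distance between two fingertips is measured per frame and compared against a reference measurement in the same unit.

```python
import math


def fingertip_distance(tip_a, tip_b):
    """Euclidean distance between two 3D fingertip positions (same unit)."""
    return math.dist(tip_a, tip_b)


def rmse(estimated, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-frame thumb and index fingertip positions in cm.
estimated_tips = [((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),
                  ((0.0, 0.0, 0.0), (0.0, 6.0, 8.0))]
reference_distances = [5.5, 9.0]  # hypothetical ground truth in cm

estimated_distances = [fingertip_distance(a, b) for a, b in estimated_tips]
print(rmse(estimated_distances, reference_distances))
```

    The error is computed on the distances themselves, so it is expressed in the same unit as the positions (cm here).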

    Toward natural interaction in the real world: real-time gesture recognition

    Using a new hand tracking technology capable of tracking 3D hand postures in real time, we developed a recognition system for continuous natural gestures. By natural gestures, we mean those encountered in spontaneous interaction, rather than a set of artificial gestures chosen to simplify recognition. To date we have achieved 95.6% accuracy on isolated gesture recognition and a 73% recognition rate on continuous gesture recognition, with data from three users and twelve gesture classes. We connected our gesture recognition system to Google Earth, enabling real-time gestural control of a 3D map. We describe the challenges of signal accuracy and signal interpretation presented by working in a real-world environment, and detail how we overcame them.
    National Science Foundation (U.S.) (award IIS-1018055); Pfizer Inc.; Foxconn Technology

    Real-time immersive human-computer interaction based on tracking and recognition of dynamic hand gestures

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    A SDK improvement towards gesture support

    Human-Computer Interaction has been one of the main focuses of the technological community, especially the Natural User Interfaces (NUI) field of research: since the launch of the Kinect Sensor, the goal of achieving fully natural interfaces has come much closer to reality. Taking advantage of these conditions, this research work proposes computing the hand skeleton in order to recognize Sign Language shapes. The proposed solution uses the Kinect Sensor to achieve a good segmentation, and image analysis algorithms to extend the skeleton through the extraction of high-level features. In order to recognize complex hand shapes, the work proposes a redefinition of the hand contour that makes it invariant to translation, rotation and scaling operations, together with a set of tools to achieve good recognition. To validate the proposed solution, the Kinect Software Development Kit was extended to give the developer access to the new set of inferred points, and a template-matching-based platform was created that uses the contour to define the hand shape. This prototype was tested under a set of predefined conditions, showed a good success ratio, and proved suitable for real-time scenarios.
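    The idea of a contour description that does not change under translation, rotation and scaling, combined with template matching, can be sketched as follows. This is a minimal illustration and not the method of the work above: it swaps in a simple centroid-distance signature, the names `centroid_distance_signature` and `match_template` are hypothetical, and it assumes every contour has been resampled to the same number of points with the same starting point and ordering.

```python
import math


def centroid_distance_signature(contour):
    """Distances from each contour point to the centroid, divided by their mean.

    Subtracting the centroid removes translation, distances ignore rotation,
    and dividing by the mean distance removes uniform scale.
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean = sum(dists) / n
    if mean == 0:
        raise ValueError("degenerate contour: all points coincide")
    return [d / mean for d in dists]


def match_template(contour, templates):
    """Return the name of the template whose signature is closest (least squares)."""
    sig = centroid_distance_signature(contour)

    def cost(name):
        ref = centroid_distance_signature(templates[name])
        if len(ref) != len(sig):
            raise ValueError("contours must have the same number of points")
        return sum((a - b) ** 2 for a, b in zip(sig, ref))

    return min(templates, key=cost)


# Hypothetical templates: a 4x2 rectangle and a 2x2 square, 8 points each.
templates = {
    "flat": [(0, 0), (2, 0), (4, 0), (4, 1), (4, 2), (2, 2), (0, 2), (0, 1)],
    "square": [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)],
}

# The "flat" shape rotated by 30 degrees, scaled by 3 and shifted.
angle = math.radians(30)
observed = [
    (3 * (x * math.cos(angle) - y * math.sin(angle)) + 5,
     3 * (x * math.sin(angle) + y * math.cos(angle)) - 7)
    for x, y in templates["flat"]
]
print(match_template(observed, templates))
```

    A real system would also need to handle a shifted starting point along the contour, for example by comparing all circular shifts of the signature.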

    Real-time 3D hand reconstruction in challenging scenes from a single color or depth camera

    Hands are one of the main enabling factors for performing complex tasks, and humans naturally use them for interactions with their environment. Reconstruction and digitization of 3D hand motion opens up many possibilities for important applications. Hand gestures can be directly used for human-computer interaction, which is especially relevant for controlling augmented or virtual reality (AR/VR) devices where immersion is of utmost importance. In addition, 3D hand motion capture is a precondition for automatic sign-language translation, activity recognition, or teaching robots. Different approaches for 3D hand motion capture have been actively researched in the past. While being accurate, gloves and markers are intrusive and uncomfortable to wear. Hence, markerless hand reconstruction based on cameras is desirable. Multi-camera setups provide rich input; however, they are hard to calibrate and lack the flexibility for mobile use cases. Thus, the majority of more recent methods use a single color or depth camera, which makes the problem harder due to more ambiguities in the input. For interaction purposes, users need continuous control and immediate feedback. This means the algorithms have to run in real time and be robust in uncontrolled scenes. These requirements, achieving 3D hand reconstruction in real time from a single camera in general scenes, make the problem significantly more challenging. While recent research has shown promising results, current state-of-the-art methods still have strong limitations. Most approaches only track the motion of a single hand in isolation and do not take background clutter or interactions with arbitrary objects or the other hand into account. The few methods that can handle more general and natural scenarios run far from real time or use complex multi-camera setups. Such requirements make existing methods unusable for many of the aforementioned applications.
This thesis pushes the state of the art for real-time 3D hand tracking and reconstruction in general scenes from a single RGB or depth camera. The presented approaches explore novel combinations of generative hand models, which have been used successfully in the computer vision and graphics community for decades, and powerful cutting-edge machine learning techniques, which have recently emerged with the advent of deep learning. In particular, this thesis proposes a novel method for hand tracking in the presence of strong occlusions and clutter, the first method for full global 3D hand tracking from in-the-wild RGB video, and a method for simultaneous pose and dense shape reconstruction of two interacting hands that, for the first time, combines a set of desirable properties previously unseen in the literature.

    APPLICATION OF AUGMENTED REALITY IN MANUAL ASSEMBLY DESIGN AND PLANNING

    Ph.D, DOCTOR OF PHILOSOPHY

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, which people often use to communicate information non-verbally. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields. The former is essential for understanding how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, the last module provides a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, because their ability to model the long-term contextual information of temporal sequences makes them suitable for analysing body movements. All the modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to the methods in the current literature.