3,745 research outputs found

    Hand gesture recognition with jointly calibrated Leap Motion and depth sensor

    Get PDF
    Novel 3D acquisition devices like depth cameras and the Leap Motion have recently reached the market. Depth cameras allow to obtain a complete 3D description of the framed scene while the Leap Motion sensor is a device explicitly targeted for hand gesture recognition and provides only a limited set of relevant points. This paper shows how to jointly exploit the two types of sensors for accurate gesture recognition. An ad-hoc solution for the joint calibration of the two devices is firstly presented. Then a set of novel feature descriptors is introduced both for the Leap Motion and for depth data. Various schemes based on the distances of the hand samples from the centroid, on the curvature of the hand contour and on the convex hull of the hand shape are employed and the use of Leap Motion data to aid feature extraction is also considered. The proposed feature sets are fed to two different classifiers, one based on multi-class SVMs and one exploiting Random Forests. Different feature selection algorithms have also been tested in order to reduce the complexity of the approach. Experimental results show that a very high accuracy can be obtained from the proposed method. The current implementation is also able to run in real-time

    A Conceptual Framework for Motion Based Music Applications

    Get PDF
    Imaginary projections are the core of the framework for motion based music applications presented in this paper. Their design depends on the space covered by the motion tracking device, but also on the musical feature involved in the application. They can be considered a very powerful tool because they allow not only to project in the virtual environment the image of a traditional acoustic instrument, but also to express any spatially defined abstract concept. The system pipeline starts from the musical content and, through a geometrical interpretation, arrives to its projection in the physical space. Three case studies involving different motion tracking devices and different musical concepts will be analyzed. The three examined applications have been programmed and already tested by the authors. They aim respectively at musical expressive interaction (Disembodied Voices), tonal music knowledge (Harmonic Walk) and XX century music composition (Hand Composer)

    Prefrontal cortex activation upon a demanding virtual hand-controlled task: A new frontier for neuroergonomics

    Get PDF
    open9noFunctional near-infrared spectroscopy (fNIRS) is a non-invasive vascular-based functional neuroimaging technology that can assess, simultaneously from multiple cortical areas, concentration changes in oxygenated-deoxygenated hemoglobin at the level of the cortical microcirculation blood vessels. fNIRS, with its high degree of ecological validity and its very limited requirement of physical constraints to subjects, could represent a valid tool for monitoring cortical responses in the research field of neuroergonomics. In virtual reality (VR) real situations can be replicated with greater control than those obtainable in the real world. Therefore, VR is the ideal setting where studies about neuroergonomics applications can be performed. The aim of the present study was to investigate, by a 20-channel fNIRS system, the dorsolateral/ventrolateral prefrontal cortex (DLPFC/VLPFC) in subjects while performing a demanding VR hand-controlled task (HCT). Considering the complexity of the HCT, its execution should require the attentional resources allocation and the integration of different executive functions. The HCT simulates the interaction with a real, remotely-driven, system operating in a critical environment. The hand movements were captured by a high spatial and temporal resolution 3-dimensional (3D) hand-sensing device, the LEAP motion controller, a gesture-based control interface that could be used in VR for tele-operated applications. Fifteen University students were asked to guide, with their right hand/forearm, a virtual ball (VB) over a virtual route (VROU) reproducing a 42 m narrow road including some critical points. The subjects tried to travel as long as possible without making VB fall. The distance traveled by the guided VB was 70.2 ± 37.2 m. The less skilled subjects failed several times in guiding the VB over the VROU. Nevertheless, a bilateral VLPFC activation, in response to the HCT execution, was observed in all the subjects. No correlation was found between the distance traveled by the guided VB and the corresponding cortical activation. These results confirm the suitability of fNIRS technology to objectively evaluate cortical hemodynamic changes occurring in VR environments. Future studies could give a contribution to a better understanding of the cognitive mechanisms underlying human performance either in expert or non-expert operators during the simulation of different demanding/fatiguing activities.openCarrieri, Marika; Petracca, Andrea; Lancia, Stefania; Basso Moro, Sara; Brigadoi, Sabrina; Spezialetti, Matteo; Ferrari, Marco; Placidi, Giuseppe; Quaresima, ValentinaCarrieri, Marika; Petracca, Andrea; Lancia, Stefania; BASSO MORO, Sara; Brigadoi, Sabrina; Spezialetti, Matteo; Ferrari, Marco; Placidi, Giuseppe; Quaresima, Valentin

    Novel Multimodal Feedback Techniques for In-Car Mid-Air Gesture Interaction

    Get PDF
    This paper presents an investigation into the effects of different feedback modalities on mid-air gesture interaction for infotainment systems in cars. Car crashes and near-crash events are most commonly caused by driver distraction. Mid-air interaction is a way of reducing driver distraction by reducing visual demand from infotainment. Despite a range of available modalities, feedback in mid-air gesture systems is generally provided through visual displays. We conducted a simulated driving study to investigate how different types of multimodal feedback can support in-air gestures. The effects of different feedback modalities on eye gaze behaviour, and the driving and gesturing tasks are considered. We found that feedback modality influenced gesturing behaviour. However, drivers corrected falsely executed gestures more often in non-visual conditions. Our findings show that non-visual feedback can reduce visual distraction significantl

    Semantic framework for interactive animation generation and its application in virtual shadow play performance.

    Get PDF
    Designing and creating complex and interactive animation is still a challenge in the field of virtual reality, which has to handle various aspects of functional requirements (e.g. graphics, physics, AI, multimodal inputs and outputs, and massive data assets management). In this paper, a semantic framework is proposed to model the construction of interactive animation and promote animation assets reuse in a systematic and standardized way. As its ontological implementation, two domain specific ontologies for the hand-gesture-based interaction and animation data repository have been developed in the context of Chinese traditional shadow play art. Finally, prototype of interactive Chinese shadow play performance system using deep motion sensor device is presented as the usage example

    DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation

    Full text link
    There is an undeniable communication barrier between deaf people and people with normal hearing ability. Although innovations in sign language translation technology aim to tear down this communication barrier, the majority of existing sign language translation systems are either intrusive or constrained by resolution or ambient lighting conditions. Moreover, these existing systems can only perform single-sign ASL translation rather than sentence-level translation, making them much less useful in daily-life communication scenarios. In this work, we fill this critical gap by presenting DeepASL, a transformative deep learning-based sign language translation technology that enables ubiquitous and non-intrusive American Sign Language (ASL) translation at both word and sentence levels. DeepASL uses infrared light as its sensing mechanism to non-intrusively capture the ASL signs. It incorporates a novel hierarchical bidirectional deep recurrent neural network (HB-RNN) and a probabilistic framework based on Connectionist Temporal Classification (CTC) for word-level and sentence-level ASL translation respectively. To evaluate its performance, we have collected 7,306 samples from 11 participants, covering 56 commonly used ASL words and 100 ASL sentences. DeepASL achieves an average 94.5% word-level translation accuracy and an average 8.2% word error rate on translating unseen ASL sentences. Given its promising performance, we believe DeepASL represents a significant step towards breaking the communication barrier between deaf people and hearing majority, and thus has the significant potential to fundamentally change deaf people's lives
    • …
    corecore