9,014 research outputs found

    ModDrop: adaptive multi-modal gesture recognition

    We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving the uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Furthermore, the proposed ModDrop training technique ensures robustness of the classifier to missing signals in one or several channels, producing meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature through experiments on the same dataset augmented with audio. Comment: 14 pages, 7 figures
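    The core of the ModDrop strategy, independently zeroing out entire modality channels during fusion training, can be illustrated with a short sketch. The following is a minimal NumPy illustration of the channel-dropping idea, not the authors' implementation; the modality names, shapes, and drop probability are assumptions chosen for the example.

```python
import numpy as np

def moddrop(modalities, p_drop=0.1, rng=np.random.default_rng(0)):
    """Independently zero out whole modality channels (ModDrop-style).

    modalities: dict of name -> batch array of shape (batch, ...).
    Each channel is kept or dropped per sample, forcing the fusion
    layers to learn cross-modal correlations while remaining robust
    to missing signals in one or several channels.
    """
    n = next(iter(modalities.values())).shape[0]
    out = {}
    for name, x in modalities.items():
        keep = rng.random(n) >= p_drop  # one keep/drop coin flip per sample
        out[name] = x * keep.reshape((-1,) + (1,) * (x.ndim - 1))
    return out

# Illustrative channels: two visual scales plus audio, dropped independently.
inputs = {
    "hand_depth": np.random.rand(32, 64, 64),
    "upper_body": np.random.rand(32, 64, 64),
    "audio": np.random.rand(32, 128),
}
fused_inputs = moddrop(inputs, p_drop=0.1)
```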

    EMI Spy: Harnessing electromagnetic interference for low-cost, rapid prototyping of proxemic interaction

    We present a wearable system that uses ambient electromagnetic interference (EMI) as a signature to identify electronic devices and support proxemic interaction. We designed a low-cost tool, called EMI Spy, and a software environment for rapid deployment and evaluation of ambient EMI-based interactive infrastructure. EMI Spy captures electromagnetic interference and delivers the signal to a user's mobile device or PC, either through the device's wired audio input or wirelessly over Bluetooth. The wireless version can be worn on the wrist, communicating with the user's mobile device in their pocket. Users are able to train the system in less than 1 second to uniquely identify displays in a 2-m radius around them, as well as to detect pointing at a distance and touching gestures on the displays in real time. The combination of a low-cost EMI logger and an open-source machine learning toolkit allows developers to quickly prototype proxemic, touch-to-connect, and gestural interaction. We demonstrate the feasibility of mobile, EMI-based device and gesture recognition with preliminary user studies in 3 scenarios, achieving 96% classification accuracy at close range for 6 digital signage displays distributed throughout a building, and 90% accuracy in classifying pointing gestures at neighboring desktop LCD displays. We were able to distinguish 1- and 2-finger touches with perfect accuracy and show indications of a way to determine the power consumption of a device via touch. Our system is particularly well suited to temporary use in a public space, where the sensors could be distributed to support a pop-up interactive environment anywhere with electronic devices. By designing for low-cost, mobile, flexible, and infrastructure-free deployment, we aim to enable a host of new proxemic interfaces to existing appliances and displays.
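    The pipeline the abstract describes, capturing audio-rate EMI and matching it against trained device fingerprints, can be sketched roughly as below. This is a hedged illustration assuming the EMI arrives as a 1-D sample buffer; the log-magnitude-spectrum feature and nearest-centroid classifier are our stand-ins, not necessarily what EMI Spy's toolkit uses.

```python
import numpy as np

def emi_features(signal, n_fft=2048):
    """Log-magnitude spectrum of an EMI capture (audio-rate samples).

    Ambient EMI from a display shows up as a characteristic set of
    spectral peaks, so a coarse spectrum can serve as a fingerprint.
    """
    return np.log1p(np.abs(np.fft.rfft(signal, n=n_fft)))

class NearestCentroid:
    """Tiny classifier: one averaged fingerprint per device label,
    cheap enough to retrain from under a second of capture."""
    def fit(self, X, y):
        self.labels_ = sorted(set(y))
        self.centroids_ = np.stack(
            [np.mean([x for x, lab in zip(X, y) if lab == label], axis=0)
             for label in self.labels_])
        return self
    def predict(self, x):
        d = np.linalg.norm(self.centroids_ - x, axis=1)
        return self.labels_[int(np.argmin(d))]

# Train on one short capture per display, then identify a new capture.
train_sigs = {"lobby_display": np.random.randn(4096),   # placeholder data
              "desk_lcd": np.random.randn(4096)}
X = [emi_features(s) for s in train_sigs.values()]
y = list(train_sigs.keys())
clf = NearestCentroid().fit(X, y)
print(clf.predict(emi_features(np.random.randn(4096))))
```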

    A Decoupled 3D Facial Shape Model by Adversarial Training

    Data-driven generative 3D face models are used to compactly encode facial shape data into meaningful parametric representations. A desirable property of these models is their ability to effectively decouple natural sources of variation, in particular identity and expression. While factorized representations have been proposed for that purpose, they are still limited in the variability they can capture and may present modeling artifacts when applied to tasks such as expression transfer. In this work, we explore a new direction with Generative Adversarial Networks and show that they contribute to better face modeling performance, especially in decoupling natural factors, while also achieving more diverse samples. To train the model, we introduce a novel architecture that combines a 3D generator with a 2D discriminator that leverages conventional CNNs, where the two components are bridged by a geometry mapping layer. We further present a training scheme, based on auxiliary classifiers, to explicitly disentangle identity and expression attributes. Through quantitative and qualitative results on standard face datasets, we illustrate the benefits of our model and demonstrate that it outperforms competing state-of-the-art methods in terms of decoupling and diversity. Comment: camera-ready version for ICCV'19
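    The described architecture, a 3D generator bridged to a conventional 2D CNN discriminator carrying auxiliary identity and expression heads, can be wired up roughly as follows. This PyTorch sketch only mirrors the high-level structure; the layer sizes, vertex count, and the random UV assignment standing in for the geometry mapping layer are all illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class Generator3D(nn.Module):
    """Maps a latent code to 3D vertex positions (n_verts x 3)."""
    def __init__(self, z_dim=128, n_verts=5023):
        super().__init__()
        self.n_verts = n_verts
        self.net = nn.Sequential(
            nn.Linear(z_dim, 512), nn.ReLU(),
            nn.Linear(512, n_verts * 3))
    def forward(self, z):
        return self.net(z).view(-1, self.n_verts, 3)

def geometry_mapping(verts, res=64):
    """Bridge layer: scatter per-vertex 3D coordinates into a 3-channel
    2D 'geometry image' a conventional CNN can consume. A fixed random
    UV assignment stands in here for the real, mesh-derived mapping."""
    b, n, _ = verts.shape
    uv = torch.randint(0, res * res, (n,))  # illustrative pixel per vertex
    img = torch.zeros(b, 3, res * res)
    img[:, :, uv] = verts.transpose(1, 2)
    return img.view(b, 3, res, res)

class Discriminator2D(nn.Module):
    """2D CNN with a real/fake head plus auxiliary identity and
    expression classifiers used to disentangle the two factors."""
    def __init__(self, n_ids=100, n_exprs=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Flatten())
        feat = 64 * 16 * 16  # 64x64 input halved twice by strided convs
        self.adv = nn.Linear(feat, 1)
        self.id_head = nn.Linear(feat, n_ids)
        self.expr_head = nn.Linear(feat, n_exprs)
    def forward(self, img):
        h = self.features(img)
        return self.adv(h), self.id_head(h), self.expr_head(h)

G, D = Generator3D(), Discriminator2D()
fake = geometry_mapping(G(torch.randn(4, 128)))
adv_logit, id_logits, expr_logits = D(fake)
```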

    Design of a CMOS Analog Front-End for Wearable A-Mode Ultrasound Hand Gesture Recognition

    This paper presents a CMOS analog front-end for wearable A-mode ultrasound hand gesture recognition. The analog front-end is part of research into using ultrasound to record and decode muscle signals, with the aim of controlling a prosthetic hand as an alternative to surface electromyography. In this paper, we present the design of a pulser for driving piezoelectric transducers as well as a low-noise amplifier for the received echoes. Simulation results show that the pulser circuit is capable of driving a 137 pF capacitive load with 30 V pulses at a frequency of 1 MHz while dissipating 142.1 mW of power. The low-noise amplifier demonstrates a gain of 34 dB and an input-referred noise of 8.58 nV/√Hz at 1 MHz.
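    The reported figures are consistent with a first-order switching-power estimate for charging and discharging a capacitive load once per cycle, P = C·V²·f. As a back-of-envelope check (our arithmetic, not the paper's; attributing the remaining ~19 mW to the driver circuitry's own losses is our assumption):

```latex
P_{\mathrm{dyn}} = C V^{2} f
                 = 137\,\mathrm{pF} \times (30\,\mathrm{V})^{2} \times 1\,\mathrm{MHz}
                 \approx 123\,\mathrm{mW} \quad (\text{reported: } 142.1\,\mathrm{mW})
```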

    The Future of Humanoid Robots

    This book provides state-of-the-art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines and will have the ability to blend seamlessly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience, and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information for those working in the fields of robotics, cognitive science, artificial intelligence, computational methods, and other fields of science directly or indirectly related to the development and use of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and this experience is reflected in the editing of the book's content.