9,014 research outputs found
ModDrop: adaptive multi-modal gesture recognition
We present a method for gesture detection and localisation based on
multi-scale and multi-modal deep learning. Each visual modality captures
spatial information at a particular spatial scale (such as motion of the upper
body or a hand), and the whole system operates at three temporal scales. Key to
our technique is a training strategy which exploits: i) careful initialization
of individual modalities; and ii) gradual fusion involving random dropping of
separate channels (dubbed ModDrop) for learning cross-modality correlations
while preserving uniqueness of each modality-specific representation. We
present experiments on the ChaLearn 2014 Looking at People Challenge gesture
recognition track, in which we placed first out of 17 teams. Fusing multiple
modalities at several spatial and temporal scales leads to a significant
increase in recognition rates, allowing the model to compensate for errors of
the individual classifiers as well as noise in the separate channels.
Futhermore, the proposed ModDrop training technique ensures robustness of the
classifier to missing signals in one or several channels to produce meaningful
predictions from any number of available modalities. In addition, we
demonstrate the applicability of the proposed fusion scheme to modalities of
arbitrary nature by experiments on the same dataset augmented with audio.Comment: 14 pages, 7 figure
EMI Spy: Harnessing electromagnetic interference for low-cost, rapid prototyping of proxemic interaction
We present a wearable system that uses ambient electromagnetic interference (EMI) as a signature to identify electronic devices and support proxemic interaction. We designed a low cost tool, called EMI Spy, and a software environment for rapid deployment and evaluation of ambient EMI-based interactive infrastructure. EMI Spy captures electromagnetic interference and delivers the signal to a user's mobile device or PC through either the device's wired audio input or wirelessly using Bluetooth. The wireless version can be worn on the wrist, communicating with the user;s mobile device in their pocket. Users are able to train the system in less than 1 second to uniquely identify displays in a 2-m radius around them, as well as to detect pointing at a distance and touching gestures on the displays in real-time. The combination of a low cost EMI logger and an open source machine learning tool kit allows developers to quickly prototype proxemic, touch-to-connect, and gestural interaction. We demonstrate the feasibility of mobile, EMI-based device and gesture recognition with preliminary user studies in 3 scenarios, achieving 96% classification accuracy at close range for 6 digital signage displays distributed throughout a building, and 90% accuracy in classifying pointing gestures at neighboring desktop LCD displays. We were able to distinguish 1- and 2-finger touching with perfect accuracy and show indications of a way to determine power consumption of a device via touch. Our system is particularly well-suited to temporary use in a public space, where the sensors could be distributed to support a popup interactive environment anywhere with electronic devices. By designing for low cost, mobile, flexible, and infrastructure-free deployment, we aim to enable a host of new proxemic interfaces to existing appliances and displays
A Decoupled 3D Facial Shape Model by Adversarial Training
Data-driven generative 3D face models are used to compactly encode facial
shape data into meaningful parametric representations. A desirable property of
these models is their ability to effectively decouple natural sources of
variation, in particular identity and expression. While factorized
representations have been proposed for that purpose, they are still limited in
the variability they can capture and may present modeling artifacts when
applied to tasks such as expression transfer. In this work, we explore a new
direction with Generative Adversarial Networks and show that they contribute to
better face modeling performances, especially in decoupling natural factors,
while also achieving more diverse samples. To train the model we introduce a
novel architecture that combines a 3D generator with a 2D discriminator that
leverages conventional CNNs, where the two components are bridged by a geometry
mapping layer. We further present a training scheme, based on auxiliary
classifiers, to explicitly disentangle identity and expression attributes.
Through quantitative and qualitative results on standard face datasets, we
illustrate the benefits of our model and demonstrate that it outperforms
competing state of the art methods in terms of decoupling and diversity.Comment: camera-ready version for ICCV'1
Design of a CMOS Analog Front-End for Wearable A-Mode Ultrasound Hand Gesture Recognition
This paper presents a CMOS analog front-end for wearable A-mode ultrasound hand gesture recognition. This analog front-end is part of the research into using ultrasound to record and decode muscle signals with the aim of controlling a prosthetic hand as an alternative to surface electromyography. In this paper, the design of a pulser for driving piezoelectric transducers as well as a low-noise amplifier for the received echoes are presented. Simulation results show that the pulser circuit is capable of driving a 137 pF capacitive load with 30 V pulses at a frequency of 1 MHz and dissipates 142.1 mW power. The low-noise amplifier demonstrates a gain of 34 dB and an input-referred noise of 8.58 nV/√Hz at 1 MHz
The Future of Humanoid Robots
This book provides state of the art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines, and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information to those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and his experience is reflected in editing the content of the book
- …