The passive operating mode of the linear optical gesture sensor
The study evaluates the influence of natural light conditions on the
effectiveness of the linear optical gesture sensor, working in the presence of
ambient light only (passive mode). The orientation of the device relative
to the light source was varied in order to verify the sensitivity of the
sensor. A criterion for differentiating between two states, "possible
gesture" and "no gesture", was proposed. Additionally, different light
conditions and candidate features relevant to the decision to switch
between the passive and active modes of the device were investigated. The criterion
was evaluated based on the specificity and sensitivity analysis of the binary
ambient light condition classifier. The resulting classifier predicts ambient
light conditions with an accuracy of 85.15%. Once the light conditions are
known, the hand pose can be detected. The hand pose classifier, trained on
data obtained in the passive mode under favorable light conditions, achieved
an accuracy of 98.76%. It was also shown that the passive operating mode
of the linear gesture sensor reduces the total energy consumption by 93.34%,
resulting in 0.132 mA. It was concluded that the optical linear sensor can be
efficiently used in various lighting conditions. Comment: 10 pages, 14 figures
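The sensitivity and specificity analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 stands for favorable ambient light and 0 for unfavorable light; the function name `binary_metrics` and the sample labels are hypothetical and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    sensitivity = tp / (tp + fn) if tp + fn else 0.0    # recall on positives
    specificity = tn / (tn + fp) if tn + fp else 0.0    # recall on negatives
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


# Hypothetical labels: 1 = favorable light, 0 = unfavorable light.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
sensitivity, specificity, accuracy = binary_metrics(y_true, y_pred)
print(sensitivity, specificity, accuracy)
```

Reporting sensitivity and specificity alongside accuracy shows whether the errors fall mostly on the favorable or the unfavorable class, which a single accuracy figure hides.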
Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot
We explore new aspects of assistive living on smart human-robot interaction
(HRI) that involve automatic recognition and online validation of speech and
gestures in a natural interface, providing social features for HRI. We
introduce a whole framework and resources of a real-life scenario for elderly
subjects supported by an assistive bathing robot, addressing health and hygiene
care issues. We contribute a new dataset and a suite of tools used for data
acquisition and a state-of-the-art pipeline for multimodal learning within the
framework of the I-Support bathing robot, with emphasis on audio and RGB-D
visual streams. We consider privacy issues by evaluating the depth visual
stream along with the RGB, using Kinect sensors. The audio-gestural recognition
task on this new dataset yields up to 84.5%, while the online validation of the
I-Support system on elderly users accomplishes up to 84% when the two
modalities are fused together. The results are promising enough to support
further research in the area of multimodal recognition for assistive social
HRI, considering the difficulties of the specific task. Upon acceptance of the
paper, part of the data will be made publicly available.
Hand gesture recognition with jointly calibrated Leap Motion and depth sensor
Novel 3D acquisition devices such as depth cameras and the Leap Motion have recently reached the market. Depth cameras provide a complete 3D description of the framed scene, while the Leap Motion sensor is a device explicitly targeted at hand gesture recognition and provides only a limited set of relevant points. This paper shows how to jointly exploit the two types of sensors for accurate gesture recognition. An ad-hoc solution for the joint calibration of the two devices is presented first. Then a set of novel feature descriptors is introduced for both the Leap Motion and the depth data. Various schemes based on the distances of the hand samples from the centroid, on the curvature of the hand contour, and on the convex hull of the hand shape are employed, and the use of Leap Motion data to aid feature extraction is also considered. The proposed feature sets are fed to two different classifiers, one based on multi-class SVMs and one exploiting Random Forests. Different feature selection algorithms have also been tested in order to reduce the complexity of the approach. Experimental results show that the proposed method achieves very high accuracy. The current implementation is also able to run in real time.
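One of the descriptor families named in this abstract, distances of hand samples from the centroid, might be sketched as follows. This is a simplified illustration and not the paper's actual descriptor: the function name `centroid_distance_histogram`, the use of 2D contour points, the angular binning, and the normalization by the largest distance are all assumptions.

```python
import math


def centroid_distance_histogram(points, bins=8):
    """Largest centroid distance per angular sector, scaled to [0, 1]."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    hist = [0.0] * bins
    for x, y in points:
        # Angle of the sample around the centroid, mapped to [0, 2*pi).
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * bins), bins - 1)
        hist[sector] = max(hist[sector], math.hypot(x - cx, y - cy))
    peak = max(hist)
    # Dividing by the peak makes the feature independent of hand size.
    return [d / peak for d in hist] if peak > 0 else hist


# Hypothetical contour: the four corners of a square.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(centroid_distance_histogram(square, bins=4))
```

A fixed-length vector of this kind could then be passed to a classifier such as an SVM or a Random Forest, as the abstract describes for the full feature sets.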
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
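The event stream described in this abstract (time, location, and sign of each brightness change) can be approximated from ordinary frames with a minimal sketch. It assumes an idealized model in which a pixel fires once its log-brightness has changed by at least a fixed contrast threshold since its last event; the names `Event` and `events_from_frames`, the threshold value, and the frame-based input are illustrative assumptions, and a real sensor operates asynchronously rather than frame by frame.

```python
import math
from typing import NamedTuple


class Event(NamedTuple):
    t: float       # timestamp of the brightness change
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for brighter, -1 for darker


def events_from_frames(frames, timestamps, threshold=0.2):
    """Emit at most one event per pixel per frame (a simplification)."""
    eps = 1e-3  # avoids log(0) for black pixels
    # Per-pixel log-brightness at the time of the last event.
    ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                log_v = math.log(v + eps)
                diff = log_v - ref[y][x]
                if abs(diff) >= threshold:
                    events.append(Event(t, x, y, 1 if diff > 0 else -1))
                    ref[y][x] = log_v  # reset the reference after firing
    return events


# Hypothetical 1x2 frames: the left pixel brightens, then darkens.
frames = [[[0.5, 0.5]], [[1.0, 0.5]], [[0.2, 0.5]]]
timestamps = [0.0, 0.001, 0.002]
events = events_from_frames(frames, timestamps)
print(events)
```

Pixels whose brightness does not change produce no output at all, which is the source of the sparse, low-power data stream that the abstract contrasts with fixed-rate frame capture.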