Fair comparison of skin detection approaches on publicly available datasets
Skin detection is the process of discriminating skin and non-skin regions in
a digital image; it is widely used in applications ranging from hand gesture
analysis and body-part tracking to face detection. Skin detection is a
challenging problem that has drawn extensive attention from the research
community; nevertheless, a fair comparison among approaches is very difficult
due to the lack of a common benchmark and a unified testing protocol. In this
work, we survey the most recent research in this field and propose a
fair comparison among approaches using several different datasets.
The major
contributions of this work are an exhaustive literature review of skin color
detection approaches; a framework for evaluating and combining different skin
detection approaches, whose source code is made freely available for future
research; and an extensive experimental comparison among several recent methods,
which have also been used to define an ensemble that works well across many
different problems. Experiments are carried out on 10 different datasets
comprising more than 10,000 labelled images: the experimental results confirm that
the best method proposed here performs very well with respect to
other stand-alone approaches, without requiring ad hoc parameter tuning. A
MATLAB version of the framework for testing and of the methods proposed in this
paper will be freely available from https://github.com/LorisNann
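As a concrete illustration of the kind of stand-alone pixel-wise detector such comparisons evaluate, the sketch below applies a classic explicit RGB skin-colour rule. The thresholds follow the widely cited Kovac et al. daylight rule; this is a common baseline choice for illustration, not the specific methods compared in this work.

```python
def is_skin_rgb(r, g, b):
    """Classify one pixel as skin using a classic explicit RGB rule.

    Thresholds follow the widely used Kovac et al. daylight rule.
    Real skin detectors evaluated in benchmark studies are far more
    sophisticated; treat this as a minimal baseline sketch.
    """
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15
            and r > g and r > b)

# A light skin tone satisfies every condition; a saturated blue fails
# the very first threshold (r > 95).
print(is_skin_rgb(220, 170, 140))  # True
print(is_skin_rgb(30, 60, 200))    # False
```

Applying such a rule to every pixel yields a binary skin mask; an ensemble detector of the kind described above would combine several such masks (e.g. by voting or score averaging) rather than rely on a single rule.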
Multi-Modal Human-Machine Communication for Instructing Robot Grasping Tasks
A major challenge for the realization of intelligent robots is to supply them
with cognitive abilities in order to allow ordinary users to program them
easily and intuitively. One way of achieving such programming is to teach work
tasks by interactive demonstration. To make this effective and convenient for the user,
the machine must be capable of establishing a common focus of attention and be
able to use and integrate spoken instructions, visual perceptions, and
non-verbal cues like gestural commands. We report progress in building a
hybrid architecture that combines statistical methods, neural networks, and
finite state machines into an integrated system for instructing grasping tasks
by man-machine interaction. The system combines the GRAVIS robot for visual
attention and gestural instruction with an intelligent interface for speech
recognition and linguistic interpretation, and a modality fusion module to
allow multi-modal, task-oriented man-machine communication with respect to
dexterous robot manipulation of objects.
Comment: 7 pages, 8 figures
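A much-simplified sketch of what a modality fusion step can look like: among the visually detected objects whose label matches a spoken word, pick the one closest to the location a pointing gesture indicates. The function name, the `(label, (x, y))` data model, and the coordinates below are illustrative assumptions, not the GRAVIS system's actual interfaces.

```python
import math

def fuse_speech_and_gesture(spoken_label, pointed_xy, detected_objects):
    """Resolve a grasp target from speech and gesture (toy sketch).

    detected_objects: list of (label, (x, y)) pairs from a vision
    module. Among objects matching the spoken label, return the one
    nearest to the pointed location; return None when no label
    matches, so a dialogue component could ask for clarification.
    """
    candidates = [(label, pos) for label, pos in detected_objects
                  if label == spoken_label]
    if not candidates:
        return None
    return min(candidates, key=lambda c: math.dist(c[1], pointed_xy))

objects = [("cup", (0.10, 0.20)),
           ("cup", (0.60, 0.55)),
           ("block", (0.30, 0.40))]
# "Grasp the cup" plus a gesture near (0.55, 0.50): the second cup,
# being much closer to the pointed location, is selected.
print(fuse_speech_and_gesture("cup", (0.55, 0.50), objects))
```

The real architecture described in the abstract integrates statistical methods, neural networks, and finite state machines; this toy rule only illustrates why combining modalities resolves references that neither speech ("the cup", ambiguous) nor gesture (imprecise pointing) could resolve alone.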