Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives
The importance of depth perception in the interactions that humans have within their nearby space is a well-established fact. Consequently, it is also well known that the availability of good stereo information would ease, and in many cases enable, a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents robots from relying on this kind of cue to visually guide their attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS; Geiger et al., 2010) for computing the disparity map is well suited to a humanoid robotic platform such as the iCub robot; second, we show that, provided with a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings requires much less effort. As a case study we consider the common situation where the robot is asked to focus its attention on a nearby object in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. Indeed, this example paves the way to a variety of other similar applications.
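As an illustration of how little machinery such a disparity-based segmentation needs, here is a minimal Python sketch. It substitutes OpenCV's StereoSGBM matcher for the ELAS algorithm used in the paper (an assumption made purely to keep the example self-contained) and thresholds the disparity map around its maximum to isolate the closest surface; all parameter values are illustrative.

    # Minimal sketch of disparity-based segmentation of the nearest object.
    # NOTE: the paper uses ELAS; StereoSGBM is an OpenCV stand-in chosen here
    # for illustration only, and all parameter values are assumptions.
    import cv2
    import numpy as np

    def segment_nearest(left_gray, right_gray, margin=8):
        """Return a binary mask of the closest surface in the scene."""
        matcher = cv2.StereoSGBM_create(
            minDisparity=0,
            numDisparities=64,   # must be divisible by 16
            blockSize=9,
        )
        # StereoSGBM returns fixed-point disparities scaled by 16.
        disp = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
        valid = disp > 0                      # discard unmatched pixels
        d_max = disp[valid].max()             # largest disparity = closest point
        mask = (disp > d_max - margin) & valid
        # Clean up the mask with a morphological opening.
        kernel = np.ones((5, 5), np.uint8)
        return cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_OPEN, kernel)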
Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues
In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model- or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level, where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions, which we show to be applicable to different object detection mechanisms. In order to demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we can show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach, especially when processing resources are constrained.
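To make the coupling concrete, the following Python sketch shows one way a competitive selection stage can fold top-down biases into bottom-up hypothesis confidences. The multiplicative gain and all names are assumptions chosen for illustration, not the authors' exact formulation.

    # Illustrative sketch of biased competitive selection: bottom-up hypothesis
    # confidences over a retinotopic map are modulated by top-down spatial and
    # feature biases before winners are selected. The multiplicative combination
    # rule is an assumption, not the paper's exact mechanism.
    import numpy as np

    def select_hypotheses(confidence, spatial_bias, feature_bias, top_k=3):
        """confidence, spatial_bias, feature_bias: same-shape arrays over the
        hypothesis map; returns the indices of the top_k modulated hypotheses."""
        # Multiplicative gain: top-down signals facilitate (>1) or suppress (<1).
        modulated = confidence * spatial_bias * feature_bias
        winners = np.argsort(modulated.ravel())[::-1][:top_k]
        return np.unravel_index(winners, modulated.shape)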
DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics
Robots are still limited to controlled conditions that the robot designer knows in enough detail to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility through the ability to discover the appropriate behavior given either demonstrations or a reward signal that guides exploration in a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define the reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning, while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Besides the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the representations required by the robot, its environment, and the task. We introduce the challenges raised by this cycle and present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture that bootstraps this redescription process stage by stage, builds new state representations with appropriate motivations, and transfers the acquired knowledge across domains, tasks, or even robots. We describe the results obtained so far with this approach and conclude with a discussion of the questions it raises in neuroscience.
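A schematic sketch of how the three cycles nest at different time scales follows; all class and method names here are hypothetical, so this is a reading aid for the timescale separation, not a reimplementation of DREAM.

    # Schematic sketch of the three interacting cycles described above.
    # agent and env are hypothetical objects; only the nesting matters.
    def run(agent, env, n_redescriptions=3, n_learning_episodes=100):
        for _ in range(n_redescriptions):          # slowest: redescription cycle
            # Build or adapt state/action representations from past experience.
            representation = agent.redescribe(agent.experience_buffer)
            for _ in range(n_learning_episodes):   # slower: learning cycle
                policy = agent.learn(representation)
                state, done = env.reset(), False
                while not done:                    # fastest: operational cycle
                    action = policy(state)
                    state, reward, done = env.step(action)
                    agent.experience_buffer.append((state, action, reward))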
Perceptual abstraction and attention
This is a report on the preliminary achievements of WP4 of the IM-CleVeR project on abstraction for cumulative learning, in particular directed at: (1) producing algorithms to develop abstraction features under top-down action influence; (2) algorithms for supporting detection of change in motion pictures; (3) developing attention and vergence control on the basis of locally computed rewards; (4) searching for abstract representations suitable for the LCAS framework; (5) developing predictors based on information theory to support novelty detection. The report is organized around these 5 tasks that are part of WP4. We provide a concise description of the work done for each task by the partners.
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot's behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
10302 Abstracts Collection -- Learning paradigms in dynamic environments
From 25.07. to 30.07.2010, the Dagstuhl Seminar 10302 "Learning paradigms in dynamic environments" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.
Object detection and recognition with event driven cameras
This thesis presents the study, analysis, and implementation of algorithms to perform object detection and recognition using an event-based camera. This sensor represents a novel paradigm which opens a wide range of possibilities for future developments of computer vision. In particular, it produces a fast, compressed, illumination-invariant output, which can be exploited for robotic tasks where fast dynamics and significant illumination changes are frequent. The experiments are carried out on the neuromorphic version of the iCub humanoid platform. The robot is equipped with a novel dual camera setup mounted directly in the robot's eyes, used to generate data with a moving camera. The motion causes the presence of background clutter in the event stream.
In such a scenario, the detection problem has been addressed with an attention mechanism specifically designed to respond to the presence of objects while discarding clutter. The proposed implementation takes advantage of the nature of the data to simplify the original proto-object saliency model which inspired this work.
Subsequently, the recognition task was first tackled with a feasibility study, to demonstrate that the event stream carries sufficient information to classify objects, and then with the implementation of a spiking neural network. The feasibility study provides the proof of concept that events are informative enough in the context of object classification, whereas the spiking implementation improves the results by employing an architecture specifically designed to process event data. The spiking network was trained with a three-factor local learning rule which overcomes the weight-transport, update-locking, and non-locality problems.
The presented results prove that both detection and classification can be carried out in the target application using the event data.
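For readers unfamiliar with three-factor rules, the following Python sketch shows the generic shape of such an update: two local factors (pre- and post-synaptic activity) accumulate in an eligibility trace, and a third, global modulatory factor gates the actual weight change. The trace dynamics and constants here are generic assumptions, not the thesis's specific rule.

    # Minimal sketch of a three-factor local weight update, the class of rule
    # used to avoid weight transport, update locking, and non-locality.
    # Trace dynamics and constants are generic assumptions for illustration.
    import numpy as np

    def three_factor_update(w, pre_spikes, post_spikes, third_factor,
                            trace, tau=20.0, lr=1e-3):
        """w: (n_post, n_pre) weights; pre/post_spikes: binary spike vectors;
        third_factor: scalar modulatory signal (e.g. a reward or error term);
        trace: (n_post, n_pre) eligibility trace, updated in place."""
        # Factors 1 & 2: local pre/post coincidence accumulates in the trace.
        trace *= np.exp(-1.0 / tau)                   # leaky decay per time step
        trace += np.outer(post_spikes, pre_spikes)    # Hebbian coincidence term
        # Factor 3: the global modulatory signal gates the actual weight change.
        w += lr * third_factor * trace
        return w, trace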