519 research outputs found

    Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

    The importance of depth perception in the interactions that humans have within their nearby space is a well-established fact. It is also well known that exploiting good stereo information would ease and, in many cases, enable a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents relying on this kind of cue to visually guide robots' attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS) by A. Geiger et al. (2010) for computing the disparity map is well suited to a humanoid robotic platform such as the iCub robot; second, we show how, given a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings can require much less effort. As a case study, we consider the common situation where the robot is asked to focus its attention on one nearby object in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. This example paves the way to a variety of other similar applications.

    Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues

    In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model- or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level, where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions, which we show to be applicable to different object detection mechanisms. To demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach, especially when processing resources are constrained.

    DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics

    Robots are still limited to controlled conditions that the robot designer knows in enough detail to endow the robot with the appropriate models or behaviors. Learning algorithms add some flexibility through the ability to discover the appropriate behavior given either some demonstrations or a reward that guides exploration with a reinforcement learning algorithm. Reinforcement learning algorithms rely on the definition of state and action spaces that define reachable behaviors. Their adaptation capability critically depends on the representations of these spaces: small and discrete spaces result in fast learning, while large and continuous spaces are challenging and either require a long training period or prevent the robot from converging to an appropriate behavior. Besides the operational cycle of policy execution and the learning cycle, which works at a slower time scale to acquire new policies, we introduce the redescription cycle, a third cycle working at an even slower time scale to generate or adapt the required representations to the robot, its environment and the task. We introduce the challenges raised by this cycle and we present DREAM (Deferred Restructuring of Experience in Autonomous Machines), a developmental cognitive architecture to bootstrap this redescription process stage by stage, build new state representations with appropriate motivations, and transfer the acquired knowledge across domains or tasks or even across robots. We describe results obtained so far with this approach and end with a discussion of the questions it raises in Neuroscience.

    Perceptual abstraction and attention

    This is a report on the preliminary achievements of WP4 of the IM-CleVeR project on abstraction for cumulative learning, in particular directed to: (1) producing algorithms to develop abstraction features under top-down action influence; (2) producing algorithms to support detection of change in motion pictures; (3) developing attention and vergence control on the basis of locally computed rewards; (4) searching for abstract representations suitable for the LCAS framework; (5) developing predictors based on information theory to support novelty detection. The report is organized around these five tasks that are part of WP4. We provide a synthetic description of the work done for each task by the partners.

    DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self

    This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

    10302 Abstracts Collection – Learning paradigms in dynamic environments

    From 25.07. to 30.07.2010, the Dagstuhl Seminar 10302 "Learning paradigms in dynamic environments" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Object detection and recognition with event driven cameras

    This thesis presents the study, analysis and implementation of algorithms to perform object detection and recognition using an event-based camera. This sensor represents a novel paradigm which opens a wide range of possibilities for future developments of computer vision. In particular, it produces a fast, compressed, illumination-invariant output, which can be exploited for robotic tasks where fast dynamics and significant illumination changes are frequent. The experiments are carried out on the neuromorphic version of the iCub humanoid platform. The robot is equipped with a novel dual camera setup mounted directly in the robot's eyes, used to generate data with a moving camera. The motion causes the presence of background clutter in the event stream. In such a scenario, the detection problem has been addressed with an attention mechanism specifically designed to respond to the presence of objects while discarding clutter. The proposed implementation takes advantage of the nature of the data to simplify the original proto-object saliency model which inspired this work. Subsequently, the recognition task was first tackled with a feasibility study to demonstrate that the event stream carries sufficient information to classify objects, and then with the implementation of a spiking neural network. The feasibility study provides the proof of concept that events are informative enough in the context of object classification, whereas the spiking implementation improves the results by employing an architecture specifically designed to process event data. The spiking network was trained with a three-factor local learning rule which overcomes the weight transport, update locking and non-locality problems. The presented results show that both detection and classification can be carried out in the target application using the event data.