29,716 research outputs found
Covert Perceptual Capability Development
In this paper, we propose a model to develop
robotsâ covert perceptual capability using reinforcement learning. Covert perceptual behavior is treated as action selected by a motivational system. We apply this model to
vision-based navigation. The goal is to enable
a robot to learn road boundary type. Instead
of dealing with problems in controlled environments with a low-dimensional state space,
we test the model on images captured in non-stationary environments. Incremental Hierarchical Discriminant Regression is used to
generate states on the fly. Its coarse-to-fine
tree structure guarantees real-time retrieval
in high-dimensional state space. K Nearest-Neighbor strategy is adopted to further reduce training time complexity
SOVEREIGN: A Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-Oriented Navigation System
Both animals and mobile robots, or animats, need adaptive control systems to guide their movements through a novel environment. Such control systems need reactive mechanisms for exploration, and learned plans to efficiently reach goal objects once the environment is familiar. How reactive and planned behaviors interact together in real time, and arc released at the appropriate times, during autonomous navigation remains a major unsolved problern. This work presents an end-to-end model to address this problem, named SOVEREIGN: A Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation system. The model comprises several interacting subsystems, governed by systems of nonlinear differential equations. As the animat explores the environment, a vision module processes visual inputs using networks that arc sensitive to visual form and motion. Targets processed within the visual form system arc categorized by real-time incremental learning. Simultaneously, visual target position is computed with respect to the animat's body. Estimates of target position activate a motor system to initiate approach movements toward the target. Motion cues from animat locomotion can elicit orienting head or camera movements to bring a never target into view. Approach and orienting movements arc alternately performed during animat navigation. Cumulative estimates of each movement, based on both visual and proprioceptive cues, arc stored within a motor working memory. Sensory cues are stored in a parallel sensory working memory. These working memories trigger learning of sensory and motor sequence chunks, which together control planned movements. Effective chunk combinations arc selectively enhanced via reinforcement learning when the animat is rewarded. The planning chunks effect a gradual transition from reactive to planned behavior. The model can read-out different motor sequences under different motivational states and learns more efficient paths to rewarded goals as exploration proceeds. Several volitional signals automatically gate the interactions between model subsystems at appropriate times. A 3-D visual simulation environment reproduces the animat's sensory experiences as it moves through a simplified spatial environment. The SOVEREIGN model exhibits robust goal-oriented learning of sequential motor behaviors. Its biomimctic structure explicates a number of brain processes which are involved in spatial navigation.Advanced Research Projects Agency (N00014-92-J-4015); Air Force Office of Scientific Research (F49620-92-J-0225, F49620-01-1-0397); National Science Foundation (IRI 90-24877, SBE-0354378); Office of Naval Research (N00014-91-J-4100, N00014-92-J-1309, N00014-95-1-0657, N00014-01-1-0624); Pacific Sierra Research (PSR 91-6075-2
Developmental Robots - A New Paradigm
It has been proved to be extremely challenging for humans to program a robot to such a sufficient degree that it acts properly in a typical unknown human environment. This is especially true for a humanoid robot due to the very large number of redundant degrees of freedom and a large number of sensors that are required for a humanoid to work safely and effectively in the human environment. How can we address this fundamental problem? Motivated by human mental development from infancy to adulthood, we present a theory, an architecture, and some experimental results showing how to enable a robot to develop its mind automatically, through online, real time interactions with its environment. Humans mentally âraiseâ the robot through ârobot sittingâ and ârobot schoolsâ instead of task-specific robot programming
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, however, this leads to
unnatural blurring and stretching of objects at further distance, due to the
resolution of the camera, limiting applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
Exploiting visual salience for the generation of referring expressions
In this paper we present a novel approach to generating
referring expressions (GRE) that is tailored to a model of the visual context the user is attending to. The approach
integrates a new computational model of visual salience in simulated 3-D environments with Dale and Reiterâs (1995) Incremental Algorithm. The advantage of our GRE framework are: (1) the context set used by the GRE algorithm is dynamically computed by the visual saliency algorithm as a user navigates through a simulation; (2) the integration of visual salience into the generation process means that in some instances underspecified but sufficiently detailed descriptions of the target object are generated that are shorter than those generated by GRE algorithms which focus purely on adjectival and type attributes; (3) the integration of visual saliency into the generation process means that our GRE algorithm will in some instances succeed in generating a description of the target object in situations where GRE algorithms which focus purely on adjectival and type attributes fail
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM)consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved
Mesh-based 3D Textured Urban Mapping
In the era of autonomous driving, urban mapping represents a core step to let
vehicles interact with the urban context. Successful mapping algorithms have
been proposed in the last decade building the map leveraging on data from a
single sensor. The focus of the system presented in this paper is twofold: the
joint estimation of a 3D map from lidar data and images, based on a 3D mesh,
and its texturing. Indeed, even if most surveying vehicles for mapping are
endowed by cameras and lidar, existing mapping algorithms usually rely on
either images or lidar data; moreover both image-based and lidar-based systems
often represent the map as a point cloud, while a continuous textured mesh
representation would be useful for visualization and navigation purposes. In
the proposed framework, we join the accuracy of the 3D lidar data, and the
dense information and appearance carried by the images, in estimating a
visibility consistent map upon the lidar measurements, and refining it
photometrically through the acquired images. We evaluate the proposed framework
against the KITTI dataset and we show the performance improvement with respect
to two state of the art urban mapping algorithms, and two widely used surface
reconstruction algorithms in Computer Graphics.Comment: accepted at iros 201
Characterization of image sets: the Galois Lattice approach
This paper presents a new method for supervised image
classification. One or several landmarks are attached to each class, with the intention of characterizing it and discriminating it from the other classes. The different features, deduced from image primitives, and their relationships with the sets of images are structured and organized into a hierarchy thanks to an original method relying on a mathematical formalism called Galois (or Concept) Lattices. Such lattices allow us to select features as landmarks of specific classes. This paper details the feature selection process and illustrates this through a robotic example in a structured environment. The class of any image is the room from which the image is shot by the robot camera. In the discussion, we compare this approach with decision trees and we give some issues for future research
- âŚ