Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter
Camera viewpoint selection is an important aspect of visual grasp detection,
especially in clutter where many occlusions are present. Where other approaches
use a static camera position or fixed data collection routines, our Multi-View
Picking (MVP) controller uses an active perception approach to choose
informative viewpoints based directly on a distribution of grasp pose estimates
in real time, reducing uncertainty in the grasp poses caused by clutter and
occlusions. In trials of grasping 20 objects from clutter, our MVP controller
achieves 80% grasp success, outperforming a single-viewpoint grasp detector by
12%. We also show that our approach is both more accurate and more efficient
than approaches which consider multiple fixed viewpoints.

Comment: ICRA 2019. Video: https://youtu.be/Vn3vSPKlaEk Code:
https://github.com/dougsm/mvp_gras
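The abstract describes choosing viewpoints that reduce uncertainty in a distribution of grasp pose estimates. As a minimal sketch of that idea (not the paper's actual controller), one could score each candidate viewpoint by the Shannon entropy of its predicted grasp-quality map and move to the most confident one; the function names and the entropy criterion here are assumptions for illustration.

```python
import numpy as np

def grasp_entropy(quality_map):
    """Shannon entropy of a normalized grasp-quality map, used here as a
    hypothetical proxy for uncertainty in the grasp pose distribution."""
    p = quality_map / quality_map.sum()
    p = p[p > 0]                      # ignore zero-probability cells
    return float(-(p * np.log(p)).sum())

def next_best_view(candidate_maps):
    """Pick the candidate viewpoint whose grasp-quality map has the lowest
    entropy, i.e. the most sharply peaked grasp pose distribution."""
    entropies = [grasp_entropy(m) for m in candidate_maps]
    return int(np.argmin(entropies))
```

A uniform quality map (maximal uncertainty) scores worse than a sharply peaked one, so the controller would move toward views that disambiguate the grasp.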
Characteristics of flight simulator visual systems
The physical parameters of the flight simulator visual system that characterize the system and determine its fidelity are identified and defined. The characteristics of visual simulation systems are discussed in terms of the basic categories of spatial, energy, and temporal properties corresponding to the three fundamental quantities of length, mass, and time. Each of these parameters is further addressed in relation to its effect, its appropriate units or descriptors, methods of measurement, and its use or importance to image quality.
Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling
Long-term situation prediction plays a crucial role in the development of
intelligent vehicles. A major challenge still to overcome is the prediction of
complex downtown scenarios with multiple road users, e.g., pedestrians, bikes,
and motor vehicles, interacting with each other. This contribution tackles this
challenge by combining a Bayesian filtering technique for environment
representation, and machine learning as long-term predictor. More specifically,
a dynamic occupancy grid map is utilized as input to a deep convolutional
neural network. This yields the advantage of using spatially distributed
velocity estimates from a single time step for prediction, rather than a raw
data sequence, alleviating common problems dealing with input time series of
multiple sensors. Furthermore, convolutional neural networks have the inherent
characteristic of using context information, enabling the implicit modeling of
road user interaction. Pixel-wise balancing is applied in the loss function
counteracting the extreme imbalance between static and dynamic cells. One of
the major advantages is the unsupervised learning character due to fully
automatic label generation. The presented algorithm is trained and evaluated on
multiple hours of recorded sensor data and compared to a Monte-Carlo simulation.
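The abstract mentions pixel-wise balancing in the loss function to counteract the imbalance between static and dynamic cells. A common way to realize this (the exact scheme and weight value below are assumptions, not taken from the paper) is a per-pixel weighted binary cross-entropy in which dynamic cells receive a larger weight:

```python
import numpy as np

def balanced_pixel_loss(pred, target, w_dynamic=10.0):
    """Pixel-wise weighted binary cross-entropy: dynamic cells (target == 1)
    are up-weighted to counteract the static/dynamic class imbalance.
    The weight value 10.0 is illustrative only."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)         # avoid log(0)
    weights = np.where(target == 1, w_dynamic, 1.0)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    return float((weights * bce).mean())
```

With this weighting, a misclassified dynamic cell contributes more to the loss than an equally misclassified static cell, so the network cannot minimize the loss by simply predicting everything as static.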
Homography-based ground plane detection using a single on-board camera
This study presents a robust method for ground plane detection in vision-based systems with a non-stationary camera. The proposed method is based on the reliable estimation of the homography between ground planes in successive images. This homography is computed using a feature matching approach, which in contrast to classical approaches to on-board motion estimation does not require explicit ego-motion calculation. Instead, a novel homography calculation method based on a linear estimation framework is presented. This framework provides predictions of the ground plane transformation matrix that are dynamically updated with new measurements. The method is specially suited for challenging environments, in particular traffic scenarios, in which the information is scarce and the homography computed from the images is usually inaccurate or erroneous. The proposed estimation framework is able to remove erroneous measurements and to correct those that are inaccurate, hence producing a reliable homography estimate at each instant. It is based on the evaluation of the difference between the predicted and the observed transformations, measured according to the spectral norm of the associated matrix of differences. Moreover, an example is provided on how to use the information extracted from ground plane estimation to achieve object detection and tracking. The method has been successfully demonstrated for the detection of moving vehicles in traffic environments.
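The rejection test described above, comparing predicted and observed homographies via the spectral norm of their difference, can be sketched as follows; the scale normalization and the threshold value are assumptions added for illustration:

```python
import numpy as np

def homography_gate(H_pred, H_obs, tau=0.1):
    """Accept an observed ground-plane homography only if the spectral norm
    (largest singular value) of its difference from the predicted homography
    is below a threshold tau. The threshold value is illustrative."""
    # Homographies are defined up to scale; fix the scale before comparing.
    Hp = H_pred / H_pred[2, 2]
    Ho = H_obs / H_obs[2, 2]
    sigma = np.linalg.norm(Hp - Ho, 2)   # spectral norm of the difference
    return sigma < tau
```

An observed homography close to the prediction passes the gate, while a grossly erroneous one is rejected and can be replaced by the prediction from the linear estimation framework.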
A Neural Model of Visually Guided Steering, Obstacle Avoidance, and Route Selection
A neural model is developed to explain how humans can approach a goal object on foot while steering around obstacles to avoid collisions in a cluttered environment. The model uses optic flow from a 3D virtual reality environment to determine the position of objects based on motion discontinuities, and computes heading direction, or the direction of self-motion, from global optic flow. The cortical representation of heading interacts with the representations of a goal and obstacles such that the goal acts as an attractor of heading, while obstacles act as repellers. In addition, the model maintains fixation on the goal object by generating smooth pursuit eye movements. Eye rotations can distort the optic flow field, complicating heading perception, and the model uses extraretinal signals to correct for this distortion and accurately represent heading. The model explains how motion processing mechanisms in cortical areas MT, MST, and VIP can be used to guide steering. The model quantitatively simulates human psychophysical data about visually-guided steering, obstacle avoidance, and route selection.

Air Force Office of Scientific Research (F4960-01-1-0397); National Geospatial-Intelligence Agency (NMA201-01-1-2016); National Science Foundation (NSF SBE-0354378); Office of Naval Research (N00014-01-1-0624)
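The attractor/repeller interaction described above can be sketched as a simple heading dynamic in which the goal direction pulls the heading toward it while each obstacle pushes it away with a strength that decays with angular distance. This is a generic sketch of that class of model, not the paper's neural implementation; all gains and the decay form are assumptions.

```python
import numpy as np

def steering_rate(heading, goal_dir, obstacle_dirs,
                  k_goal=1.0, k_obs=2.0, decay=1.0):
    """Rate of change of heading (radians): the goal acts as an attractor,
    each obstacle as a repeller whose influence decays with angular
    distance. Gains k_goal, k_obs and decay are illustrative."""
    d_heading = -k_goal * (heading - goal_dir)      # attraction to goal
    for od in obstacle_dirs:
        delta = heading - od
        # repulsion away from the obstacle, fading with angular distance
        d_heading += k_obs * delta * np.exp(-decay * abs(delta))
    return d_heading
```

With no obstacles the heading relaxes exponentially toward the goal; an obstacle near the goal direction reduces or reverses the turning rate, producing a detour route.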
Probabilistic Motion Estimation Based on Temporal Coherence
We develop a theory for the temporal integration of visual motion motivated
by psychophysical experiments. The theory proposes that input data are
temporally grouped and used to predict and estimate the motion flows in the
image sequence. This temporal grouping can be considered a generalization of
the data association techniques used by engineers to study motion sequences.
Our temporal-grouping theory is expressed in terms of the Bayesian
generalization of standard Kalman filtering. To implement the theory we derive
a parallel network which shares some properties of cortical networks. Computer
simulations of this network demonstrate that our theory qualitatively accounts
for psychophysical experiments on motion occlusion and motion outliers.

Comment: 40 pages, 7 figures
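The theory above is expressed as a Bayesian generalization of standard Kalman filtering. For reference, a single predict/update cycle of the standard Kalman filter (the special case the paper generalizes) looks like this; the matrix names follow the usual textbook convention, not the paper's notation:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a standard linear Kalman filter.
    x, P: prior state mean and covariance; z: new measurement;
    F: state transition; H: observation model; Q, R: process and
    measurement noise covariances."""
    # Predict: propagate the state estimate forward in time.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: fuse the prediction with the new measurement.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

In the paper's temporal-grouping view, the data-association step that decides which measurements belong to which motion flow is folded into a Bayesian version of this recursion.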