3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection
Cameras are a crucial exteroceptive sensor for self-driving cars as they are
low-cost and small, provide appearance information about the environment, and
work in various weather conditions. They can be used for multiple purposes such
as visual navigation and obstacle detection. We can use a surround multi-camera
system to cover the full 360-degree field-of-view around the car. In this way,
we avoid blind spots which can otherwise lead to accidents. To minimize the
number of cameras needed for surround perception, we utilize fisheye cameras.
Consequently, standard vision pipelines for 3D mapping, visual localization,
obstacle detection, etc. need to be adapted to take full advantage of the
availability of multiple cameras rather than treat each camera individually. In
addition, processing of fisheye images has to be supported. In this paper, we
describe the camera calibration and subsequent processing pipeline for
multi-fisheye-camera systems developed as part of the V-Charge project. This
project seeks to enable automated valet parking for self-driving cars. Our
pipeline is able to precisely calibrate multi-camera systems, build sparse 3D
maps for visual navigation, visually localize the car with respect to these
maps, generate accurate dense maps, as well as detect obstacles based on
real-time depth map extraction.
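Calibrating a fisheye rig starts from a lens projection model. As a hedged illustration (the V-Charge pipeline's actual model is not specified in this abstract), the widely used equidistant fisheye model maps a ray's off-axis angle linearly to image radius:

```python
import math

def project_equidistant(point_3d, f, cx, cy):
    """Project a 3D point (camera frame, z forward) with the equidistant
    fisheye model: image radius r = f * theta, where theta is the angle
    between the viewing ray and the optical axis."""
    x, y, z = point_3d
    theta = math.atan2(math.hypot(x, y), z)   # angle off the optical axis
    phi = math.atan2(y, x)                    # azimuth around the axis
    r = f * theta                             # equidistant mapping
    return (cx + r * math.cos(phi), cy + r * math.sin(phi))
```

A point on the optical axis lands exactly on the principal point, and, unlike a pinhole model, a ray at 90 degrees off-axis still projects to a finite radius, which is what lets fisheye lenses cover such wide fields of view.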
Bio-inspired speed detection and discrimination
In the field of computer vision, a crucial task is the detection of motion
(also called optical flow extraction). This operation allows analysis such as
3D reconstruction, feature tracking, time-to-collision and novelty detection
among others. Most of the optical flow extraction techniques work within a
finite range of speeds. Usually, the range of detection is extended towards
higher speeds by combining some multiscale information in a serial
architecture. This serial multi-scale approach suffers from the problem of
error propagation related to the number of scales used in the algorithm. On the
other hand, biological experiments show that human motion perception seems to
follow a parallel multiscale scheme. In this work we present a bio-inspired
parallel architecture to perform detection of motion, providing a wide range of
operation and avoiding error propagation associated with the serial
architecture. To test our algorithm, we perform relative error comparisons
between both classical and proposed techniques, showing that the parallel
architecture is able to achieve motion detection with results similar to the
serial approach.
End-to-end Learning of Driving Models from Large-scale Video Datasets
Robust perception-action models should be learned from training data with
diverse visual appearances and realistic behaviors, yet current approaches to
deep visuomotor policy learning have been generally limited to in-situ models
learned from a single vehicle or a simulation environment. We advocate learning
a generic vehicle motion model from large scale crowd-sourced video data, and
develop an end-to-end trainable architecture for learning to predict a
distribution over future vehicle egomotion from instantaneous monocular camera
observations and previous vehicle state. Our model incorporates a novel
FCN-LSTM architecture, which can be learned from large-scale crowd-sourced
vehicle action data, and leverages available scene segmentation side tasks to
improve performance under a privileged learning paradigm.
Comment: camera ready for CVPR201
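The output side of such a model can be sketched as follows, assuming a discretized action space (the `egomotion_distribution` helper, its weights, and the action bins are all hypothetical; the real model uses an FCN-LSTM rather than this linear toy):

```python
import math

def egomotion_distribution(visual_feat, prev_state, weights, actions):
    """Toy stand-in for a motion-prediction head: fuse a visual feature
    vector with the previous vehicle state, score each discrete motion
    bin linearly, and normalize with a softmax into a distribution."""
    fused = list(visual_feat) + list(prev_state)      # simple concatenation
    logits = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    m = max(logits)                                   # stabilize the softmax
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return {a: e / z for a, e in zip(actions, exps)}
```

Predicting a full distribution rather than a single action lets the model express uncertainty, which matters when crowd-sourced training data contains many plausible behaviors for the same scene.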
Attack on the clones: managing player perceptions of visual variety and believability in video game crowds
Crowds of non-player characters are increasingly common in contemporary video games. It is often the case that individual models are re-used, lowering visual variety in the crowd and potentially affecting realism and believability. This paper explores a number of approaches to increase visual diversity in large game crowds, and discusses a procedural solution for generating diverse non-player character models. This is evaluated using mixed methods, including a “clone spotting” activity and measurement of impact on computational overheads, in order to present a multi-faceted and adjustable solution to increase believability and variety in video game crowds.
PAMPC: Perception-Aware Model Predictive Control for Quadrotors
We present the first perception-aware model predictive control framework for
quadrotors that unifies control and planning with respect to action and
perception objectives. Our framework leverages numerical optimization to
compute trajectories that satisfy the system dynamics and require control
inputs within the limits of the platform. Simultaneously, it optimizes
perception objectives for robust and reliable sensing by maximizing the
visibility of a point of interest and minimizing its velocity in the image
plane. Considering both perception and action objectives for motion planning
and control is challenging due to the possible conflicts arising from their
respective requirements. For example, for a quadrotor to track a reference
trajectory, it needs to rotate to align its thrust with the direction of the
desired acceleration. However, the perception objective might require
minimizing such rotation to maximize the visibility of a point of interest. A
model-based optimization framework, able to consider both perception and action
objectives and couple them through the system dynamics, is therefore necessary.
Our perception-aware model predictive control framework works in a
receding-horizon fashion by iteratively solving a non-linear optimization
problem. It is capable of running in real-time, fully onboard our lightweight,
small-scale quadrotor using a low-power ARM computer, together with a
visual-inertial odometry pipeline. We validate our approach in experiments
demonstrating (I) the contradiction between perception and action objectives,
and (II) improved behavior in extremely challenging lighting conditions.
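The receding-horizon pattern itself can be sketched on a toy 1D system (a single integrator with exhaustive grid search standing in for the paper's non-linear optimizer; all names and costs here are illustrative):

```python
import itertools

def plan(x0, target, horizon, controls, dt=0.1):
    """Grid-search the best control sequence over the horizon for the toy
    system x_{k+1} = x_k + u_k * dt, penalizing squared distance to the
    target plus a small control-effort term."""
    best_seq, best_cost = None, float('inf')
    for seq in itertools.product(controls, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            x += u * dt
            cost += (x - target) ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq

def receding_horizon(x0, target, steps, horizon=3,
                     controls=(-1.0, 0.0, 1.0), dt=0.1):
    """Receding-horizon loop: at every step, re-plan over the full horizon,
    apply only the first control, then shift the window forward."""
    x = x0
    for _ in range(steps):
        u = plan(x, target, horizon, controls, dt)[0]
        x += u * dt
    return x
```

Re-planning at every step is what lets the real controller react to disturbances and updated perception objectives, at the price of solving an optimization problem in real time on the onboard computer.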