Estimating Sensor Motion from Wide-Field Optical Flow on a Log-Dipolar Sensor
Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for use in estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion from moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20° only) as a model of wide-field biological vision. In the present paper, we show that an algorithm based on a generalization of the log-polar architecture, termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignment of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more faithful analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages using synthetic optical flow maps as well as natural image sequences.
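For readers unfamiliar with log-polar sampling, the minimal sketch below resamples an image onto an exponentially spaced ring/wedge grid. It illustrates plain log-polar sampling only, not the paper's log-dipolar sensor or its Helmholtz-Hodge flow estimation; the function name and parameters are illustrative assumptions.

```python
# Minimal sketch of log-polar sampling (illustrative only, not the paper's code).
import numpy as np

def log_polar_sample(img, n_rings=64, n_wedges=128, r_min=1.0):
    """Resample a grayscale image onto a log-polar grid centred on the image.

    Ring radii grow exponentially from r_min to the largest radius that fits,
    mimicking the foveal-to-peripheral resolution falloff that motivates
    log-polar sensors.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    # Exponentially spaced radii, uniformly spaced angles.
    radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    angles = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    ys = np.clip(np.round(cy + rr * np.sin(aa)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(aa)).astype(int), 0, w - 1)
    return img[ys, xs]  # shape: (n_rings, n_wedges)

# Example: sample a synthetic image.
img = np.random.rand(240, 320)
print(log_polar_sample(img).shape)  # (64, 128)
```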
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
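As a concrete illustration of the working principle described above, the sketch below accumulates a stream of (timestamp, x, y, polarity) events into a signed brightness-change frame. The array layout and function names are illustrative assumptions and do not correspond to any particular camera SDK.

```python
# Hedged sketch: accumulating an asynchronous event stream into a 2D frame.
import numpy as np

# Each event is (t, x, y, polarity) with polarity in {+1, -1}; this layout
# is illustrative, not a specific event-camera driver format.
events = np.array([
    [0.000012, 10, 20, +1],
    [0.000015, 10, 21, -1],
    [0.000020, 11, 20, +1],
])

def events_to_frame(events, width, height):
    """Sum signed brightness-change events per pixel over a time window."""
    frame = np.zeros((height, width), dtype=np.float32)
    for t, x, y, p in events:
        frame[int(y), int(x)] += p
    return frame

frame = events_to_frame(events, width=32, height=32)
print(frame[20, 10])  # 1.0: one positive event at (x=10, y=20)
```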
Event-based Simultaneous Localization and Mapping: A Comprehensive Survey
In recent decades, visual simultaneous localization and mapping (vSLAM) has
gained significant interest in both academia and industry. It estimates camera
motion and reconstructs the environment concurrently using visual sensors on a
moving robot. However, conventional cameras are limited by hardware, including
motion blur and low dynamic range, which can negatively impact performance in
challenging scenarios like high-speed motion and high dynamic range
illumination. Recent studies have demonstrated that event cameras, a new type
of bio-inspired visual sensor, offer advantages such as high temporal
resolution, high dynamic range, low power consumption, and low latency. This paper
presents a timely and comprehensive review of event-based vSLAM algorithms that
exploit the benefits of asynchronous and irregular event streams for
localization and mapping tasks. The review covers the working principle of
event cameras and various event representations for preprocessing event data.
It also categorizes event-based vSLAM methods into four main categories:
feature-based, direct, motion-compensation, and deep learning methods, with
detailed discussions and practical guidance for each approach. Furthermore, the
paper evaluates the state-of-the-art methods on various benchmarks,
highlighting current challenges and future opportunities in this emerging
research area. A public repository will be maintained to keep track of the
rapid developments in this field at
https://github.com/kun150kun/ESLAM-survey.
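One event representation commonly used for preprocessing is the time-binned voxel grid. The sketch below shows one plausible construction with bilinear temporal weighting; the bin count and weighting scheme are illustrative choices, not the survey's prescription.

```python
# Hedged sketch: converting events (t, x, y, polarity) into a voxel grid
# with a fixed number of temporal bins (illustrative preprocessing step).
import numpy as np

def events_to_voxel_grid(events, width, height, n_bins=5):
    """Spread each event's polarity over its two nearest temporal bins."""
    grid = np.zeros((n_bins, height, width), dtype=np.float32)
    t = events[:, 0]
    # Normalise timestamps to [0, n_bins - 1].
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (n_bins - 1)
    for (ti, x, y, p), tn in zip(events, t_norm):
        b0 = int(np.floor(tn))
        w1 = tn - b0
        grid[b0, int(y), int(x)] += p * (1.0 - w1)
        if b0 + 1 < n_bins:
            grid[b0 + 1, int(y), int(x)] += p * w1
    return grid
```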
Pose Graph Optimization for Unsupervised Monocular Visual Odometry
Unsupervised learning based monocular visual odometry (VO) has lately drawn
significant attention for its potential in label-free learning ability and
robustness to camera parameters and environmental variations. However,
partially due to the lack of drift correction techniques, these methods are
still far less accurate than geometric approaches for large-scale odometry
estimation. In this paper, we propose to leverage graph optimization and loop
closure detection to overcome limitations of unsupervised learning based
monocular visual odometry. To this end, we propose a hybrid VO system which
combines an unsupervised monocular VO called NeuralBundler with a pose graph
optimization back-end. NeuralBundler is a neural network architecture that uses
temporal and spatial photometric loss as main supervision and generates a
windowed pose graph consisting of multi-view 6DoF constraints. We propose a novel
pose cycle consistency loss to relieve tension in the windowed pose graph,
leading to improved performance and robustness. In the back-end, a global pose
graph is built from local and loop 6DoF constraints estimated by NeuralBundler
and is optimized over SE(3). Empirical evaluation on the KITTI odometry dataset
demonstrates that 1) NeuralBundler achieves state-of-the-art performance on
unsupervised monocular VO estimation, and 2) our whole approach can achieve
efficient loop closing and show favorable overall translational accuracy
compared to established monocular SLAM systems.
Comment: Accepted to ICRA'201
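The pose cycle consistency idea can be illustrated with a small numerical check: compose estimated relative poses around a cycle in the windowed graph and measure the deviation from identity. The sketch below assumes 4x4 homogeneous SE(3) matrices and generic names; the actual loss used by NeuralBundler may differ.

```python
# Hedged sketch of a pose cycle-consistency error (not NeuralBundler's exact loss).
import numpy as np

def se3_inverse(T):
    """Invert a 4x4 rigid-body transform."""
    R, t = T[:3, :3], T[:3, 3]
    Tinv = np.eye(4)
    Tinv[:3, :3] = R.T
    Tinv[:3, 3] = -R.T @ t
    return Tinv

def cycle_consistency_error(T_01, T_12, T_02):
    """Compare the composed 0->1->2 pose with the direct 0->2 estimate.

    With T_ab mapping points from frame a to frame b, T_02 should equal
    T_12 @ T_01, so inv(T_02) @ T_12 @ T_01 should be near identity.
    """
    T_cycle = se3_inverse(T_02) @ T_12 @ T_01
    rot_err = np.arccos(np.clip((np.trace(T_cycle[:3, :3]) - 1) / 2, -1, 1))
    trans_err = np.linalg.norm(T_cycle[:3, 3])
    return rot_err, trans_err
```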
Computational themes in applications of visual perception
Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/76646/1/AIAA-1987-1674-988.pd