Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz), resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in scenarios that are challenging for traditional cameras,
such as those demanding low latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
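To make the event representation concrete, here is a minimal sketch (not from the survey) that treats each event as a (timestamp, x, y, polarity) tuple and accumulates a short time window of events into a signed image; the Event class and accumulate_events helper are hypothetical illustrations, not a specific camera driver's API.

```python
# Minimal sketch: accumulate an event stream into a signed "event frame"
# over a fixed time window. Field names and the helper are hypothetical;
# real drivers expose similar (t, x, y, polarity) data.
from dataclasses import dataclass
import numpy as np

@dataclass
class Event:
    t: float       # timestamp in seconds (microsecond resolution in practice)
    x: int         # pixel column
    y: int         # pixel row
    polarity: int  # +1 for a brightness increase, -1 for a decrease

def accumulate_events(events, height, width, t_start, t_end):
    """Sum event polarities per pixel inside [t_start, t_end)."""
    frame = np.zeros((height, width), dtype=np.int32)
    for e in events:
        if t_start <= e.t < t_end:
            frame[e.y, e.x] += e.polarity
    return frame

# Example: three synthetic events within a 1 ms window on a 4x4 sensor.
events = [Event(1e-4, 1, 2, +1), Event(3e-4, 1, 2, +1), Event(9e-4, 3, 0, -1)]
print(accumulate_events(events, height=4, width=4, t_start=0.0, t_end=1e-3))
```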
Long-Term Localization using Semantic Cues in Floor Plan Maps
Lifelong localization in a given map is an essential capability for
autonomous service robots. In this paper, we consider the task of long-term
localization in a changing indoor environment given sparse CAD floor plans. The
commonly used pre-built maps from the robot sensors may increase the cost and
time of deployment. Furthermore, their detailed nature requires that they are
updated when significant changes occur. We address the difficulty of
localization when the correspondence between the map and the observations is
low due to the sparsity of the CAD map and the changing environment. To
overcome both challenges, we propose to exploit semantic cues that are commonly
present in human-oriented spaces. These semantic cues can be detected using RGB
cameras by utilizing object detection, and are matched against an
easy-to-update, abstract semantic map. The semantic information is integrated
into a Monte Carlo localization framework using a particle filter that operates
on 2D LiDAR scans and camera data. We provide a long-term localization solution
and a semantic map format for environments that undergo changes to their
interior structure and for which detailed geometric maps are not available. We evaluate
our localization framework on multiple challenging indoor scenarios in an
office environment, taken weeks apart. The experiments suggest that our
approach is robust to structural changes and can run on an onboard computer. We
released the open source implementation of our approach written in C++ together
with a ROS wrapper.
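As an illustration of how semantic cues can be fused into a particle filter, the sketch below weights particle poses by how well camera detections of mapped object classes agree with an abstract semantic map. All names, the map format, and the Gaussian measurement model are assumptions made for illustration, not the authors' C++/ROS implementation, and in a full system this weight would be combined with the LiDAR likelihood and then normalized.

```python
# Hypothetical sketch: semantic weighting step of Monte Carlo localization.
import math
import random

semantic_map = {                 # object class -> known (x, y) positions
    "door": [(2.0, 1.0), (5.0, 1.0)],
    "sign": [(3.5, 0.2)],
}

def semantic_likelihood(particle, detections, sigma=0.5):
    """Likelihood of detections given a particle pose (x, y, theta).

    Each detection is (class, range, bearing) measured from the camera.
    """
    px, py, pth = particle
    w = 1.0
    for cls, rng, bearing in detections:
        ang = pth + bearing                      # bearing is relative to heading
        ox, oy = px + rng * math.cos(ang), py + rng * math.sin(ang)
        # distance to the closest mapped instance of this class
        d = min(math.hypot(ox - mx, oy - my) for mx, my in semantic_map[cls])
        w *= math.exp(-0.5 * (d / sigma) ** 2)
    return w

particles = [(random.uniform(0, 6), random.uniform(0, 3), 0.0) for _ in range(100)]
detections = [("door", 1.2, 0.1)]                # one detected door
weights = [semantic_likelihood(p, detections) for p in particles]
```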
Real-time 3D reconstruction of non-rigid shapes with a single moving camera
This paper describes a real-time sequential method to simultaneously recover the camera motion and the 3D shape of deformable objects from a calibrated monocular video. For this purpose, we consider the Navier-Cauchy equations used in 3D linear elasticity, solved by finite elements, to model the time-varying shape per frame. These equations are embedded in an extended Kalman filter, resulting in a sequential Bayesian estimation approach. We represent the shape, with unknown material properties, as a combination of elastic elements whose nodal points correspond to salient points in the image. The global rigidity of the shape is encoded by a stiffness matrix, computed after assembling each of these elements. With this piecewise model, we can linearly relate the 3D displacements to the 3D acting forces that cause the object deformation, which are assumed to be normally distributed. While standard finite-element-method techniques require imposing boundary conditions to solve the resulting linear system, in this work we eliminate this requirement by modeling the compliance matrix with a generalized pseudoinverse that enforces a pre-fixed rank. Our framework also ensures surface continuity without the need for a post-processing step to stitch all the piecewise reconstructions into a global smooth shape. We present experimental results using both synthetic and real videos for scenarios ranging from isometric to elastic deformations. We also show the consistency of the estimation with respect to 3D ground-truth data, include several experiments assessing robustness against artifacts, and finally provide an experimental validation of our performance in real time at frame rate for small maps.
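The rank-constrained pseudoinverse idea can be illustrated with a small sketch (an assumption, not the authors' code): given an assembled stiffness matrix K, a pseudoinverse that keeps only a pre-fixed number of singular values yields a compliance-like matrix relating acting forces to displacements without imposing boundary conditions. The toy K below is the stiffness of a free-free 1D bar, which is singular; truncating to rank 2 discards its rigid-body null space.

```python
# Hypothetical sketch: rank-constrained pseudoinverse of a stiffness matrix.
import numpy as np

def rank_constrained_pseudoinverse(K, rank):
    """Pseudoinverse of K keeping only the `rank` largest singular values."""
    U, s, Vt = np.linalg.svd(K)
    s_inv = np.zeros_like(s)
    s_inv[:rank] = 1.0 / s[:rank]
    return Vt.T @ np.diag(s_inv) @ U.T

# Assembled stiffness of a free-free 3-node 1D bar (singular: rank 2 of 3).
K = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
C = rank_constrained_pseudoinverse(K, rank=2)   # compliance-like matrix
f = np.array([0.5, 0.0, -0.5])                  # self-equilibrated nodal forces
u = C @ f                                       # displacements u = C f
print(u)
```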