
    Estimating Depth from RGB and Sparse Sensing

    We present a deep model that can accurately produce dense depth maps given an RGB image with known depth at a very sparse set of pixels. The model works simultaneously for both indoor/outdoor scenes and produces state-of-the-art dense depth maps at nearly real-time speeds on both the NYUv2 and KITTI datasets. We surpass the state-of-the-art for monocular depth estimation even with depth values for only 1 out of every ~10000 image pixels, and we outperform other sparse-to-dense depth methods at all sparsity levels. With depth values for 1/256 of the image pixels, we achieve a mean absolute error of less than 1% of actual depth on indoor scenes, comparable to the performance of consumer-grade depth sensor hardware. Our experiments demonstrate that it would indeed be possible to efficiently transform sparse depth measurements obtained using, e.g., lower-power depth sensors or SLAM systems into high-quality dense depth maps.
    Comment: European Conference on Computer Vision (ECCV) 2018. Updated to camera-ready version with additional experiment
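
    As a rough illustration of the sparsity setting described in this abstract (and not of the authors' network), the sketch below simulates a 1/256-density sparse depth input from ground truth and computes the mean absolute error as a fraction of true depth, the metric quoted above; all names and the toy data are placeholders.

        import numpy as np

        def sample_sparse_depth(depth, keep_fraction=1.0 / 256.0, seed=0):
            """Keep a random subset of ground-truth depth pixels, zero elsewhere."""
            rng = np.random.default_rng(seed)
            mask = rng.random(depth.shape) < keep_fraction
            return depth * mask, mask

        def mean_abs_rel_error(pred, gt):
            """Mean absolute error expressed as a fraction of the true depth."""
            valid = gt > 0
            return np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid])

        # Toy usage: a fake 480x640 indoor depth map in metres.
        gt = np.random.uniform(0.5, 10.0, size=(480, 640))
        sparse_depth, mask = sample_sparse_depth(gt)
        # A model like the one described above would take (rgb, sparse_depth, mask)
        # and predict a dense map; here we only sanity-check the metric.
        print(mean_abs_rel_error(gt * 1.01, gt))  # 0.01, i.e. 1% of actual depth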

    Pop-up SLAM: Semantic Monocular Plane SLAM for Low-texture Environments

    Existing simultaneous localization and mapping (SLAM) algorithms are not robust in challenging low-texture environments because there are only a few salient features. The resulting sparse or semi-dense map also conveys little information for motion planning. Though some works utilize planes or scene layout for dense map regularization, they require decent state estimation from other sources. In this paper, we propose a real-time monocular plane SLAM system to demonstrate that scene understanding could improve both state estimation and dense mapping, especially in low-texture environments. The plane measurements come from a pop-up 3D plane model applied to each single image. We also combine planes with point-based SLAM to improve robustness. On a public TUM dataset, our algorithm generates a dense semantic 3D model with a pixel depth error of 6.2 cm while existing SLAM algorithms fail. On a 60 m long dataset with loops, our method creates a much better 3D model with a state estimation error of 0.67%.
    Comment: International Conference on Intelligent Robots and Systems (IROS) 201

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle to the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
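
    To make the event representation above concrete, here is a minimal sketch (not taken from the survey) that accumulates a stream of (t, x, y, polarity) events into a signed event-count image over a time window; the array layout and names are assumptions.

        import numpy as np

        def events_to_frame(events, height, width, t_start, t_end):
            """Accumulate events into a signed image: +1 per ON event, -1 per OFF event.

            `events` is an (N, 4) array with columns (t, x, y, polarity), polarity
            being +1 or -1, matching the brightness-change description above.
            """
            frame = np.zeros((height, width), dtype=np.int32)
            t = events[:, 0]
            x = events[:, 1].astype(int)
            y = events[:, 2].astype(int)
            p = events[:, 3].astype(int)
            keep = (t >= t_start) & (t < t_end)
            np.add.at(frame, (y[keep], x[keep]), p[keep])
            return frame

        # Toy stream of three events with microsecond timestamps.
        events = np.array([[10.0, 5, 7, +1],
                           [20.0, 5, 7, -1],
                           [30.0, 2, 3, +1]])
        # Only the first event falls in [0, 15), so the frame holds +1 at (y=7, x=5).
        print(events_to_frame(events, height=10, width=10, t_start=0.0, t_end=15.0))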

    Estimating Sensor Motion from Wide-Field Optical Flow on a Log-Dipolar Sensor

    Log-polar image architectures, motivated by the structure of the human visual field, have long been investigated in computer vision for use in estimating motion parameters from an optical flow vector field. Practical problems with this approach have been: (i) dependence on an assumed alignment of the visual and motion axes; (ii) sensitivity to occlusion from moving and stationary objects in the central visual field, where much of the numerical sensitivity is concentrated; and (iii) inaccuracy of the log-polar architecture (which is an approximation to the central 20°) for wide-field biological vision. In the present paper, we show that an algorithm based on a generalization of the log-polar architecture, termed the log-dipolar sensor, provides a large improvement in performance relative to the usual log-polar sampling. Specifically, our algorithm: (i) is tolerant of large misalignment of the optical and motion axes; (ii) is insensitive to significant occlusion by objects of unknown motion; and (iii) represents a more correct analogy to the wide-field structure of human vision. Using the Helmholtz-Hodge decomposition to estimate the optical flow vector field on a log-dipolar sensor, we demonstrate these advantages using synthetic optical flow maps as well as natural image sequences.
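
    The Helmholtz-Hodge decomposition used above splits a flow field into a curl-free and a divergence-free part. The sketch below does this for a flow sampled on an ordinary periodic grid via the FFT; it only illustrates the decomposition itself, not the log-dipolar sampling or the motion estimation of the paper, and the helper name is ours.

        import numpy as np

        def helmholtz_hodge(u, v):
            """Split a periodic 2D flow (u, v) into curl-free and divergence-free parts."""
            h, w = u.shape
            kx = np.fft.fftfreq(w).reshape(1, w)
            ky = np.fft.fftfreq(h).reshape(h, 1)
            k2 = kx**2 + ky**2
            k2[0, 0] = 1.0  # avoid dividing by zero at the mean (harmonic) mode
            U, V = np.fft.fft2(u), np.fft.fft2(v)
            div_hat = kx * U + ky * V  # divergence in Fourier space (up to a factor i)
            u_cf = np.real(np.fft.ifft2(kx * div_hat / k2))  # curl-free (gradient) part
            v_cf = np.real(np.fft.ifft2(ky * div_hat / k2))
            return (u_cf, v_cf), (u - u_cf, v - v_cf)  # second pair is divergence-free

        # Toy usage: this periodic, expansion-like flow is a pure gradient field,
        # so its divergence-free component should vanish up to round-off.
        ys, xs = np.mgrid[0:64, 0:64]
        u = np.sin(2 * np.pi * (xs - 32) / 64)
        v = np.sin(2 * np.pi * (ys - 32) / 64)
        (cf_u, cf_v), (df_u, df_v) = helmholtz_hodge(u, v)
        print(np.abs(df_u).max(), np.abs(df_v).max())  # both near zero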

    Smear correction of highly-variable, frame-transfer-CCD images with application to polarimetry

    Image smear, produced by the shutter-less operation of frame-transfer CCD detectors, can be detrimental for many imaging applications. Existing algorithms used to numerically remove smear do not contemplate cases where intensity levels change considerably between consecutive frame exposures. In this report we reformulate the smearing model to include specific variations of the sensor illumination. The corresponding desmearing expression and its noise properties are also presented and demonstrated in the context of fast imaging polarimetry.
    Comment: Article accepted for publication in Applied Optics on 08 Jun 201
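
    For readers unfamiliar with the effect described above, the sketch below implements the textbook constant-illumination frame-transfer smear model and its inversion; this is precisely the simple case that, per the abstract, breaks down when illumination changes between exposures, and the timing parameters and names are illustrative only.

        import numpy as np

        def add_smear(image, t_row, t_exp):
            """Constant-illumination frame-transfer smear: while each row is shifted
            towards the storage area (t_row per shift), every pixel keeps integrating
            light from its entire column."""
            col_sum = image.sum(axis=0, keepdims=True)
            return image + (t_row / t_exp) * col_sum

        def remove_smear(smeared, t_row, t_exp):
            """Invert the model above: recover the true column sums from the smeared
            frame, then subtract the smear term."""
            n_rows = smeared.shape[0]
            col_sum = smeared.sum(axis=0, keepdims=True) / (1.0 + n_rows * t_row / t_exp)
            return smeared - (t_row / t_exp) * col_sum

        # Toy usage: with constant illumination the original frame is recovered exactly.
        img = np.random.uniform(0.0, 100.0, size=(64, 64))
        smeared = add_smear(img, t_row=1e-5, t_exp=1e-2)
        print(np.allclose(remove_smear(smeared, t_row=1e-5, t_exp=1e-2), img))  # True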

    Extended Object Tracking: Introduction, Overview and Applications

    This article provides an elaborate overview of current research in extended object tracking. We provide a clear definition of the extended object tracking problem and discuss its delimitation from other types of object tracking. Next, different aspects of extended object modelling are extensively discussed. Subsequently, we give a tutorial introduction to two basic and widely used extended object tracking approaches: the random matrix approach and the Kalman filter-based approach for star-convex shapes. The next part treats the tracking of multiple extended objects and elaborates on how the large number of feasible association hypotheses can be tackled using both Random Finite Set (RFS) and non-RFS multi-object trackers. The article concludes with a summary of current applications, where four example applications involving camera, X-band radar, light detection and ranging (lidar), and red-green-blue-depth (RGB-D) sensors are highlighted.
    Comment: 30 pages, 19 figures

    Rule Of Thumb: Deep derotation for improved fingertip detection

    We investigate a novel global orientation regression approach for articulated objects using a deep convolutional neural network. This is integrated with an in-plane image derotation scheme, DeROT, to tackle the problem of per-frame fingertip detection in depth images. The method reduces the complexity of learning in the space of articulated poses, which is demonstrated by using two distinct state-of-the-art learning-based hand pose estimation methods applied to fingertip detection. Significant classification improvements are shown over the baseline implementation. Our framework involves no tracking, kinematic constraints or explicit prior model of the articulated object in hand. To support our approach we also describe a new pipeline for high-accuracy magnetic annotation and labeling of objects imaged by a depth camera.
    Comment: To be published in proceedings of BMVC 201
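
    As a small illustration of the derotation idea (not the DeROT implementation itself), one might rotate each depth frame by a regressed in-plane orientation so that the fingertip detector always sees a canonically oriented hand; the angle below is just a placeholder for the network's output.

        import numpy as np
        from scipy.ndimage import rotate

        def derotate(depth_frame, predicted_angle_deg):
            """Cancel the predicted in-plane orientation by rotating the frame back
            about the image centre (bilinear interpolation, edge values replicated)."""
            return rotate(depth_frame, angle=-predicted_angle_deg,
                          reshape=False, order=1, mode="nearest")

        # Toy usage with a synthetic depth frame and a made-up regressed angle.
        frame = np.random.uniform(0.3, 1.2, size=(240, 320)).astype(np.float32)
        canonical = derotate(frame, predicted_angle_deg=35.0)
        # A detector trained on derotated frames would now be applied to `canonical`.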

    Enhancing Compressed Sensing 4D Photoacoustic Tomography by Simultaneous Motion Estimation

    A crucial limitation of current high-resolution 3D photoacoustic tomography (PAT) devices that employ sequential scanning is their long acquisition time. In previous work, we demonstrated how to use compressed sensing techniques to improve upon this: images with good spatial resolution and contrast can be obtained from suitably sub-sampled PAT data acquired by novel acoustic scanning systems if sparsity-constrained image reconstruction techniques such as total variation regularization are used. Now, we show how a further increase of image quality can be achieved for imaging dynamic processes in living tissue (4D PAT). The key idea is to exploit the additional temporal redundancy of the data by coupling the previously used spatial image reconstruction models with sparsity-constrained motion estimation models. While simulated data from a two-dimensional numerical phantom will be used to illustrate the main properties of this recently developed joint-image-reconstruction-and-motion-estimation framework, measured data from a dynamic experimental phantom will also be used to demonstrate its potential for challenging, large-scale, real-world, three-dimensional scenarios. The latter only becomes feasible if a carefully designed combination of tailored optimization schemes is employed, which we describe and examine in more detail.
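
    One generic way to write such a coupled problem, given here only for orientation and not as the authors' exact functional, is a joint variational objective over the frame images u_t and motion fields v_t:

        \min_{\{u_t\},\{v_t\}} \sum_t \Big( \tfrac{1}{2}\,\| A u_t - f_t \|_2^2 + \alpha\,\mathrm{TV}(u_t) \Big)
        + \sum_t \Big( \beta\,\| \nabla v_t \|_1 + \gamma\,\| u_{t+1} - u_t(\cdot + v_t) \|_1 \Big)

    Here A is the sub-sampled PAT forward operator, f_t the data acquired for frame t, and TV the total variation regularizer mentioned above; the last term couples reconstruction and motion estimation by penalizing deviations from motion-compensated temporal consistency. The specific norms and weights (alpha, beta, gamma) are illustrative, not the paper's.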