1,122 research outputs found
Photometric Depth Super-Resolution
This study explores the use of photometric techniques (shape-from-shading and
uncalibrated photometric stereo) for upsampling the low-resolution depth map
from an RGB-D sensor to the higher resolution of the companion RGB image. A
single-shot variational approach is first put forward, which is effective as
long as the target's reflectance is piecewise-constant. It is then shown that
this dependency upon a specific reflectance model can be relaxed by focusing on
a specific class of objects (e.g., faces), and delegate reflectance estimation
to a deep neural network. A multi-shot strategy based on randomly varying
lighting conditions is eventually discussed. It requires no training or prior
on the reflectance, yet this comes at the price of a dedicated acquisition
setup. Both quantitative and qualitative evaluations illustrate the
effectiveness of the proposed methods on synthetic and real-world scenarios.Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(T-PAMI), 2019. First three authors contribute equall
A L1-TV Algorithm for Robust Perspective Photometric Stereo with Spatially-Varying Lightings
International audienceWe tackle the problem of perspective 3D-reconstruction of Lambertian surfaces through photometric stereo, in the presence of outliers to Lambertâs law, depth discontinuities, and unknown spatially-varying lightings. To this purpose, we introduce a robust L1-TV variational formulation of the recovery problem where the shape itself is the main unknown, which naturally enforces integrability and permits to avoid integrating the normal field
Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction
Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view
images is a fundamental yet active research area in computer vision. Despite
the steady progress in multi-view stereo reconstruction, most existing methods
are still limited in recovering fine-scale details and sharp features while
suppressing noises, and may fail in reconstructing regions with few textures.
To address these limitations, this paper presents a Detail-preserving and
Content-aware Variational (DCV) multi-view stereo method, which reconstructs
the 3D surface by alternating between reprojection error minimization and mesh
denoising. In reprojection error minimization, we propose a novel inter-image
similarity measure, which is effective to preserve fine-scale details of the
reconstructed surface and builds a connection between guided image filtering
and image registration. In mesh denoising, we propose a content-aware
-minimization algorithm by adaptively estimating the value and
regularization parameters based on the current input. It is much more promising
in suppressing noise while preserving sharp features than conventional
isotropic mesh smoothing. Experimental results on benchmark datasets
demonstrate that our DCV method is capable of recovering more surface details,
and obtains cleaner and more accurate reconstructions than state-of-the-art
methods. In particular, our method achieves the best results among all
published methods on the Middlebury dino ring and dino sparse ring datasets in
terms of both completeness and accuracy.Comment: 14 pages,16 figures. Submitted to IEEE Transaction on image
processin
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Keyframe-based monocular SLAM: design, survey, and future directions
Extensive research in the field of monocular SLAM for the past fifteen years
has yielded workable systems that found their way into various applications in
robotics and augmented reality. Although filter-based monocular SLAM systems
were common at some time, the more efficient keyframe-based solutions are
becoming the de facto methodology for building a monocular SLAM system. The
objective of this paper is threefold: first, the paper serves as a guideline
for people seeking to design their own monocular SLAM according to specific
environmental constraints. Second, it presents a survey that covers the various
keyframe-based monocular SLAM systems in the literature, detailing the
components of their implementation, and critically assessing the specific
strategies made in each proposed solution. Third, the paper provides insight
into the direction of future research in this field, to address the major
limitations still facing monocular SLAM; namely, in the issues of illumination
changes, initialization, highly dynamic motion, poorly textured scenes,
repetitive textures, map maintenance, and failure recovery
Magnetic-Visual Sensor Fusion-based Dense 3D Reconstruction and Localization for Endoscopic Capsule Robots
Reliable and real-time 3D reconstruction and localization functionality is a
crucial prerequisite for the navigation of actively controlled capsule
endoscopic robots as an emerging, minimally invasive diagnostic and therapeutic
technology for use in the gastrointestinal (GI) tract. In this study, we
propose a fully dense, non-rigidly deformable, strictly real-time,
intraoperative map fusion approach for actively controlled endoscopic capsule
robot applications which combines magnetic and vision-based localization, with
non-rigid deformations based frame-to-model map fusion. The performance of the
proposed method is demonstrated using four different ex-vivo porcine stomach
models. Across different trajectories of varying speed and complexity, and four
different endoscopic cameras, the root mean square surface reconstruction
errors 1.58 to 2.17 cm.Comment: submitted to IROS 201
- âŠ