Learning-based Image Enhancement for Visual Odometry in Challenging HDR Environments
One of the main open challenges in visual odometry (VO) is the robustness to
difficult illumination conditions or high dynamic range (HDR) environments. The
main difficulties in these situations come from both the limitations of the
sensors and the inability to successfully track interest points due to the
strong assumptions made in VO, such as brightness constancy. We address
this problem from a deep learning perspective: we first fine-tune a
Deep Neural Network (DNN) to obtain enhanced
representations of the sequences for VO. Then, we demonstrate how inserting
Long Short-Term Memory (LSTM) units yields temporally consistent sequences, as
the estimation depends on previous states. However, very deep networks cannot
be integrated into a real-time VO framework; therefore, we also propose a
Convolutional Neural Network (CNN) of reduced size capable of running faster.
Finally, we validate the enhanced representations by evaluating the sequences
produced by the two architectures in several state-of-the-art VO algorithms,
such as ORB-SLAM and DSO.
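The brightness-constancy assumption discussed above can be made concrete with a small sketch: VO trackers implicitly minimize a photometric residual I1(x+u) - I0(x), which stays near zero for correct motion only if brightness is preserved between frames. The helper below (`photometric_error`, a hypothetical name for illustration, not the paper's network) shows how a simple exposure change in an HDR scene breaks the assumption even with zero motion.

```python
import numpy as np

def photometric_error(img0, img1, pts, flow):
    """Brightness-constancy residual I1(x+u) - I0(x) at tracked points.

    Under HDR/illumination changes this residual grows even for the
    correct flow, which is why plain trackers fail. Nearest-neighbour
    lookup keeps the sketch minimal (real trackers interpolate).
    """
    h, w = img0.shape
    moved = pts + flow
    xi = np.clip(np.round(moved[:, 0]).astype(int), 0, w - 1)
    yi = np.clip(np.round(moved[:, 1]).astype(int), 0, h - 1)
    x0 = pts[:, 0].astype(int)
    y0 = pts[:, 1].astype(int)
    return img1[yi, xi] - img0[y0, x0]

# A global gain change (e.g. auto-exposure reacting to an HDR scene)
# produces nonzero residuals even for zero motion.
img0 = np.random.rand(8, 8)
img1 = 1.5 * img0                       # simulated exposure change
pts = np.array([[2.0, 3.0], [5.0, 1.0]])
err = photometric_error(img0, img1, pts, np.zeros_like(pts))
```

The enhanced representations produced by the paper's networks aim to make exactly this residual small again before it reaches the VO front-end.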
Feature point detection in HDR images based on coefficient of variation
Feature point (FP) detection is a fundamental step of many computer vision
tasks. However, FP detectors are usually designed for low dynamic range (LDR)
images. In scenes with extreme light conditions, LDR images present saturated
pixels, which degrade FP detection. On the other hand, high dynamic range (HDR)
images usually present no saturated pixels but FP detection algorithms do not
take advantage of all the information present in such images. FP detection
frequently relies on differential methods, which work well in LDR images.
However, in HDR images, the differential operation response in bright areas
overshadows the response in dark areas. As an alternative to standard FP
detection methods, this study proposes an FP detector based on a coefficient of
variation (CV) designed for HDR images. The CV operation adapts its response
based on the standard deviation of pixels inside a window, working well in both
dark and bright areas of HDR images. The proposed and standard detectors are
evaluated by measuring their repeatability rate (RR) and uniformity. Overall,
our proposed detector compares favourably with standard state-of-the-art
detectors: in the uniformity metric it surpasses all the other algorithms,
while in the repeatability rate metric it performs worse than the Harris for
HDR and SURF detectors.
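The core idea above is easy to sketch: the coefficient of variation sigma/mu over a sliding window is invariant to a multiplicative change in intensity, so dark and bright regions of an HDR image with the same relative contrast respond equally. A minimal illustration (the function name `cv_response` is mine; the paper's detector additionally applies thresholds and non-maximum suppression):

```python
import numpy as np

def cv_response(img, win=5, eps=1e-8):
    """Coefficient-of-variation map: std/mean over each sliding window.

    Because the response is normalised by the local mean, scaling the
    image by a constant leaves the response unchanged, unlike purely
    differential detectors whose response grows with brightness.
    """
    h, w = img.shape
    r = win // 2
    out = np.zeros((h, w), dtype=float)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = img[y - r:y + r + 1, x - r:x + r + 1]
            out[y, x] = patch.std() / (patch.mean() + eps)
    return out
```

For example, a step edge at radiance 0.1/0.2 and the same edge at 1.0/2.0 yield identical CV responses, whereas a gradient-based response would be ten times larger in the bright version.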
Pix2HDR -- A pixel-wise acquisition and deep learning-based synthesis approach for high-speed HDR videos
Accurately capturing dynamic scenes with wide-ranging motion and light
intensity is crucial for many vision applications. However, acquiring
high-speed high dynamic range (HDR) video is challenging because the camera's
frame rate restricts its dynamic range. Existing methods sacrifice speed to
acquire multi-exposure frames. Yet, misaligned motion in these frames can still
pose complications for HDR fusion algorithms, resulting in artifacts. Instead
of frame-based exposures, we sample the videos using individual pixels at
varying exposures and phase offsets. Implemented on a pixel-wise programmable
image sensor, our sampling pattern simultaneously captures fast motion at a
high dynamic range. We then transform pixel-wise outputs into an HDR video
using end-to-end learned weights from deep neural networks, achieving high
spatiotemporal resolution with minimized motion blurring. We demonstrate
aliasing-free HDR video acquisition at 1000 FPS, resolving fast motion under
low-light conditions and against bright backgrounds - both challenging
conditions for conventional cameras. By combining the versatility of pixel-wise
sampling patterns with the strength of deep neural networks at decoding complex
scenes, our method greatly enhances the vision system's adaptability and
performance in dynamic conditions.Comment: 14 pages, 14 figure
An Asynchronous Kalman Filter for Hybrid Event Cameras
Event cameras are ideally suited to capture HDR visual information without
blur but perform poorly on static or slowly changing scenes. Conversely,
conventional image sensors measure absolute intensity of slowly changing scenes
effectively but do poorly on high dynamic range or quickly changing scenes. In
this paper, we present an event-based video reconstruction pipeline for High
Dynamic Range (HDR) scenarios. The proposed algorithm includes a frame
augmentation pre-processing step that deblurs and temporally interpolates frame
data using events. The augmented frame and event data are then fused using a
novel asynchronous Kalman filter under a unifying uncertainty model for both
sensors. Our experimental results are evaluated on both publicly available
datasets with challenging lighting conditions and fast motions and our new
dataset with HDR reference. The proposed algorithm outperforms state-of-the-art
methods in both absolute intensity error (48% reduction) and image similarity
indexes (average 11% improvement). Comment: 12 pages, 6 figures, published in
the International Conference on Computer Vision (ICCV) 2021
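The fusion step can be sketched as a scalar Kalman filter running independently at each pixel: events propagate the log-intensity state by a contrast increment (growing the uncertainty), and each frame supplies a direct measurement that pulls the state back. This is a simplified illustration of asynchronous event-frame fusion, not the paper's full per-sensor uncertainty model; the class name `PixelKalman` is mine.

```python
class PixelKalman:
    """Scalar per-pixel Kalman fusion of event and frame data."""

    def __init__(self, x0=0.0, p0=1.0, q_event=0.01, r_frame=0.05):
        self.x, self.p = x0, p0          # state estimate and variance
        self.q_event = q_event           # process noise added per event
        self.r_frame = r_frame           # frame measurement variance

    def on_event(self, polarity, contrast=0.1):
        # propagate: log intensity jumps by +-contrast, uncertainty grows
        self.x += polarity * contrast
        self.p += self.q_event

    def on_frame(self, z, r=None):
        # standard measurement update with gain k = p / (p + r)
        r = self.r_frame if r is None else r
        k = self.p / (self.p + r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x
```

A saturated or blurred frame would simply be given a larger variance r, so the filter trusts the event-propagated state more, which is the essence of the unified uncertainty model.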
An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras
Event cameras are ideally suited to capture High Dynamic Range (HDR) visual
information without blur but provide poor imaging capability for static or
slowly varying scenes. Conversely, conventional image sensors measure absolute
intensity of slowly changing scenes effectively but do poorly on HDR or quickly
changing scenes. In this paper, we present an asynchronous linear filter
architecture, fusing event and frame camera data, for HDR video reconstruction
and spatial convolution that exploits the advantages of both sensor modalities.
The key idea is the introduction of a state that directly encodes the
integrated or convolved image information and that is updated asynchronously as
each event or each frame arrives from the camera. The state can be read-off
as-often-as and whenever required to feed into subsequent vision modules for
real-time robotic systems. Our experimental results are evaluated on both
publicly available datasets with challenging lighting conditions and fast
motions, along with a new dataset with HDR reference that we provide. The
proposed AKF pipeline outperforms other state-of-the-art methods in both
absolute intensity error (69.4% reduction) and image similarity indexes
(average 35.5% improvement). We also demonstrate the integration of image
convolution with linear spatial kernels (Gaussian, Sobel, and Laplacian) as an
application of our architecture. Comment: 17 pages, 10 figures, accepted by
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) in
August 2023
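The asynchronously updated state can be sketched as a per-pixel complementary filter: between updates the state decays exponentially toward the latest frame value (low-pass on frames) while event arrivals add increments (high-pass on events), and the state can be read off at any time. A toy single-pixel version under these assumptions (the class name and gain `alpha` are illustrative, not the paper's exact formulation):

```python
import math

class AsyncComplementaryFilter:
    """Asynchronous linear filter state for one pixel."""

    def __init__(self, alpha=10.0):
        self.alpha = alpha      # crossover rate [1/s]
        self.x = 0.0            # filter state (log intensity)
        self.frame = 0.0        # latest frame log intensity
        self.t = 0.0            # time of last update

    def _decay(self, t):
        # closed-form solution of dx/dt = -alpha * (x - frame)
        dt = t - self.t
        self.x = self.frame + (self.x - self.frame) * math.exp(-self.alpha * dt)
        self.t = t

    def on_event(self, t, polarity, contrast=0.1):
        self._decay(t)
        self.x += polarity * contrast   # integrate the event increment

    def on_frame(self, t, value):
        self._decay(t)
        self.frame = value              # new low-pass reference

    def read(self, t):
        # state can be read off whenever a downstream module needs it
        self._decay(t)
        return self.x
```

Because every update is a cheap closed-form decay plus an increment, the state costs O(1) per event or frame, which is what makes the architecture suitable for real-time robotic pipelines.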
On robust optical flow estimation on image sequences with differently exposed frames using primal-dual optimization
Optical flow (OF) methods estimate pixelwise motion from consecutive frames in image sequences. Traditionally, the frames in such sequences are similarly exposed. However, many real-world scenes contain high dynamic range content that cannot be captured well with a single exposure setting; such scenes leave certain image regions over- or underexposed, which degrades the quality of motion estimates in those regions. Motivated by this, we propose to capture high dynamic range scenes by alternating the exposure setting every other frame. A framework for OF estimation on such image sequences is presented that can straightforwardly integrate techniques from the state of the art in conventional OF methods. Different aspects of the robustness of OF methods are discussed, including the estimation of large displacements and robustness to the natural illumination changes that occur between frames, and we demonstrate experimentally how to handle such challenging flow estimation scenarios. The flow estimation is formulated as an optimization problem whose solution is obtained using an efficient primal-dual method.
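The need for illumination-robust data terms when consecutive frames are differently exposed can be seen in a two-line comparison: a plain brightness-constancy residual is biased by any exposure offset, whereas a gradient-constancy residual is invariant to additive illumination changes and can serve as the data term inside a primal-dual OF solver. This is an illustrative comparison, not the paper's specific formulation; the helper names are mine.

```python
import numpy as np

def brightness_residual(i0, i1):
    """|I1 - I0|: nonzero under an exposure change even with zero motion."""
    return np.abs(i1 - i0)

def gradient_residual(i0, i1):
    """|grad I1 - grad I0|: invariant to additive illumination offsets,
    a common robust data term for differently exposed frame pairs."""
    gx0, gy0 = np.gradient(i0)
    gx1, gy1 = np.gradient(i1)
    return np.abs(gx1 - gx0) + np.abs(gy1 - gy0)

# Same scene, zero motion, but the second frame has an exposure offset:
i0 = np.random.rand(10, 10)
i1 = i0 + 0.3
```

Here `brightness_residual(i0, i1)` is about 0.3 everywhere despite zero motion, while `gradient_residual(i0, i1)` is essentially zero, so a solver using the latter is not fooled by the alternating exposures.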
Neural Radiance Fields: Past, Present, and Future
The various aspects like modeling and interpreting 3D environments and
surroundings have enticed humans to progress their research in 3D Computer
Vision, Computer Graphics, and Machine Learning. The seminal paper by
Mildenhall et al. on NeRFs (Neural Radiance Fields) sparked a boom in Computer
Graphics, Robotics, and Computer Vision, and high-resolution, low-storage
Augmented Reality and Virtual Reality-based 3D models have gained traction
among researchers, with more than 1000 preprints related to NeRFs published.
This paper serves as a bridge for people starting to study
these fields by building on the basics of Mathematics, Geometry, Computer
Vision, and Computer Graphics to the difficulties encountered in Implicit
Representations at the intersection of all these disciplines. This survey
provides the history of rendering, Implicit Learning, and NeRFs, the
progression of research on NeRFs, and the potential applications and
implications of NeRFs in today's world. In doing so, this survey categorizes
all the NeRF-related research in terms of the datasets used, objective
functions, applications solved, and evaluation criteria for these
applications. Comment: 413 pages, 9 figures, 277 citations