Geometric-based Line Segment Tracking for HDR Stereo Sequences
In this work, we propose a purely geometrical approach for the robust matching of line segments in challenging stereo streams with severe illumination changes or High Dynamic Range (HDR) environments. To that purpose, we exploit the univocal nature of the matching problem, i.e. every observation must be matched to a single feature or left unmatched. We state the problem as a sparse, convex, ℓ1-minimization of the matching vector regularized by the geometric constraints. This formulation allows for the robust tracking of line segments along sequences where traditional appearance-based matching techniques tend to fail due to dynamic changes in illumination conditions. Moreover, the proposed matching algorithm also yields a considerable speed-up over previous state-of-the-art techniques, making it suitable for real-time applications such as Visual Odometry (VO). This, of course, comes at the expense of a slightly lower number of matches in comparison with appearance-based methods, and also limits its application to continuous video sequences, as it is rather constrained to small pose increments between consecutive frames. We validate the claimed advantages by first evaluating the matching performance in challenging video sequences, and then testing the method in a benchmarked point- and line-based VO algorithm.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work has been supported by the Spanish Government (project DPI2017-84827-R and grant BES-2015-071606) and by the Andalusian Government (project TEP2012-530).
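The abstract does not spell out the sparse ℓ1 formulation, but problems of the form min_x ½‖Ax − b‖² + λ‖x‖₁ are commonly solved with iterative soft-thresholding (ISTA). The sketch below is a generic illustration of that technique, not the paper's exact model; `A`, `b`, and `lam` are hypothetical stand-ins for the geometric constraints, the observations, and the regularization weight:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, lam=0.1, n_iters=2000):
    """Minimise 0.5*||Ax - b||^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)           # gradient of the smooth term
        x = soft_threshold(x - grad / L, lam / L)
    return x
```

The ℓ1 penalty drives most entries of the matching vector exactly to zero, which is what encodes the univocal constraint ("matched to a single feature or not matched at all") as a sparse solution.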
Deep learning enabling accurate imaging beyond device limitations
Tohoku University, Doctor of Philosophy (Information Sciences) thesis
Efficient HDR Reconstruction from Real-World Raw Images
High dynamic range (HDR) imaging is still a significant yet challenging
problem due to the limited dynamic range of generic image sensors. Most
existing learning-based HDR reconstruction methods take a set of
bracketed-exposure sRGB images to extend the dynamic range, and are thus
computation- and memory-inefficient, as they require the Image Signal Processor
(ISP) to produce multiple sRGB images from the raw ones. In this paper, we
propose to broaden the dynamic range from the raw inputs and perform only one
ISP processing for the reconstructed HDR raw image. Our key insights are
threefold: (1) we design a new computational raw HDR data formation pipeline
and construct the first real-world raw HDR dataset, RealRaw-HDR; (2) we develop
a lightweight-efficient HDR model, RepUNet, using the structural
re-parameterization technique; (3) we propose a plug-and-play motion alignment
loss to mitigate motion misalignment between short- and long-exposure images.
Extensive experiments demonstrate that our approach achieves state-of-the-art
performance in both visual quality and quantitative metrics.
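The abstract does not detail RepUNet, but "structural re-parameterization" usually refers to the RepVGG-style trick of training with parallel 3x3, 1x1, and identity branches and folding them into a single 3x3 kernel for inference. A minimal single-channel, stride-1 sketch (all names illustrative, not the paper's code):

```python
import numpy as np

def conv2d_same(x, k):
    """3x3 'same' convolution (cross-correlation) with zero padding."""
    H, W = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def fuse_branches(k3, k1, with_identity=True):
    """Fold parallel 3x3, 1x1 and identity branches into one 3x3 kernel."""
    k = k3.copy()
    k[1, 1] += k1        # a 1x1 kernel sits at the centre of the 3x3 grid
    if with_identity:
        k[1, 1] += 1.0   # the identity branch is a centred delta kernel
    return k
```

Because convolution is linear in the kernel, the fused model computes exactly the same function as the three-branch one while running as a plain single-branch network, which is where the inference efficiency comes from.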
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
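The event stream described above (time, location and sign per event) is often converted into a frame-like representation before applying conventional vision algorithms. A minimal sketch that accumulates signed polarities over a time window (variable names are illustrative):

```python
import numpy as np

def events_to_frame(events, shape, t0, t1):
    """Accumulate signed event polarities within [t0, t1) into a 2D frame.

    events: iterable of (t, x, y, polarity) tuples, polarity in {+1, -1}.
    """
    frame = np.zeros(shape, dtype=np.int32)
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y, x] += 1 if p > 0 else -1
    return frame
```

Such accumulation trades away the microsecond temporal resolution of the raw stream, which is why the survey also covers methods (e.g. spiking networks) that process events asynchronously.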
Stereoscopic high dynamic range imaging
Two modern technologies show promise to dramatically increase immersion in
virtual environments. Stereoscopic imaging captures two images representing
the views of both eyes and allows for better depth perception. High dynamic
range (HDR) imaging accurately represents real world lighting as opposed to
traditional low dynamic range (LDR) imaging. HDR provides a better contrast
and more natural looking scenes. The combination of the two technologies in
order to gain advantages of both has been, until now, mostly unexplored due to
the current limitations in the imaging pipeline. This thesis reviews both fields,
proposes a stereoscopic high dynamic range (SHDR) imaging pipeline, outlines the
challenges that need to be resolved to enable SHDR, and focuses on the capture and
compression aspects of that pipeline.
The problems of capturing SHDR images, which would potentially require two
HDR cameras and could introduce ghosting, are mitigated by capturing an HDR and
LDR pair and using it to generate SHDR images. A detailed user study compared
four different methods of generating SHDR images. Results demonstrated that
one of the methods may produce images perceptually indistinguishable from the
ground truth.
Insights obtained while developing static image operators guided the design
of SHDR video techniques. Three methods for generating SHDR video from an
HDR-LDR video pair are proposed and compared to the ground truth SHDR
videos. Results showed little overall error and identified a method with the least
error.
Once captured, SHDR content needs to be efficiently compressed. Five SHDR
compression methods that are backward compatible are presented. The proposed
methods can encode SHDR content at a size only slightly larger than that of a
traditional single LDR image (18% larger for one method), and the backward-compatibility
property encourages early adoption of the format.
The work presented in this thesis has introduced and advanced capture and
compression methods for the adoption of SHDR imaging. In general, this research
paves the way for a novel field of SHDR imaging which should lead to improved
and more realistic representation of captured scenes.
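A common pattern for backward-compatible HDR compression, which the thesis's methods may broadly resemble (the abstract gives no details), is to store a tone-mapped LDR base layer that legacy decoders display as-is, plus a small log-domain residual from which an HDR-aware decoder recovers the full range. A rough sketch with an illustrative Reinhard-style tone map:

```python
import numpy as np

def encode_backward_compatible(hdr, tonemap):
    """Split an HDR image into an LDR base layer plus a log-domain residual."""
    ldr = tonemap(hdr)                         # legacy decoders read only this
    residual = np.log1p(hdr) - np.log1p(ldr)   # enhancement layer for HDR decoders
    return ldr, residual

def decode_hdr(ldr, residual):
    """Recover the HDR image from the base layer and the residual."""
    return np.expm1(np.log1p(ldr) + residual)
```

In a real codec both layers would be quantized and entropy-coded; in this sketch the roundtrip is lossless by construction, which isolates the layering idea from the coding details.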
Estimating general motion and intensity from event cameras
Robotic vision algorithms have become widely used in many consumer products which
enabled technologies such as autonomous vehicles, drones, augmented reality (AR) and
virtual reality (VR) devices to name a few. These applications require vision algorithms
to work in real-world environments with extreme lighting variations and fast moving
objects. However, robotic vision applications rely often on standard video cameras which
face severe limitations in fast-moving scenes or by bright light sources which diminish
the image quality with artefacts like motion blur or over-saturation.
To address these limitations, the body of work presented here investigates the use of
alternative sensor devices which mimic the superior perception properties of human
vision. Such silicon retinas were proposed by neuromorphic engineering, and we focus
here on one such biologically inspired sensor called the event camera which offers a new
camera paradigm for real-time robotic vision. The camera provides a high measurement
rate, low latency, high dynamic range, and low data rate. The signal of the camera is
composed of a stream of asynchronous events at microsecond resolution. Each event
indicates when an individual pixel registers a logarithmic intensity change of a pre-set
threshold size. Using this novel signal has proven to be very challenging in most computer
vision problems since common vision methods require synchronous absolute intensity
information.
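The per-pixel event generation model described in this paragraph can be sketched as follows for an idealised, noise-free sensor with a symmetric contrast threshold (function and variable names are illustrative):

```python
import numpy as np

def generate_events(log_frames, times, threshold=0.2):
    """Emit (t, y, x, polarity) whenever the log intensity at a pixel drifts
    by at least `threshold` from the level at which it last fired."""
    ref = log_frames[0].copy()   # per-pixel reference log intensity
    events = []
    for t, frame in zip(times[1:], log_frames[1:]):
        diff = frame - ref
        while True:
            pos = diff >= threshold
            neg = diff <= -threshold
            if not (pos.any() or neg.any()):
                break
            for y, x in np.argwhere(pos):
                events.append((t, y, x, 1))
            ref[pos] += threshold            # one event per threshold crossing
            for y, x in np.argwhere(neg):
                events.append((t, y, x, -1))
            ref[neg] -= threshold
            diff = frame - ref
    return events
```

Working in the log domain is what gives the sensor its high dynamic range: a fixed threshold on log intensity corresponds to a fixed relative contrast, independent of absolute brightness.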
In this thesis, we present for the first time a method to reconstruct an image and
estimate motion from an event stream without additional sensing or prior knowledge of
the scene. This method is based on a coupled estimation of both motion and intensity,
which enables event-based analysis that was previously only possible with severe
limitations. We also present the first machine learning algorithm for event-based
unsupervised intensity reconstruction, which does not depend on an explicit motion estimation
and reveals finer image details. This learning approach does not rely on event-to-image
examples, but learns from standard camera image examples which are not coupled to the
event data. In experiments we show that the learned reconstruction improves upon our
handcrafted approach. Finally, we combine our learned approach with motion estima-
tion methods and show the improved intensity reconstruction also significantly improves
the motion estimation results. We hope our work in this thesis bridges the gap between
the event signal and images and that it opens event cameras to practical solutions to
overcome the current limitations of frame-based cameras in robotic vision.
A robust patch-based synthesis framework for combining inconsistent images
Current methods for combining different images produce visible artifacts when the sources have very different textures and structures, come from distant viewpoints, or capture dynamic scenes with motion. In this thesis, we propose a patch-based synthesis algorithm to plausibly combine different images that have color, texture, structural, and geometric inconsistencies. For some applications such as cloning and stitching where a gradual blend is required, we present a new method for synthesizing a transition region between two source images, such that inconsistent properties change gradually from one source to the other. We call this process image melding. For gradual blending, we generalize the patch-based optimization framework in three key ways: First, we enrich the patch search space with additional geometric and photometric transformations. Second, we integrate image gradients into the patch representation and replace the usual color averaging with a screened Poisson equation solver. Third, we propose a new energy based on mixed L2/L0 norms for colors and gradients that produces a gradual transition between sources without sacrificing texture sharpness. Together, all three generalizations enable patch-based solutions to a broad class of image melding problems involving inconsistent sources: object cloning, stitching challenging panoramas, hole filling from multiple photos, and image harmonization. We also demonstrate another application which requires us to address inconsistencies across the images: high dynamic range (HDR) reconstruction using sequential exposures. In this application, the results suffer from objectionable artifacts for dynamic scenes if the inconsistencies caused by significant scene motions are not handled properly. In this thesis, we propose a new approach to HDR reconstruction that uses information from all exposures while being more robust to motion than previous techniques.
Our algorithm is based on a novel patch-based energy-minimization formulation that integrates alignment and reconstruction in a joint optimization through an equation we call the HDR image synthesis equation. This allows us to produce an HDR result that is aligned to one of the exposures yet contains information from all of them. These two applications (image melding and high dynamic range reconstruction) show that patch-based methods like the one proposed in this dissertation can address inconsistent images and could open the door to many new image editing applications in the future.
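The screened Poisson step mentioned in the abstract combines a colour target f with a gradient target g by minimising lam*||u - f||^2 + ||grad(u) - g||^2; setting the gradient of this energy to zero gives (lam*I - Laplacian) u = lam*f - div(g), which has a closed-form FFT solution under periodic-boundary assumptions. This is a generic sketch of that technique, not the dissertation's exact solver:

```python
import numpy as np

def screened_poisson(f, gx, gy, lam=0.1):
    """Solve (lam*I - Laplacian) u = lam*f - div(g) with periodic boundaries.

    Uses forward-difference gradients and the matching backward-difference
    divergence, so the composite operator is the standard 5-point Laplacian.
    """
    H, W = f.shape
    div = (gx - np.roll(gx, 1, axis=1)) + (gy - np.roll(gy, 1, axis=0))
    rhs = lam * f - div
    # eigenvalues of the periodic 1D Laplacian: 2*cos(2*pi*k/N) - 2  (<= 0)
    wx = 2.0 * np.cos(2.0 * np.pi * np.arange(W) / W) - 2.0
    wy = 2.0 * np.cos(2.0 * np.pi * np.arange(H) / H) - 2.0
    denom = lam - (wx[None, :] + wy[:, None])   # spectrum of lam*I - Laplacian
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / denom))
```

When g is exactly the gradient field of f, the solver returns f itself; melding applications instead mix colours from one source with gradients from another, and lam controls how strongly the colour target is enforced.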