287,744 research outputs found
Estimating general motion and intensity from event cameras
Robotic vision algorithms have become widely used in many consumer products,
enabling technologies such as autonomous vehicles, drones, and augmented
reality (AR) and virtual reality (VR) devices, to name a few. These
applications require vision algorithms to work in real-world environments with
extreme lighting variations and fast-moving objects. However, robotic vision
applications often rely on standard video cameras, which face severe
limitations in fast-moving scenes or under bright light sources, where
artefacts such as motion blur and over-saturation diminish image quality.
To address these limitations, the body of work presented here investigates the
use of alternative sensor devices which mimic the superior perception
properties of human vision. Such silicon retinas were proposed by neuromorphic
engineering, and we focus here on one such biologically inspired sensor, the
event camera, which offers a new camera paradigm for real-time robotic vision.
The camera provides a high measurement rate, low latency, high dynamic range,
and a low data rate. Its signal is composed of a stream of asynchronous events
at microsecond resolution. Each event indicates when an individual pixel
registers a logarithmic intensity change of a pre-set threshold size. Using
this novel signal has proven to be very challenging in most computer vision
problems, since common vision methods require synchronous, absolute intensity
information.
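To make this signal model concrete, the following is a minimal sketch of how such events could be generated from a conventional intensity sequence; the function name, the 0.2 contrast threshold, and the frame-based simulation are illustrative assumptions, not the sensor's actual circuit behaviour.

```python
import numpy as np

def events_from_frames(frames, timestamps, threshold=0.2):
    """Emit (x, y, t, polarity) events whenever the log intensity at a pixel
    drifts by at least `threshold` from the level at its last event."""
    log_ref = np.log(frames[0] + 1e-6)   # per-pixel reference log intensity
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = np.log(frame + 1e-6) - log_ref
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for x, y in zip(xs, ys):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((x, y, t, polarity))
            log_ref[y, x] += polarity * threshold  # reset reference at this pixel
    return events
```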
In this thesis, we present for the first time a method to reconstruct an image
and estimate motion from an event stream without additional sensing or prior
knowledge of the scene. The method is based on a coupled estimation of both
motion and intensity, which enables event-based analysis that was previously
possible only with severe limitations. We also present the first machine
learning algorithm for event-based unsupervised intensity reconstruction which
does not depend on an explicit motion estimate and reveals finer image details.
This learning approach does not rely on event-to-image examples, but learns
from standard camera images which are not coupled to the event data. In
experiments we show that the learned reconstruction improves upon our
handcrafted approach. Finally, we combine our learned approach with motion
estimation methods and show that the improved intensity reconstruction also
significantly improves the motion estimation results. We hope that the work in
this thesis bridges the gap between the event signal and images, and that it
opens event cameras to practical solutions that overcome the current
limitations of frame-based cameras in robotic vision.
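The abstract does not spell out the estimator, but inverting the event generation model above hints at why intensity and motion must be estimated jointly: naively integrating event polarities recovers only per-pixel log-intensity changes and drifts as soon as the camera moves. A toy sketch of that naive integration baseline follows; the names and the fixed threshold are assumptions, and this is not the thesis' coupled method.

```python
import numpy as np

def integrate_events(events, shape, threshold=0.2):
    """Naive baseline: add +/- threshold to the per-pixel log intensity for
    each event. Without joint motion estimation the result drifts and smears."""
    log_i = np.zeros(shape)
    for x, y, t, polarity in events:
        log_i[y, x] += polarity * threshold
    return np.exp(log_i)  # relative intensity, up to an unknown initial image
```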
Cloud Chaser: Real Time Deep Learning Computer Vision on Low Computing Power Devices
Internet of Things (IoT) devices, mobile phones, and robotic systems are often
denied the power of deep learning algorithms due to their limited computing
power. However, to provide time-critical services such as emergency response,
home assistance, and surveillance, these devices often need real-time analysis
of their camera data. This paper strives to offer a viable approach to
integrate high-performance deep learning-based computer vision algorithms with
low-resource and low-power devices by leveraging the computing power of the
cloud. By offloading the computation work to the cloud, no dedicated hardware
is needed to enable deep neural networks on existing low computing power
devices. A Raspberry Pi-based robot, Cloud Chaser, is built to demonstrate the
power of using cloud computing to perform real-time vision tasks. Furthermore,
to reduce latency and improve real-time performance, compression algorithms are
proposed and evaluated for streaming real-time video frames to the cloud.
Comment: Accepted to The 11th International Conference on Machine Vision (ICMV 2018). Project site: https://zhengyiluo.github.io/projects/cloudchaser
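The offloading pattern the paper describes can be sketched in a few lines: the device only captures and compresses frames, while inference runs remotely. The server URL, the JPEG transport, and the response format below are hypothetical placeholders, not the actual Cloud Chaser protocol.

```python
import cv2
import requests

SERVER_URL = "http://example-cloud-host:8000/infer"  # hypothetical endpoint

def stream_frames(jpeg_quality=60):
    """Capture frames on a low-power device, JPEG-compress them, and offload
    inference to the cloud; only compressed bytes cross the network."""
    cap = cv2.VideoCapture(0)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            ok, buf = cv2.imencode(".jpg", frame,
                                   [cv2.IMWRITE_JPEG_QUALITY, jpeg_quality])
            if not ok:
                continue
            resp = requests.post(SERVER_URL, data=buf.tobytes(),
                                 headers={"Content-Type": "image/jpeg"})
            print(resp.json())  # e.g. detections computed server-side
    finally:
        cap.release()
```

Lowering `jpeg_quality` trades image fidelity for bandwidth, the kind of latency/accuracy trade-off the paper's compression experiments target.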
Adaptive foveated single-pixel imaging with dynamic super-sampling
As an alternative to conventional multi-pixel cameras, single-pixel cameras
enable images to be recorded using a single detector that measures the
correlations between the scene and a set of patterns. However, to fully sample
a scene in this way requires at least the same number of correlation
measurements as there are pixels in the reconstructed image. Therefore,
single-pixel imaging systems typically exhibit low frame-rates. To mitigate
this, a range of compressive sensing techniques have been developed which rely
on a priori knowledge of the scene to reconstruct images from an under-sampled
set of measurements. In this work we take a different approach and adopt a
strategy inspired by the foveated vision systems found in the animal kingdom:
a framework that exploits the spatio-temporal redundancy present in many
dynamic scenes. In our single-pixel imaging system a high-resolution foveal
region follows motion within the scene, but unlike a simple zoom, every frame
delivers new spatial information from across the entire field-of-view. Using
this approach we demonstrate a four-fold reduction in the time taken to record
the detail of rapidly evolving features, whilst simultaneously accumulating
detail of more slowly evolving regions over several consecutive frames. This
tiered super-sampling technique enables the reconstruction of video streams in
which both the resolution and the effective exposure-time spatially vary and
adapt dynamically in response to the evolution of the scene. The methods
described here can complement existing compressive sensing approaches and may
be applied to enhance a variety of computational imagers that rely on
sequential correlation measurements.
Comment: 13 pages, 5 figures
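As a concrete illustration of the fully sampled case the abstract starts from (one correlation measurement per reconstructed pixel), here is a toy simulation using a Hadamard pattern basis; the basis choice and the ideal +/-1 patterns are assumptions, and real systems typically display binary patterns and take differential measurements.

```python
import numpy as np
from scipy.linalg import hadamard

def single_pixel_image(scene, n=32):
    """Simulate single-pixel imaging: one detector reading per pattern,
    n*n readings to fully sample an n-by-n scene."""
    H = hadamard(n * n)                 # orthogonal +/-1 patterns (rows)
    readings = H @ scene.ravel()        # each reading correlates scene & pattern
    recon = (H.T @ readings) / (n * n)  # H.T @ H = (n*n) * I, so this inverts
    return recon.reshape(n, n)

scene = np.random.rand(32, 32)
assert np.allclose(single_pixel_image(scene), scene)
```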