Lossless Compression of Neuromorphic Vision Sensor Data Based on Point Cloud Representation
Visual information varying over time is typically captured by cameras that acquire images (frames) equally spaced in time. Neuromorphic Vision Sensors (NVSs) take a different approach: they are emerging visual capture devices that acquire information only when changes occur in the scene. This results in major advantages in terms of low power consumption, wide dynamic range, high temporal resolution, and lower data rates than conventional video. Although the acquisition strategy already yields much lower data rates, such data can be compressed further. To this end, in this paper we propose a lossless compression strategy based on point cloud compression, inspired by the observation that NVS data, when appropriately represented in a three-dimensional space, form a point cloud. The proposed strategy outperforms the benchmark strategies, resulting in a compression ratio up to 30% higher for the considered
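The point cloud idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact mapping, so the choices here are assumptions: the `Event` type, the function name `events_to_point_clouds`, the use of the timestamp as the third coordinate, and the split into one cloud per polarity are all hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """One NVS event (hypothetical layout, not taken from the paper)."""
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # timestamp in microseconds
    polarity: int   # +1 for a brightness increase, -1 for a decrease


Point = Tuple[int, int, int]


def events_to_point_clouds(events: List[Event]) -> Tuple[List[Point], List[Point]]:
    """Map events to 3D points (x, y, t - t0), one cloud per polarity.

    Timestamps are only shifted, never rescaled, so no information is lost
    apart from the offset t0, which a real codec would store separately.
    """
    if not events:
        return [], []
    t0 = min(e.t_us for e in events)
    positive: List[Point] = []
    negative: List[Point] = []
    for e in events:
        point = (e.x, e.y, e.t_us - t0)
        (positive if e.polarity > 0 else negative).append(point)
    return positive, negative
```

Each resulting list could then be handed to a point cloud coder; which coder the paper uses is not stated in the text shown here.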
Distributed Video Coding: Selecting the Most Promising Application Scenarios
Distributed Video Coding (DVC) is a new video coding paradigm based on two major Information Theory results: the Slepian–Wolf and Wyner–Ziv theorems. Recently, practical DVC solutions have been proposed with promising results; however, there is still a need to study more systematically the set of application scenarios in which DVC may bring major advantages. This paper aims to contribute to the identification of the most DVC-friendly application scenarios, highlighting the expected benefits and drawbacks of each studied scenario. The selection is based on a proposed methodology that involves characterizing and clustering the applications according to their most relevant characteristics, and matching them with the main potential DVC benefits
Spiking sampling network for image sparse representation and dynamic vision sensor data compression
Sparse representation has attracted great attention because it can greatly
save storage resources and find representative features of data in a
low-dimensional space. As a result, it may be widely applied in engineering
domains including feature extraction, compressed sensing, signal denoising,
picture clustering, and dictionary learning, just to name a few. In this paper,
we propose a spiking sampling network. This network is composed of spiking
neurons, and it can dynamically decide which pixel points should be retained
and which ones need to be masked according to the input. Our experiments
demonstrate that this approach enables better sparse representation of the
original image and facilitates image reconstruction compared to random
sampling. We thus use this approach for compressing massive data from the
dynamic vision sensor, which greatly reduces the storage requirements for event
data
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
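The event format described in the abstract above (time, location and sign of a brightness change) can be illustrated with a simplified sketch. It assumes the commonly used idealised model in which a pixel fires when its log-brightness changes by at least a contrast threshold. Real event cameras work asynchronously per pixel, whereas this sketch compares two sampled frames. The function name `generate_events` and the threshold value are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (timestamp in microseconds, x, y, sign)
EventTuple = Tuple[int, int, int, int]


def generate_events(
    prev_frame: Sequence[Sequence[float]],
    curr_frame: Sequence[Sequence[float]],
    t_us: int,
    threshold: float = 0.2,  # assumed contrast threshold, not from the survey
) -> List[EventTuple]:
    """Emit one event per pixel whose log-brightness change reaches the threshold.

    Brightness values must be strictly positive because the logarithm is taken.
    """
    events: List[EventTuple] = []
    for y, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            delta = math.log(c) - math.log(p)
            if abs(delta) >= threshold:
                events.append((t_us, x, y, 1 if delta > 0 else -1))
    return events
```

Pixels whose brightness does not change produce no output at all, which is the source of the reduced data rate mentioned in the abstracts in this list.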
Algorithms for compression of high dynamic range images and video
Recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple-exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1.
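The two sets of figures quoted above can be compared directly, assuming the 20·log10 convention that is commonly used for imaging dynamic range (the abstract does not state which convention it uses). The helper name `contrast_ratio_to_db` is hypothetical.

```python
import math


def contrast_ratio_to_db(ratio: float) -> float:
    """Convert a contrast ratio (e.g. 1e5 for 10^5:1) to decibels.

    Assumes the 20*log10 convention; this is an assumption, not a
    statement from the abstract.
    """
    return 20.0 * math.log10(ratio)


low_db = contrast_ratio_to_db(1e5)   # about 100 dB
high_db = contrast_ratio_to_db(1e6)  # about 120 dB
```

Under that assumption, display contrast ratios of 10^5:1 to 10^6:1 correspond to roughly 100 to 120 dB, the same span quoted for the sensors.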
Despite these advances in technology, image/video compression algorithms and the associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8-bit gamma-corrected images. Furthermore, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment.
Current solutions to this problem include tone mapping the HDR content to fit SDR. However, this approach leads to image quality problems when strong dynamic range compression is applied. Even though some HDR-only solutions have been proposed in the literature, they are not interoperable with current SDR infrastructure and are thus typically used in closed systems.
Given the above observations, a research gap was identified: the need for efficient algorithms for the compression of still images and video that can store the full dynamic range and colour gamut of HDR images while remaining backward compatible with existing SDR infrastructure. To improve the usability of SDR content, it is vital that any such algorithm accommodate different tone mapping operators, including those that are spatially non-uniform.
In the course of the research presented in this thesis, a novel two-layer CODEC architecture is introduced for both HDR image and video coding. A universal and computationally efficient approximation of the tone mapping operator is also developed and presented. It is shown that the use of perceptually uniform colourspaces for the internal representation of pixel data improves the compression efficiency of the algorithms. The proposed novel approaches to the compression of metadata for the tone mapping operator are also shown to improve compression performance for low-bitrate video content. Multiple compression algorithms are designed, implemented and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high-level systems design framework with domain-specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented
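The backward-compatible, two-layer idea can be sketched generically as an 8-bit tone-mapped base layer plus a residual layer that restores the HDR values. This is not the thesis's architecture, which the abstract does not detail. The global L/(1+L) operator, the clamp in the inverse, and all function names below are assumptions made for illustration.

```python
from typing import List, Tuple


def tone_map(luminance: float) -> int:
    """Assumed global operator L/(1+L), quantised to an 8-bit SDR code."""
    return round(255 * luminance / (1.0 + luminance))


def inverse_tone_map(code: int) -> float:
    """Approximate inverse of tone_map; the clamp avoids dividing by zero at code 255."""
    s = min(code / 255.0, 0.999)
    return s / (1.0 - s)


def encode(hdr: List[float]) -> Tuple[List[int], List[float]]:
    """Split HDR luminance values into an SDR base layer and a residual layer."""
    base = [tone_map(v) for v in hdr]
    residual = [v - inverse_tone_map(c) for v, c in zip(hdr, base)]
    return base, residual


def decode(base: List[int], residual: List[float]) -> List[float]:
    """Rebuild HDR values; an SDR-only decoder would simply display `base`."""
    return [inverse_tone_map(c) + r for c, r in zip(base, residual)]
```

A legacy SDR decoder would read only the base layer, while an HDR-aware decoder would add the residual. A practical codec would also quantise and entropy-code the residual, which this sketch leaves out.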