DPCM-based edge prediction for lossless screen content coding in HEVC
Screen content sequences are a ubiquitous type of video data in numerous multimedia applications such as video conferencing, remote education, and cloud gaming. These sequences are characterized by a mix of computer-generated graphics, text, and camera-captured material. Such a mix poses several challenges, as the content usually depicts multiple strong discontinuities, which are hard to encode using current techniques. Differential pulse code modulation (DPCM)-based intra-prediction has been shown to improve coding efficiency for these sequences. In this paper, we propose sample-based edge and angular prediction (SEAP), a collection of DPCM-based intra-prediction modes to improve lossless coding of screen content. SEAP is aimed at accurately predicting regions depicting not only camera-captured material, but also those depicting strong edges. It incorporates modes that allow selecting the best predictor for each pixel individually based on the characteristics of the causal neighborhood of the target pixel. We incorporate SEAP into HEVC intra-prediction. Evaluation results on various screen content sequences show the advantages of SEAP over other DPCM-based approaches, with bit-rate reductions of up to 19.56% compared to standardized RDPCM. When used in conjunction with the coding tools of the screen content coding extensions, SEAP provides bit-rate reductions of up to 8.63% compared to RDPCM.
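As a rough illustration of sample-wise DPCM with a predictor chosen per pixel from its causal neighborhood, the sketch below uses the median edge detector (MED) known from JPEG-LS. This is an assumption made for illustration only: the abstract does not specify the SEAP modes, and the function names and zero-valued border handling here are hypothetical.

```python
def med_predict(left, above, above_left):
    """Median edge detector (as in JPEG-LS); NOT the SEAP predictor itself."""
    if above_left >= max(left, above):
        return min(left, above)      # edge detected: take the smaller neighbor
    if above_left <= min(left, above):
        return max(left, above)      # edge detected: take the larger neighbor
    return left + above - above_left  # smooth area: planar prediction


def _neighbors(img, y, x):
    # Assumption: samples outside the picture are treated as 0.
    left = img[y][x - 1] if x > 0 else 0
    above = img[y - 1][x] if y > 0 else 0
    above_left = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return left, above, above_left


def dpcm_residuals(img):
    """Residual of each sample against its MED prediction (raster order)."""
    return [[img[y][x] - med_predict(*_neighbors(img, y, x))
             for x in range(len(img[0]))]
            for y in range(len(img))]


def dpcm_reconstruct(res):
    """Invert dpcm_residuals; lossless because prediction uses exact samples."""
    img = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            img[y][x] = res[y][x] + med_predict(*_neighbors(img, y, x))
    return img
```

On a block with a sharp vertical edge, such as two rows of `[10, 10, 200, 200]`, the second row is predicted exactly, which hints at why per-pixel predictor selection suits screen content.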
Piecewise mapping in HEVC lossless intra-prediction coding
The lossless intra-prediction coding modality of the High Efficiency Video Coding (HEVC) standard provides high coding performance while allowing frame-by-frame access to the coded data. This is of interest in many professional applications such as medical imaging, automotive vision, and digital preservation in libraries and archives. Various improvements to lossless intra-prediction coding have been proposed recently, most of them based on sample-wise prediction using Differential Pulse Code Modulation (DPCM). Other recent proposals aim at further reducing the energy of intra-predicted residual blocks. However, the energy reduction achieved is frequently minimal due to the difficulty of correctly predicting the sign and magnitude of residual values. In this paper, we pursue a novel approach to this energy-reduction problem using piecewise mapping (pwm) functions. Specifically, we analyze the range of values in residual blocks and accordingly apply a pwm function to map specific residual values to unique lower values. We encode appropriate parameters associated with the pwm functions at the encoder, so that the corresponding inverse pwm functions at the decoder can map values back to the same residual values. These residual values are then used to reconstruct the original signal. This mapping is, therefore, reversible and introduces no losses. We evaluate the pwm functions on 4×4 residual blocks computed after DPCM-based prediction for lossless coding of a variety of camera-captured and screen content sequences. Evaluation results show that the pwm functions can attain maximum bit-rate reductions of 5.54% and 28.33% for screen content material compared to DPCM-based and block-wise intra-prediction, respectively. Compared to Intra Block Copy, piecewise mapping can attain maximum bit-rate reductions of 11.48% for camera-captured material.
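To make the idea of a reversible mapping to lower values concrete, here is a minimal sketch. It is an assumption for illustration, not the pwm functions of the paper: it uses one hypothetical parameter, `shift`, derived from the smallest nonzero magnitude in a residual block, and moves every nonzero residual toward zero by that amount. The decoder undoes the mapping using the signaled `shift`.

```python
def forward_map(block):
    """Map nonzero residuals toward zero; returns (mapped block, shift).

    Hypothetical single-parameter mapping, not the paper's pwm functions.
    `block` is a flat list of integer residuals (e.g. 16 values for 4x4).
    """
    magnitudes = [abs(r) for r in block if r != 0]
    if not magnitudes:
        return list(block), 0
    # Magnitudes 1..shift never occur in this block, so they can be skipped.
    shift = min(magnitudes) - 1
    mapped = [r - shift if r > 0 else r + shift if r < 0 else 0 for r in block]
    return mapped, shift


def inverse_map(mapped, shift):
    """Decoder side: restore the original residuals from the mapped values."""
    return [r + shift if r > 0 else r - shift if r < 0 else 0 for r in mapped]
```

Because every nonzero value keeps its sign and stays nonzero after mapping, the inverse is unambiguous and the round trip is lossless.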
Contributions to HEVC Prediction for Medical Image Compression
Medical imaging technology and applications are continuously evolving, dealing with images
of increasing spatial and temporal resolutions, which allow easier and more accurate
medical diagnosis. However, this increase in resolution demands a growing amount of
data to be stored and transmitted. Despite the high coding efficiency achieved by the
most recent image and video coding standards in lossy compression, they are not well
suited for quality-critical medical image compression where either near-lossless or lossless
coding is required.
In this dissertation, two different approaches to improve lossless coding of volumetric
medical images, such as Magnetic Resonance and Computed Tomography, were studied
and implemented using the latest standard, High Efficiency Video Coding (HEVC). In a
first approach, the use of geometric transformations to perform inter-slice prediction was
investigated.
For the second approach, a pixel-wise prediction technique, based on Least-Squares prediction,
that exploits inter-slice redundancy was proposed to extend the current HEVC
lossless tools. Experimental results show a bitrate reduction of between 45% and 49% when
compared with DICOM-recommended encoders, and of 13.7% when compared with standard
HEVC.
Enhanced Inter-Prediction Via Shifting Transformation in the H.264/AVC
Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos
The success of Neural Radiance Fields (NeRFs) for modeling and free-view
rendering of static objects has inspired numerous attempts on dynamic scenes.
Current techniques that utilize neural rendering for facilitating free-view
videos (FVVs) are restricted to either offline rendering or are capable of
processing only brief sequences with minimal motion. In this paper, we present
a novel technique, Residual Radiance Field or ReRF, as a highly compact neural
representation to achieve real-time FVV rendering on long-duration dynamic
scenes. ReRF explicitly models the residual information between adjacent
timestamps in the spatial-temporal feature space, with a global
coordinate-based tiny MLP as the feature decoder. Specifically, ReRF employs a
compact motion grid along with a residual feature grid to exploit inter-frame
feature similarities. We show such a strategy can handle large motions without
sacrificing quality. We further present a sequential training scheme to
maintain the smoothness and the sparsity of the motion/residual grids. Based on
ReRF, we design a special FVV codec that achieves three orders of magnitude
compression rate and provides a companion ReRF player to support online
streaming of long-duration FVVs of dynamic scenes. Extensive experiments
demonstrate the effectiveness of ReRF for compactly representing dynamic
radiance fields, enabling an unprecedented free-viewpoint viewing experience in
speed and quality. Comment: Accepted by CVPR 2023. Project page:
https://aoliao12138.github.io/ReRF
Improving minimum rate predictors algorithm for compression of volumetric medical images
Medical imaging technologies are experiencing a growth in terms of usage and image
resolution, namely in diagnostics systems that require a large set of images, like CT or
MRI. Furthermore, legal restrictions impose that these scans must be archived for several
years. These facts led to the increase of storage costs in medical image databases and
institutions. Thus, a demand for more efficient compression tools, used for archiving and
communication, is arising.
Currently, the DICOM standard, which makes recommendations for medical communications
and imaging compression, recommends lossless encoders such as JPEG, RLE,
JPEG-LS and JPEG2000. However, none of these encoders include inter-slice prediction
in their algorithms.
This dissertation presents the research work on medical image compression, using the
MRP encoder. MRP is one of the most efficient lossless image compression algorithms.
Several processing techniques are proposed to adapt the input medical images to the
encoder characteristics. Two of these techniques, namely changing the alignment of slices
for compression and a pixel-wise difference predictor, increased the compression efficiency
of MRP, by up to 27.9%.
Inter-slice prediction support was also added to MRP, using uni- and bi-directional techniques.
Also, the pixel-wise difference predictor was added to the algorithm. Overall, the
compression efficiency of MRP was improved by 46.1%. Thus, these techniques allow for
compression ratio savings of 57.1%, compared to DICOM encoders, and 33.2%, compared
to HEVC RExt Random Access. This makes MRP the most efficient of the encoders
under study.
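The abstract does not describe how the pixel-wise difference predictor works. As a hedged sketch of the general idea of exploiting inter-slice redundancy, the code below assumes the simplest form: each slice is replaced by its per-pixel difference from the previous slice. The function names are hypothetical and this is not the MRP implementation.

```python
def slice_differences(volume):
    """Keep the first slice; replace each later slice by (current - previous).

    Assumed form of an inter-slice difference, for illustration only.
    `volume` is a list of slices, each a list of rows of integer pixels.
    """
    out = [[row[:] for row in volume[0]]]
    for prev, cur in zip(volume, volume[1:]):
        out.append([[c - p for c, p in zip(cur_row, prev_row)]
                    for cur_row, prev_row in zip(cur, prev)])
    return out


def restore_slices(diffs):
    """Invert slice_differences by accumulating the differences slice by slice."""
    volume = [[row[:] for row in diffs[0]]]
    for diff in diffs[1:]:
        prev = volume[-1]
        volume.append([[d + p for d, p in zip(diff_row, prev_row)]
                       for diff_row, prev_row in zip(diff, prev)])
    return volume
```

When neighboring slices are similar, as is typical for CT or MRI volumes, the difference slices contain mostly small values, which are cheaper to code losslessly.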
Advanced heterogeneous video transcoding
PhD thesis. Video transcoding is an essential tool to promote inter-operability
between different video communication systems. This thesis presents
two novel video transcoders, both operating on bitstreams of the current
H.264/AVC standard. The first transcoder converts H.264/AVC
bitstreams to a Wavelet Scalable Video Codec (W-SVC), while the second targets the emerging High Efficiency Video Coding (HEVC).
Scalable Video Coding (SVC) enables low complexity adaptation
of compressed video, providing an efficient solution for content delivery
through heterogeneous networks. The transcoder proposed here aims at
exploiting the advantages offered by SVC technology when dealing with
conventional coders and legacy video, efficiently reusing information
found in the H.264/AVC bitstream to achieve a high rate-distortion
performance at a low complexity cost. Its main features include new
mode mapping algorithms that exploit the W-SVC larger macroblock
sizes, and a new state-of-the-art motion vector composition algorithm
that is able to tackle different coding configurations in the H.264/AVC
bitstream, including IPP or IBBP with multiple reference frames.
The emerging video coding standard, HEVC, is currently approaching the final stage of development prior to standardization. This thesis
proposes and evaluates several transcoding algorithms for the HEVC
codec. In particular, a transcoder based on a new method that is capable of complexity scalability, trading off rate-distortion performance
for complexity reduction, is proposed. Furthermore, other transcoding solutions are explored, based on a novel content-based modeling
approach, in which the transcoder adapts its parameters based on the
contents of the sequence being encoded.
Finally, the application of this research is not constrained to these
transcoders, as many of the techniques developed aim to advance
research in this field and have the potential to be
incorporated in different video transcoding architectures.
A new algorithm for compression of seismic data with high amplitude resolution (Novi algoritam za kompresiju seizmičkih podataka velike amplitudske rezolucije)
Renewable sources cannot meet the energy demand of a growing global market. Therefore, it is expected that oil and gas will remain a substantial source of energy in the coming years. To find new oil and gas deposits that would satisfy growing global energy demands, significant efforts are constantly invested in finding ways to increase the efficiency of seismic surveys. It is commonly considered that, in the initial phase of exploration and production of new fields, high-resolution and high-quality images of the subsurface are of great importance. As one part of the seismic data processing chain, efficient management and delivery of the large data sets produced by the industry during seismic surveys becomes extremely important in order to facilitate further seismic data processing and interpretation. In this respect, efficiency relies to a large extent on the efficiency of the compression scheme, which is often required to enable faster transfer of and access to data, as well as efficient data storage. Motivated by the superior performance of High Efficiency Video Coding (HEVC), and driven by the rapid growth in data volume produced by seismic surveys, this work explores a 32 bits per pixel (b/p) extension of the HEVC codec for compression of seismic data. It is proposed to reassemble seismic slices in a format that corresponds to a video signal and to benefit from the coding gain achieved by HEVC inter mode, besides the possible advantages of the (still image) HEVC intra mode. To this end, this work modifies almost all components of the original HEVC codec to cater for high bit-depth coding of seismic data: the Lagrange multiplier used in optimization of the coding parameters has been adapted to the new data statistics, the core transform and quantization have been reimplemented to handle the increased bit-depth range, and a modified adaptive binary arithmetic coder has been employed for efficient entropy coding.
In addition, optimized block selection, reduced intra prediction modes, and flexible motion estimation are tested to adapt to the structure of seismic data. Even though the new codec, after implementation of the proposed modifications, goes beyond the standardized HEVC, it still maintains a generic HEVC structure and is developed under the general HEVC framework. There is no similar work in the field of seismic data compression that uses HEVC as a base codec. Thus, a specific codec design has been tailored which, when compared to JPEG-XR and a commercial wavelet-based codec, significantly improves the peak signal-to-noise ratio (PSNR) vs. compression ratio performance for 32 b/p seismic data. Depending on the proposed configuration, the PSNR gain ranges from 3.39 dB up to 9.48 dB. Also, relying on the specific characteristics of seismic data, an optimized encoder is proposed in this work. It reduces encoding time by 67.17% for the All-I configuration on the trace image dataset, and by 67.39% for the All-I configuration, 97.96% for the P2-configuration, and 98.64% for the B-configuration on the 3D wavefield dataset, with negligible coding performance losses. As a side contribution of this work, HEVC is analyzed within all of its functional units, so that the presented work itself can serve as a specific overview of the methods incorporated into the standard.
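For readers unfamiliar with the PSNR figures quoted above, the following sketch computes PSNR in its standard form, 10·log10(peak² / MSE). The peak value for 32 b/p data is an assumption: the abstract does not say whether samples are integers or floats, so the example passes the peak explicitly and uses 2**32 - 1 only as a hypothetical choice for unsigned 32-bit integers.

```python
import math


def psnr(original, decoded, peak):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    `peak` is the maximum possible sample value; for 32 b/p data it is
    ASSUMED here to be 2**32 - 1 (unsigned integers), which the source
    does not confirm.
    """
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: distortion-free
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical usage with an assumed unsigned 32-bit peak value.
PEAK_32BIT = 2 ** 32 - 1
example_db = psnr([0, 1000, 2000], [0, 1001, 1999], PEAK_32BIT)
```

A gain of a few dB at the same compression ratio, as reported in the abstract, corresponds to a substantially lower mean squared error, since PSNR is logarithmic in the MSE.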
Dense light field coding: a survey
Light Field (LF) imaging is a promising solution for providing more immersive and closer to reality multimedia experiences to end-users with unprecedented creative freedom and flexibility for applications in different areas, such as virtual and augmented reality. Due to the recent technological advances in optics, sensor manufacturing and available transmission bandwidth, as well as the investment of many tech giants in this area, it is expected that soon many LF transmission systems will be available to both consumers and professionals. Recognizing this, novel standardization initiatives have recently emerged in both the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG), triggering the discussion on the deployment of LF coding solutions to efficiently handle the massive amount of data involved in such systems.
Since then, the topic of LF content coding has become a booming research area, attracting the attention of many researchers worldwide. In this context, this paper provides a comprehensive survey of the most relevant LF coding solutions proposed in the literature, focusing on angularly dense LFs. Special attention is placed on a thorough description of the different LF coding methods and on the main concepts related to this relevant area. Moreover, comprehensive insights are presented into open research challenges and future research directions for LF coding.