
    MAP Joint Source-Channel Arithmetic Decoding for Compressed Video

    In order to have robust video transmission over error-prone telecommunication channels, several mechanisms are introduced. These mechanisms try to detect, correct, or conceal the errors in the received video stream. In this thesis, the performance of the video codec is improved in terms of error rates without increasing the overhead in terms of data bit rate. This is done by exploiting the residual syntactic/semantic redundancy inside compressed video, optimizing the configuration of the state-of-the-art entropy coding, i.e., binary arithmetic coding, and optimizing the quantization of the channel output. The thesis is divided into four phases. In the first phase, a breadth-first suboptimal sequential maximum a posteriori (MAP) decoder is employed for joint source-channel arithmetic decoding of H.264 symbols. The proposed decoder uses not only the intentional redundancy inserted via a forbidden symbol (FS) but also exploits residual redundancy through a syntax checker. In contrast to previous methods, this is done as each channel bit is decoded. Simulations using intra prediction modes show improvements in error rates, e.g., a reduction in syntax element error rate by an order of magnitude for a channel SNR of 7.33 dB. The cost of this improvement is the additional computational complexity spent on syntax checking. In the second phase, the configuration of the FS in the symbol set is studied. The delay probability function, i.e., the probability of the number of bits required to detect an error, is calculated for various FS configurations. The probability of missed error detection is calculated as a figure of merit for optimizing the FS configuration. The simulation results show the effectiveness of the proposed figure of merit and indicate that the FS configuration in which the FS lies entirely between the other information-carrying symbols is the best. In the third phase, a new method for estimating the a priori probability of particular syntax elements is proposed. This estimation is based on the interdependency among previously decoded syntax elements, and each estimate is categorized as either reliable or unreliable. The decoder uses this prior information when it is reliable; otherwise, the MAP decoder treats the syntax elements as equiprobable and in turn uses maximum likelihood (ML) decoding. The reliability detection is carried out using a threshold on the local entropy of syntax elements in the neighboring macroblocks. In the last phase, a new measure to assess the performance of the channel quantizer is proposed. This measure is based on the statistics of the rank of the true candidate in the sorted list of candidates produced by the MAP decoder. Simulation results show that a quantizer designed with the proposed measure is superior to quantizers designed for maximum mutual information or minimum mean square error.
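    To make the forbidden-symbol idea concrete, the following minimal sketch (not the thesis implementation; the symbol probabilities, the reserved mass EPS, and the toy message are assumptions) reserves a fraction of every arithmetic-coding interval for a forbidden symbol. A decoder that lands in that region knows a channel error occurred; as the example shows, detection may also be delayed or missed, which is exactly what the delay and missed-detection analysis in the second phase quantifies.

        # Floating-point binary arithmetic coder with a forbidden symbol (FS).
        # The FS occupies the top EPS fraction of every coding interval and is
        # never encoded, so decoding into it signals a corrupted bitstream.
        EPS = 0.1                 # probability mass reserved for the FS (assumed)
        P0 = 0.6 * (1 - EPS)      # scaled probability of bit 0 (assumed source)
        P1 = 0.4 * (1 - EPS)      # scaled probability of bit 1

        def encode(bits):
            low, high = 0.0, 1.0
            for b in bits:
                width = high - low
                if b == 0:
                    high = low + width * P0
                else:
                    low, high = low + width * P0, low + width * (P0 + P1)
            return (low + high) / 2            # any value inside the final interval

        def decode(value, n_bits):
            low, high = 0.0, 1.0
            out = []
            for _ in range(n_bits):
                width = high - low
                pos = (value - low) / width
                if pos < P0:
                    out.append(0)
                    high = low + width * P0
                elif pos < P0 + P1:
                    out.append(1)
                    low, high = low + width * P0, low + width * (P0 + P1)
                else:                          # landed in the forbidden region
                    raise ValueError("forbidden symbol decoded: channel error detected")
            return out

        msg = [0, 1, 1, 0, 0, 1]
        code = encode(msg)
        assert decode(code, len(msg)) == msg   # clean channel: exact recovery
        try:
            decode(min(0.999, code + 0.2), len(msg))   # crude stand-in for channel noise
            print("corruption not caught within", len(msg), "bits (missed detection)")
        except ValueError as err:
            print(err)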

    Error resilience and concealment techniques for high-efficiency video coding

    This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study of error robustness revealed that HEVC offers weak protection against network losses, with a significant impact on decoded video quality. Based on this evidence, the first contribution of this work is a new method to reduce the temporal dependencies between motion vectors, which improves the decoded video quality without compromising compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions when video streams are received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the streaming stage, a prioritization algorithm based on spatial dependencies selects a reduced set of motion vectors to be transmitted as side information, to reduce mismatched motion predictions at the decoder. The problem of error-concealment-aware video coding is also investigated to enhance the overall error robustness. A new approach based on scalable coding and optimal error concealment selection is proposed, where the optimal error concealment modes are found by simulating transmission losses, followed by a saliency-weighted optimisation. Moreover, recovery residual information is encoded using a rate-controlled enhancement layer. Both are transmitted to the decoder to be used in case of data loss. Finally, an adaptive error resilience scheme is proposed to dynamically predict the video stream that achieves the highest decoded quality for a particular loss case. A neural network selects among the various video streams, encoded with different levels of compression efficiency and error protection, based on information from the video signal, the coded stream and the transmission network. Overall, the new robust video coding methods investigated in this thesis yield consistent quality gains in comparison with existing methods, including those implemented in the HEVC reference software. Furthermore, the trade-off between coding efficiency and error robustness is also better in the proposed methods.
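    As a rough illustration of the constrained reference-picture distribution described above, the sketch below (hypothetical costs, Lagrange multiplier, and cap; not the thesis algorithm) picks, for each block, the reference picture minimizing the Lagrangian cost J = D + lambda*R while limiting how many blocks may predict from any single reference, so that losing one reference cannot corrupt too much of the frame.

        LAMBDA = 0.85            # Lagrange multiplier (assumed)
        MAX_USES_PER_REF = 2     # cap on predictions from one reference (assumed)

        # blocks[i][ref] = (distortion, rate) for predicting block i from reference ref
        blocks = [
            {0: (10.0, 4.0), 1: (12.0, 3.5), 2: (15.0, 3.0)},
            {0: ( 8.0, 4.2), 1: (11.0, 3.8), 2: (14.0, 3.1)},
            {0: ( 9.0, 4.1), 1: ( 9.5, 3.9), 2: (13.0, 3.2)},
            {0: ( 7.5, 4.3), 1: (10.0, 3.7), 2: (12.5, 3.3)},
        ]

        uses = {}                          # how many blocks already use each reference
        assignment = []
        for cand in blocks:
            # rank candidates by Lagrangian cost J = D + lambda * R
            ranked = sorted(cand.items(), key=lambda kv: kv[1][0] + LAMBDA * kv[1][1])
            for ref, (d, r) in ranked:
                if uses.get(ref, 0) < MAX_USES_PER_REF:   # constraint check
                    uses[ref] = uses.get(ref, 0) + 1
                    assignment.append(ref)
                    break

        print("reference per block:", assignment)   # e.g. [0, 0, 1, 1]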

    Mode decision for the H.264/AVC video coding standard

    The H.264/AVC video coding standard offers a very promising future for video broadcasting and communication because of its high coding efficiency compared with older video coding standards. However, high coding efficiency also carries high computational complexity. Fast motion estimation and fast mode decision are two very useful techniques which can significantly reduce this computational complexity. This thesis focuses on the field of fast mode decision. The goal of this thesis is to find new fast mode decision techniques that afford significant time savings while achieving rate-distortion (RD) performance very similar to that of the H.264/AVC video coding standard. [Continues.]
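    The general principle behind fast mode decision can be sketched as follows (illustrative only; the candidate ordering, cost table, and early-termination threshold are assumptions, not the techniques proposed in the thesis): candidate modes are evaluated in decreasing order of likelihood and the search stops as soon as the Lagrangian RD cost falls below a threshold, trading a negligible RD loss for large encoding-time savings.

        LAMBDA = 0.85
        EARLY_STOP_COST = 20.0          # assumed early-termination threshold

        def rd_cost(mode, block):
            """Hypothetical RD evaluation: distortion plus weighted rate for this mode."""
            table = {"SKIP": (30.0, 0.5), "16x16": (18.0, 2.0),
                     "8x8": (12.0, 6.0), "4x4": (10.0, 9.0)}
            d, r = table[mode]
            return d + LAMBDA * r

        def decide_mode(block, candidate_order=("SKIP", "16x16", "8x8", "4x4")):
            best_mode, best_cost, tested = None, float("inf"), 0
            for mode in candidate_order:        # most probable modes first
                cost = rd_cost(mode, block)
                tested += 1
                if cost < best_cost:
                    best_mode, best_cost = mode, cost
                if best_cost < EARLY_STOP_COST: # early termination: skip the rest
                    break
            return best_mode, best_cost, tested

        mode, cost, tested = decide_mode(block=None)
        print(f"chosen {mode} at cost {cost:.2f} after testing {tested} of 4 modes")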

    3D Medical Image Lossless Compressor Using Deep Learning Approaches

    The ever-increasing importance of accelerated information processing, communication, and storage is a major requirement of the big-data era. With the extensive rise in data availability, ease of information acquisition, and growing data rates, a critical challenge emerges in handling data efficiently. Even with advanced hardware developments and the availability of multiple Graphics Processing Units (GPUs), there is still a strong demand to utilise these technologies effectively. Healthcare systems are one of the domains yielding explosive data growth, especially when considering modern scanners, which produce higher-resolution and more densely sampled medical images every year, with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage would essentially be handled with an effective compression method. Since medical information is critical and plays an influential role in diagnostic accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. Given the revolutionary impact of Deep Learning (DL) methods in solving many tasks while achieving state-of-the-art results, including in data compression, this opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy performance using learning-based approaches, less attention has been paid to lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly. Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given a sequence of samples from its spatially surrounding voxels. Such 3D local sampling efficiently exploits spatial similarities and redundancies in a volumetric medical context. The proposed NN-based data predictor is trained to minimise the differences with the original data values, while the residual errors are encoded using arithmetic coding to allow lossless reconstruction. Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as 3D predictors for learning the mapping function from the spatial medical domain (16 bit-depth). We analyse the generalisability and robustness of Long Short-Term Memory (LSTM) models in capturing the 3D spatial dependencies of a voxel's neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in losslessly compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities, compared to other state-of-the-art lossless compression standards. This work then investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16 bit-depth) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution to the non-determinism problem is also proposed, allowing models to run in parallel without a significant drop in compression performance. Experimental evaluations against well-known lossless codecs were carried out on datasets acquired by different hospitals, representing different body segments and distinct scanning modalities (i.e., CT and MRI). To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained with the presented importance sampling scheme is evaluated against alternative strategies such as uniform, Gaussian, and slice-based sampling.
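    A minimal sketch of the many-to-one prediction paradigm described above (assumptions throughout: the neighbourhood length, network size, and normalisation are illustrative, and this is not the MedZip implementation): an LSTM predicts a target voxel from a sequence of previously decoded neighbouring voxels; in the actual lossless pipeline the integer residual between the original and the prediction is arithmetic-coded, and the decoder, holding the same model and the same causal neighbourhood, reproduces the prediction and adds the residual back.

        import torch
        import torch.nn as nn

        N_NEIGHBOURS = 8        # length of the causal neighbourhood sequence (assumed)

        class VoxelPredictor(nn.Module):
            def __init__(self, hidden=32):
                super().__init__()
                self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
                self.head = nn.Linear(hidden, 1)   # many-to-one: last state -> voxel value

            def forward(self, neighbours):          # neighbours: (batch, N_NEIGHBOURS, 1)
                out, _ = self.lstm(neighbours)
                return self.head(out[:, -1, :])     # prediction of the target voxel

        model = VoxelPredictor()

        # toy 16-bit voxel intensities, normalised to [0, 1] for the network
        neigh = torch.randint(0, 2**16, (4, N_NEIGHBOURS, 1)).float() / 65535.0
        target = torch.randint(0, 2**16, (4, 1)).float() / 65535.0

        pred = model(neigh)
        residual = target - pred      # in the real codec this is an integer residual
        recon = pred + residual       # decoder side: prediction + decoded residual
        assert torch.allclose(recon, target)   # reconstruction matches the original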

    Visual and Geometric Data Compression for Immersive Technologies

    The contributions of this thesis are new compression algorithms for light field images and point cloud geometry. Light field imaging has attracted wide attention in the past decade, partly due to the emergence of relatively low-cost handheld light field cameras designed for commercial purposes, whereas point clouds are used increasingly often in immersive technologies, replacing other forms of 3D representation. We obtain successful coding performance by combining conventional image processing methods, entropy coding, learning-based disparity estimation and optimization of neural networks for context probability modeling. On the light field coding side, we develop a lossless light field coding method which uses learning-based disparity estimation to predict any view in a light field from a set of reference views. On the point cloud geometry compression side, we develop four different algorithms. The first two of these algorithms follow the so-called bounding volumes approach, which initially represents a part of the point cloud in two depth maps; the remaining points of the cloud are contained in a bounding volume that can be derived using only the two losslessly transmitted depth maps. One of the two algorithms is a lossy coder that reconstructs some of the remaining points in several steps involving conventional image processing and image coding techniques. The other is a lossless coder which applies a novel context arithmetic coding approach involving gradual expansion of the reconstructed point cloud into neighboring voxels. The last two of the proposed point cloud compression algorithms use neural networks for context probability modeling when coding the octree representation of point clouds with arithmetic coding. One of these two algorithms is a learning-based intra-frame coder which requires an initial training stage on a set of training point clouds. The last algorithm is an inter-frame (sequence) encoder which incorporates the neural network training into the encoding stage; thus, for each sequence of point clouds, a specific neural network model is optimized and transmitted as a header in the bitstream.
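    For readers unfamiliar with the octree representation mentioned above, the following sketch (illustrative only, not any of the four proposed coders) serialises a voxelised point cloud breadth-first into per-node occupancy bytes, each indicating which of the eight child octants contain points; these bytes are the symbols that a context probability model, neural or otherwise, would feed to an arithmetic coder.

        def occupancy_bytes(points, origin, size, max_depth):
            """points: set of integer (x, y, z) voxels inside a cube of side `size`."""
            stream, queue = [], [(points, origin, size)]
            for _ in range(max_depth):
                next_queue = []
                for pts, (ox, oy, oz), s in queue:
                    half, byte = s // 2, 0
                    children = [set() for _ in range(8)]
                    for (x, y, z) in pts:
                        # child index from the three half-space tests
                        idx = ((x >= ox + half) << 2) | ((y >= oy + half) << 1) | (z >= oz + half)
                        children[idx].add((x, y, z))
                    for i, child in enumerate(children):
                        if child:
                            byte |= 1 << i          # mark occupied octant
                            cx = ox + half * ((i >> 2) & 1)
                            cy = oy + half * ((i >> 1) & 1)
                            cz = oz + half * (i & 1)
                            next_queue.append((child, (cx, cy, cz), half))
                    stream.append(byte)             # one occupancy byte per node
                queue = next_queue
            return stream

        cloud = {(0, 0, 0), (3, 3, 3), (3, 0, 2)}
        print(occupancy_bytes(cloud, origin=(0, 0, 0), size=4, max_depth=2))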

    Proceedings of the Eighth Workshop on Information Theoretic Methods in Science and Engineering

    Proceedings of the Eighth Workshop on Information Theoretic Methods in Science and Engineering (WITMSE 2015), held in Copenhagen, Denmark, 24-26 June 2015; published in the series of the Department of Computer Science, University of Helsinki. Peer reviewed.

    Remote Sensing Data Compression

    A huge amount of data is acquired nowadays by different remote sensing systems installed on satellites, aircraft, and UAVs. The acquired data then have to be transferred to image processing centres, stored, and/or delivered to customers. In restricted scenarios, data compression is strongly desired or necessary. A wide diversity of coding methods can be used, depending on the requirements and their priority. In addition, the types and properties of images differ considerably; thus, practical implementation aspects have to be taken into account. The Special Issue paper collection taken as the basis of this book touches on all of the aforementioned items to some degree, giving the reader an opportunity to learn about recent developments and research directions in the field of image compression. In particular, lossless and near-lossless compression of multi- and hyperspectral images remains a current topic, since such images constitute extremely large data arrays with rich information that can be retrieved for various applications. Another important aspect is the impact of lossless compression on image classification and segmentation, where a reasonable compromise between the characteristics of compression and the final tasks of data processing has to be achieved. The problems of data transmission from UAV-based acquisition platforms, as well as the use of FPGAs and neural networks, have become very important. Finally, attempts to apply compressive sensing approaches in remote sensing image processing, with positive outcomes, are observed. We hope that readers will find our book useful and interesting.

    Attention Driven Solutions for Robust Digital Watermarking Within Media

    As digital technologies have dramatically expanded within the last decade, content recognition now plays a major role in the control of media. Of the systems currently available, digital watermarking provides a robust, maintainable solution to enhance media security. The two main properties of digital watermarking, imperceptibility and robustness, are complementary to each other, but by employing visual attention-based mechanisms within the watermarking framework, highly robust watermarking solutions are obtainable while also maintaining high media quality. This thesis firstly provides suitable bottom-up saliency models for raw images and video. The image and video saliency algorithms are estimated directly in the wavelet domain for enhanced compatibility with the watermarking framework. By combining colour, orientation and intensity contrasts for the image model and globally compensated object motion in the video model, novel wavelet-based visual saliency algorithms are provided. The work extends these saliency models into a unique visual attention-based watermarking scheme by increasing the watermark weighting parameter within visually uninteresting regions. An increase in watermark robustness of up to 40% against various filtering attacks, JPEG2000 and H.264/AVC compression is obtained while maintaining media quality, as verified by various objective and subjective evaluation tools. As most video sequences are stored in an encoded format, this thesis also studies watermarking schemes within the compressed domain. Firstly, the work provides a compressed-domain saliency model formulated directly within the HEVC codec, utilizing various coding decisions such as block partition size, residual magnitude, intra-frame angular prediction mode and motion vector difference magnitude. Large computational savings, of 50% or greater, are obtained compared with existing methodologies, as the saliency maps are generated from partially decoded bitstreams. Finally, the saliency maps formulated within the compressed HEVC domain are studied within the watermarking framework. A joint encoder scheme and a frame-domain watermarking scheme are both proposed, embedding data into the quantised transform residual data or wavelet coefficients, respectively, which exhibit low visual salience.
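    The attention-driven embedding principle can be sketched as follows (assumed strengths, subband stand-in, and detector; not the thesis scheme): an additive spread-spectrum watermark whose per-coefficient strength is scaled inversely with a visual-saliency map, so that more energy is hidden in visually uninteresting regions while salient regions are barely touched.

        import numpy as np

        rng = np.random.default_rng(0)
        coeffs = rng.normal(0, 10, size=(8, 8))        # stand-in for a wavelet subband
        saliency = rng.uniform(0, 1, size=(8, 8))      # stand-in for a saliency map in [0, 1]

        ALPHA_MIN, ALPHA_MAX = 0.5, 2.0                # assumed embedding strengths
        watermark = rng.choice([-1.0, 1.0], size=coeffs.shape)   # pseudo-random +/-1 pattern

        # low saliency -> strength near ALPHA_MAX, high saliency -> near ALPHA_MIN
        alpha = ALPHA_MAX - (ALPHA_MAX - ALPHA_MIN) * saliency
        marked = coeffs + alpha * watermark

        # blind detection by correlating with the known pattern
        correlation = float(np.mean(marked * watermark))
        print("detector correlation:", round(correlation, 3))   # positive => watermark present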

    The Fifth NASA Symposium on VLSI Design

    The fifth annual NASA Symposium on VLSI Design had 13 sessions, including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next-generation advances that will serve as a basis for future VLSI design.

    Design of large polyphase filters in the Quadratic Residue Number System
