4 research outputs found

    DCT-based Image/Video Compression: New Design Perspectives

    To push the envelope of DCT-based lossy image/video compression, this thesis revisits the design of several fundamental blocks in image/video coding, ranging from source modeling and quantization table design to quantizers and entropy coding.

    Firstly, to better handle the heavy-tail phenomenon commonly seen in DCT coefficients, a new model dubbed the transparent composite model (TCM) is developed and justified. Given a sequence of DCT coefficients, the TCM first separates the tail from the main body of the sequence, then uses a uniform distribution to model the coefficients in the heavy tail and a parametric distribution to model those in the main body. The separation boundary and the other distribution parameters are estimated online via maximum likelihood (ML) estimation. Efficient online algorithms are proposed for parameter estimation, and their convergence is proved. When the parametric distribution is a truncated Laplacian, the resulting model, dubbed the Laplacian TCM (LPTCM), not only achieves superior modeling accuracy with low estimation complexity, but also offers good nonlinear data-reduction capability by identifying and separating each DCT coefficient in the heavy tail (referred to as an outlier) from those in the main body (referred to as inliers). This in turn opens up opportunities for its use in DCT-based image compression.

    Secondly, quantization table design is revisited for image/video coding with soft-decision quantization (SDQ). Unlike conventional approaches, where quantization table design is bundled with a specific encoding method, we assume optimal SDQ encoding and design the quantization table for the purpose of reconstruction. Under this assumption, we model transform coefficients across different frequencies as independently distributed random sources and apply the Shannon lower bound to approximate the rate-distortion function of each source. We then show that a quantization table can be optimized so that the resulting distortion follows a certain profile, yielding the so-called optimal distortion profile scheme (OptD). Guided by this theoretical result, we present an efficient statistical-model-based algorithm that uses the Laplacian model to design quantization tables for DCT-based image compression. When applied to standard JPEG encoding, it provides more than 1.5 dB of PSNR gain with almost no additional complexity. Compared with the state-of-the-art JPEG quantization table optimizer, the proposed algorithm offers an average 0.5 dB gain with computational complexity reduced by a factor of more than 2000 when SDQ is off, and a gain of 0.1 dB or more with an 85% reduction in complexity when SDQ is on.

    Thirdly, based on the LPTCM and OptD, we further propose an efficient non-predictive DCT-based image compression system in which the quantizers and entropy coding are completely redesigned, and a corresponding SDQ algorithm is also developed. In terms of rate versus visual quality, the proposed system achieves overall coding results that are among the best, comparable to those of H.264 and HEVC intra (predictive) coding. In terms of rate versus objective quality, it significantly outperforms baseline JPEG by more than 4.3 dB on average with a moderate increase in complexity, and outperforms ECEB, the state-of-the-art non-predictive image codec, by 0.75 dB when SDQ is off at the same level of computational complexity, and by 1 dB when SDQ is on at the cost of extra complexity.
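    The following is a minimal sketch of the LPTCM fitting idea described above, not the thesis's efficient online algorithm: for each candidate separation boundary it fits a truncated Laplacian to the main body by ML over a coarse scale grid, models the tail as uniform, and keeps the boundary with the highest total log-likelihood. All function and parameter names are illustrative.

```python
import numpy as np

def lptcm_fit(coeffs, n_boundaries=64, n_scales=64):
    """Grid-search ML fit of a Laplacian transparent composite model (LPTCM)
    to DCT coefficient magnitudes: truncated Laplacian body on [0, yc],
    uniform tail on (yc, a]."""
    y = np.sort(np.abs(np.asarray(coeffs, dtype=float)))
    a = y[-1] + 1e-9                          # upper edge of the support
    scales = np.linspace(max(0.1 * y.mean(), 1e-6),
                         max(3.0 * y.mean(), 1e-3), n_scales)
    best = (-np.inf, None, None, None)        # (loglik, yc, scale, body mass)
    for yc in np.quantile(y, np.linspace(0.80, 0.999, n_boundaries)):
        n_in = int(np.searchsorted(y, yc, side="right"))
        n_out = len(y) - n_in
        if n_in == 0 or yc >= a:
            continue
        b = n_in / len(y)                     # probability mass of the body
        body = y[:n_in]
        # Profile out the Laplacian scale: truncated-Laplacian log-likelihood
        ll_body = [-n_in * np.log(s) - body.sum() / s
                   - n_in * np.log1p(-np.exp(-yc / s)) for s in scales]
        k = int(np.argmax(ll_body))
        # Tail coefficients (the outliers) are uniform on (yc, a]
        ll_tail = n_out * np.log((1.0 - b) / (a - yc)) if n_out else 0.0
        ll = n_in * np.log(b) + ll_body[k] + ll_tail
        if ll > best[0]:
            best = (ll, yc, scales[k], b)
    return best

# Coefficients with magnitude above the fitted boundary yc are the outliers
# the abstract mentions; everything at or below yc is an inlier.
```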
    In comparison with H.264 intra coding, our system provides an overall gain of roughly 0.4 dB with dramatically reduced computational complexity. It offers comparable or even better coding performance than HEVC intra coding in the high-rate region or for complicated images, with less than 5% of the latter's encoding complexity. In addition, the proposed DCT-based image compression system offers a multiresolution capability, which, together with its comparatively high coding efficiency and low complexity, makes it a good alternative for real-time image processing applications.
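    As an illustration of the OptD idea sketched above, the snippet below performs a classical equal-distortion (reverse water-filling) allocation under the Shannon lower bound for independent Laplacian sources, then converts each per-frequency distortion into a uniform-quantizer step via the high-rate rule D ≈ q²/12. This is only one plausible reading: the thesis derives its own optimal distortion profile, which need not coincide with this textbook allocation, and all names here are illustrative.

```python
import numpy as np

def optd_style_table(lap_scales, d_target):
    """Allocate per-frequency distortions for independent Laplacian DCT
    sources and map them to quantization steps.

    lap_scales: per-frequency Laplacian scale b_i (variance = 2 * b_i^2)
    d_target:   desired mean distortion (MSE) over all frequencies
    """
    var = 2.0 * np.asarray(lap_scales, dtype=float) ** 2
    # Reverse water-filling: one common distortion theta across sources,
    # clipped at each source's variance (weak sources get D_i = var_i).
    lo, hi = 0.0, var.max()
    for _ in range(60):                       # bisection on the water level
        theta = 0.5 * (lo + hi)
        if np.minimum(theta, var).mean() < d_target:
            lo = theta
        else:
            hi = theta
    d = np.minimum(theta, var)
    return np.sqrt(12.0 * d)                  # high-rate rule: D ~ q^2 / 12

# Example: an 8x8 table from per-frequency scales estimated off an image.
# steps = optd_style_table(estimated_scales.ravel(), d_target=30.0).reshape(8, 8)
```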

    Joint Compression and Watermarking Using Variable-Rate Quantization and its Applications to JPEG

    In digital watermarking, one embeds a watermark into a covertext in such a way that the resulting watermarked signal is robust to a certain amount of distortion, caused either by standard data processing in a friendly environment or by malicious attacks in an unfriendly environment. Beyond robustness, a good watermarking system should meet two other, conflicting requirements: perceptual quality, meaning the distortion incurred to the original signal should be small, and payload, meaning the amount of information embedded (the embedding rate) should be as high as possible. To a large extent, digital watermarking is the science and art of designing systems that trade off these three conflicting requirements.

    Since watermarked signals usually need to be compressed in real-world applications, we study the design and analysis of joint watermarking and compression (JWC) systems to achieve efficient tradeoffs among embedding rate, compression rate, distortion, and robustness. Using variable-rate scalar quantization, an optimum encoding and decoding scheme for JWC systems is designed and analyzed to maximize robustness in the presence of additive Gaussian attacks under constraints on both compression distortion and composite rate. Simulation results show that, in comparison with previous JWC designs based on fixed-rate scalar quantization, optimum JWC systems using variable-rate scalar quantization achieve better performance in the distortion-to-noise ratio region of practical interest.

    Motivated by this performance, we then investigate applications of JWC to image compression. We design a joint image compression and blind watermarking system that maximizes compression rate-distortion performance while maintaining baseline JPEG decoder compatibility and satisfying the additional constraints imposed by watermarking. Two embedding schemes, odd-even watermarking (OEW) and zero-nonzero watermarking (ZNW), are proposed for robustness to a class of standard JPEG recompression attacks. To maximize compression performance, two corresponding alternating algorithms are developed to jointly optimize run-length coding, Huffman coding, and quantization table selection subject to the constraints imposed by OEW and ZNW, respectively. Both algorithms are demonstrated to achieve better compression performance than the DQW and DEW algorithms from the recent literature. Compared with the OEW scheme, the ZNW embedding method sacrifices some payload but gains robustness against other types of attacks; in particular, ZNW can survive a class of valumetric distortion attacks, including additive noise, amplitude changes, and recompression, in everyday usage.
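    As a toy illustration of the odd-even embedding rule described above (the thesis couples it with joint run-length/Huffman/quantization-table optimization, which this sketch omits), each watermark bit is forced onto the parity of a nonzero quantized DCT coefficient, leaving zero coefficients untouched so the run-length structure is preserved. The helper names are hypothetical.

```python
def oew_embed(q_coeffs, bits):
    """Embed watermark bits in the parity of nonzero quantized coefficients."""
    out = list(q_coeffs)
    bit_iter = iter(bits)
    for i, c in enumerate(out):
        if c == 0:
            continue                    # keep zeros so run lengths survive
        try:
            b = next(bit_iter)
        except StopIteration:
            break                       # all bits embedded
        if abs(c) % 2 != b:
            # Flip parity by one quantizer step; grow |c| when it is 1 so
            # no nonzero coefficient collapses to 0.
            delta = 1 if abs(c) == 1 else -1
            out[i] = c + delta * (1 if c > 0 else -1)
    return out

def oew_extract(q_coeffs, n_bits):
    """Recover bits from the parity of nonzero coefficients."""
    return [abs(c) % 2 for c in q_coeffs if c != 0][:n_bits]

# Round-trip check on a toy block:
# marked = oew_embed([5, -3, 0, 2, 0, -1], [0, 1, 1, 0])
# assert oew_extract(marked, 4) == [0, 1, 1, 0]
```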

    Combined Industry, Space and Earth Science Data Compression Workshop

    The sixth annual Space and Earth Science Data Compression Workshop and the third annual Data Compression Industry Workshop were held as a single combined workshop. The workshop was held on April 4, 1996 in Snowbird, Utah, in conjunction with the 1996 IEEE Data Compression Conference, which was held at the same location March 31 - April 3, 1996. The Space and Earth Science Data Compression sessions sought to explore opportunities for data compression to enhance the collection, analysis, and retrieval of space and earth science data. Of particular interest was data compression research that is integrated into, or has the potential to be integrated into, a particular space or earth science data information system. Preference was given to data compression research that takes into account the scientist's data requirements and the constraints imposed by the data collection, transmission, distribution, and archival systems.

    Compression, pose tracking, and halftoning

    In this thesis, we discuss image compression, pose tracking, and halftoning. Although these areas seem unrelated at first glance, they can be connected through video coding as an application scenario. Our first contribution is an image compression algorithm based on a rectangular subdivision scheme that stores only a small subset of the image points; from these points, the remainder of the image is reconstructed using partial differential equations. We then present a pose tracking algorithm that follows the 3-D position and orientation of multiple objects simultaneously. The algorithm can deal with noisy sequences and naturally handles both occlusions between different objects and occlusions occurring within kinematic chains. Our third contribution is a halftoning algorithm based on electrostatic principles, which can easily be adjusted to different settings through a number of extensions; examples include modifications to handle varying dot sizes or hatching. In the final part of the thesis, we show how to combine our image compression, pose tracking, and halftoning algorithms into novel video compression codecs. In each of these four topics, our algorithms yield excellent results that outperform those of other state-of-the-art algorithms.
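    The reconstruction step of the compression algorithm described above can be illustrated with the simplest PDE choice, homogeneous diffusion inpainting: stored pixels stay fixed while missing pixels relax to the discrete Laplace equation. This is a minimal sketch under assumptions not taken from the abstract: the thesis's rectangular subdivision scheme selects which points are stored (here the mask is taken as given), the actual PDE used in the thesis may differ, and periodic boundaries are used for brevity.

```python
import numpy as np

def diffusion_inpaint(img, mask, n_iter=5000):
    """Reconstruct missing pixels by homogeneous diffusion: stored pixels
    (mask == True) stay fixed, the rest converge to the average of their
    four neighbours (discrete Laplace equation).

    img:  2-D float array (only values under the mask are used)
    mask: 2-D bool array marking the stored pixels
    """
    u = np.where(mask, img, img[mask].mean()).astype(float)
    for _ in range(n_iter):               # Jacobi iterations
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                      np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u = np.where(mask, img, avg)      # re-impose the stored pixels
    return u

# Example: keep 5% of pixels at random and reconstruct the rest.
# rng = np.random.default_rng(0)
# mask = rng.random(img.shape) < 0.05
# recon = diffusion_inpaint(img, mask)
```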