
    On Sparse Coding as an Alternate Transform in Video Coding

    In video compression, specifically in the prediction process, a residual signal is calculated by subtracting the predicted signal from the original signal; this residual represents the error of the prediction. The residual is usually transformed by a discrete cosine transform (DCT) from the pixel domain into the frequency domain. It is then quantized, which filters out more or fewer high frequencies depending on a quality parameter. The quantized signal is entropy encoded, usually by a context-adaptive binary arithmetic coding (CABAC) engine, and written into a bitstream. In the decoding phase the process is reversed. DCT and quantization in combination are efficient tools, but they do not perform well at lower bitrates, where they create distortion and side effects. The proposed method uses sparse coding as an alternate transform that compresses well at lower bitrates, but not at high bitrates. The decision of which transform to use is based on a rate-distortion optimization (RDO) cost calculation, so that both transforms are used in their optimal performance ranges. The proposed method is implemented in the High Efficiency Video Coding (HEVC) test model HM-16.18 and in the HEVC screen content coding extension (HEVC-SCC) test model HM-16.18+SCM-8.7, achieving a Bjontegaard rate difference (BD-rate) saving of up to 5.5% compared to the standard.
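
    To make the selection mechanism concrete, the following is a minimal sketch of a Lagrangian rate-distortion comparison between the two transform paths. It is not the HM-16.18 implementation; the cost function J = D + lambda * R, the function names, and the example numbers are assumptions introduced for illustration.

```python
def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    # Lagrangian rate-distortion cost: J = D + lambda * R
    return distortion + lam * rate_bits


def choose_transform(dct_result, sparse_result, lam: float) -> str:
    """Pick the transform path with the lower RD cost.

    Each argument is a (distortion, rate_in_bits) pair that the respective
    encoding path (DCT + quantization, or sparse coding) would produce for
    the same residual block.
    """
    j_dct = rd_cost(*dct_result, lam)
    j_sparse = rd_cost(*sparse_result, lam)
    return "dct" if j_dct <= j_sparse else "sparse"


# Illustrative numbers only: at a low bitrate the sparse path can win because
# it needs far fewer bits even though its distortion is slightly higher.
print(choose_transform((120.0, 300.0), (150.0, 120.0), lam=0.9))  # -> sparse
```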

    State of the art in 2D content representation and compression

    Deliverable D1.3 of the ANR PERSEE project. This report was produced within the framework of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.1 of the project.

    Efficient compression of motion compensated residuals

    EThOS - Electronic Theses Online Service, United Kingdom

    Subspace portfolios: design and performance comparison

    Data processing and engineering techniques enable people to observe and better understand the natural and human-made systems and processes that generate huge amounts of various data types. Data engineers collect data created in almost all fields and formats, such as images, audio and text streams, biological and financial signals, sensor readings, and many others. They develop and implement state-of-the-art machine learning (ML) and artificial intelligence (AI) algorithms that use big data to infer valuable information of social and economic value. Furthermore, ML/AI methodologies help automate many decision-making processes in real-time applications serving people and businesses. As an example, mathematical tools are engineered for the analysis of financial data such as prices, trade volumes, and other economic indicators of instruments including stocks, options, and futures, in order to automate the generation, implementation, and maintenance of investment portfolios. Among these techniques, the subspace framework and its methods are fundamental; they have been successfully employed in widely used technologies and real-time applications spanning Internet multimedia to electronic trading of financial products. In this dissertation, the eigendecomposition of the empirical correlation matrix created from market data (normalized returns) for a basket of US equities plays a central role. The merit of approximating such an empirical matrix by a Toeplitz matrix, for which closed-form solutions for the eigenvalues and eigenvectors exist, is then investigated. More specifically, the exponential correlation model that populates such a Toeplitz matrix is used to approximate the pairwise empirical correlations of asset returns in a portfolio. Hence, the analytically derived eigenvectors of such a random vector process are utilized to design its eigenportfolios. The performances of the model-based and the traditional eigenportfolios are studied and compared to validate the proposed portfolio design method. It is shown that the model-based designs yield eigenportfolios that track the variations of the market statistics closely and deliver comparable or better performance. The theoretical foundations of information theory and rate-distortion theory that provide the basis for source coding methods, including transform coding, are revisited in the dissertation. This theoretical inquiry helps to frame the basic question of trade-offs between the dimension of the eigensubspace and the correlation structure of the random vector process it represents. The signal processing literature facilitates the development of an efficient subspace partitioning algorithm to design novel portfolios by combining eigenportfolios of partitions; for US equities these outperform the existing eigenportfolios (EP), market portfolios (MP), minimum variance portfolios (MVP), and hierarchical risk parity (HRP) portfolios. Additionally, the pdf-optimized quantizer framework is employed to sparsify eigenportfolios in order to reduce the (trading) cost of their maintenance. Concluding remarks are presented in the last section of the dissertation.
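
    As a rough illustration of the model-based design described above, the sketch below builds an exponential-correlation Toeplitz matrix and derives eigenportfolio weights from its eigenvectors. The eigenvectors are computed numerically here rather than via the closed-form expressions mentioned in the abstract, and all function names and parameter values are illustrative assumptions.

```python
import numpy as np


def exponential_toeplitz(n: int, rho: float) -> np.ndarray:
    # Toeplitz correlation matrix with entries rho**|i - j|.
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def eigenportfolio_weights(corr: np.ndarray, k: int = 0) -> np.ndarray:
    """Weights of the k-th eigenportfolio: the k-th eigenvector of the
    correlation matrix, normalized so the weights sum to one."""
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    v = eigvecs[:, -(k + 1)]                  # k = 0 -> dominant eigenvector
    return v / v.sum()


# Example: 5 assets whose pairwise correlations are modelled as 0.8**|i - j|.
corr = exponential_toeplitz(5, rho=0.8)
w = eigenportfolio_weights(corr)              # dominant (market-like) portfolio
print(np.round(w, 3))
```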

    Localized temporal decorrelation for video compression

    Many current video compression algorithms perform analysis and coding operations in a block-wise manner. Most of them use a motion-compensated DCT algorithm as the basis. Many other codecs, mostly academic and in their infancy, known as second-generation techniques, utilize region-, contour-, and model-based techniques. Unfortunately, these second-generation methods have not been successful in gaining widespread acceptance in either the standards bodies or the consumer world. Many of them require specialized, computationally intensive software and/or hardware. Due to these shortcomings, current block-based methods have been fine-tuned to achieve better performance at even very low bit rates (sub 64 kbps). Block-based motion estimation is the principal mechanism used to compensate for motion between frames in an image sequence. Although current algorithms are fast and quite effective, they fail to compensate for uncovered background areas in a frame. Solutions such as hierarchical motion estimation schemes do not work very well, since there is no reference in past, and in some cases future, frames for an uncovered background, resulting in the block being transmitted in intra mode (which requires the most bandwidth among all block types). This thesis introduces an intermediate stage which compensates for these isolated uncovered areas. The intermediate stage uses a localized decorrelation technique to reduce frame-to-frame temporal redundancies. The algorithm can be easily incorporated into existing systems to achieve even better performance and can be easily extended into a scalable video coding architecture. Experimental results show that the algorithm, used in conjunction with motion estimation, is quite effective in reducing temporal redundancies.
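
    For context, the sketch below shows a standard full-search block-matching motion estimator of the kind the thesis builds on, using the sum of absolute differences (SAD) over a small search window. It is a generic textbook formulation, not the proposed localized decorrelation stage; the block size, search range, and function names are illustrative.

```python
import numpy as np


def sad(a: np.ndarray, b: np.ndarray) -> float:
    # Sum of absolute differences between two equally sized blocks.
    return float(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())


def full_search(cur: np.ndarray, ref: np.ndarray, by: int, bx: int,
                block: int = 8, search: int = 4):
    """Find the motion vector for the block at (by, bx) in the current frame
    by exhaustively testing displacements within +/- search pixels."""
    target = cur[by:by + block, bx:bx + block]
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            cost = sad(target, ref[y:y + block, x:x + block])
            if cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost


# Synthetic example: the current frame is the reference shifted by (1, 2) pixels,
# so the estimated motion vector for an interior block should be (-1, -2).
ref = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
cur = np.roll(ref, shift=(1, 2), axis=(0, 1))
print(full_search(cur, ref, by=8, bx=8))
```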

    Video coding in a broadcast environment

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1993. Includes bibliographical references (leaves 79-81). By Manuela Alexandra Trigo Miranda de Sousa Pereira. M.S.

    Implementation of Vector Quantization for Image Compression - A Survey

    This paper presents a survey on vector quantization (VQ) for image compression. It also describes a way of decomposing the signal that exploits inter- and intra-band correlation to obtain a more flexible partitioning of higher-dimensional vector spaces. In this way, the image is compressed without information loss using artificial neural networks (ANN). Since 1988, a growing body of research has examined the use of VQ for image compression. The paper discusses vector quantization, its principle and examples, its various techniques, and its advantages and applications in image compression. Additionally, it provides a comparative table covering the simplicity, storage space, robustness, and transfer time of various vector quantization methods. Finally, the paper surveys different vector quantization methods for image compression.
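
    As a minimal illustration of the core VQ step surveyed in the paper, the sketch below maps flattened image blocks to their nearest codewords and reconstructs them by table lookup. Codebook training (for example by the LBG/k-means algorithm or an ANN, as the survey discusses) is omitted, and the codebook and block values are illustrative assumptions.

```python
import numpy as np


def quantize_blocks(blocks: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each flattened block to the index of its nearest codeword
    (Euclidean distance). blocks: (N, D), codebook: (K, D) -> indices (N,)."""
    d2 = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def reconstruct(indices: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # Decoding is just a table lookup: replace each index by its codeword.
    return codebook[indices]


# Example: 4-dimensional blocks quantized with a 3-entry codebook.
codebook = np.array([[0, 0, 0, 0], [128, 128, 128, 128], [255, 255, 255, 255]], float)
blocks = np.array([[10, 5, 0, 3], [120, 130, 140, 125], [250, 255, 251, 249]], float)
idx = quantize_blocks(blocks, codebook)
print(idx)                              # -> [0 1 2]
print(reconstruct(idx, codebook)[0])    # first block decodes to the all-zero codeword
```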

    Scalable video compression with optimized visual performance and random accessibility

    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video
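
    The distortion-scaling idea described above can be illustrated, in a very reduced form, by re-ordering coding passes according to a perceptually weighted distortion-length slope. The sketch below is a simplified stand-in for that concept, not the thesis's post-compression rate-distortion optimizer; the data structures, weights, and example values are assumptions made for illustration.

```python
def embedding_order(passes, weights):
    """Sort coding passes by perceptually scaled distortion-length slope.

    passes:  list of (block_id, delta_distortion, delta_length) tuples
    weights: dict mapping block_id -> visual-sensitivity scale factor
    Scaling the distortion term raises the slope of perceptually significant
    blocks, so their passes appear earlier in the embedded codestream.
    """
    def slope(p):
        block_id, delta_d, delta_len = p
        return (weights.get(block_id, 1.0) * delta_d) / delta_len

    return sorted(passes, key=slope, reverse=True)


# Example: the "edge" block is judged visually sensitive and gets a 2x scale,
# so its pass is embedded before the "flat" block despite a smaller raw slope.
passes = [("flat", 40.0, 100), ("edge", 30.0, 100)]
print(embedding_order(passes, {"edge": 2.0, "flat": 1.0}))
```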