1,200 research outputs found

    Perceptually-Driven Video Coding with the Daala Video Codec

    Full text link
    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media.Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

    A practical approach for the design of nonuniform lapped transforms

    Get PDF
    We propose a simple method for the design of lapped transforms with nonuniform frequency resolution and good time localization. The method is a generalization of an approach previously proposed by Princen, where the nonuniform filter bank is obtained by joining uniform cosine-modulated filter banks (CMFBs) using a transition filter. We use several transition filters to obtain a near perfect-reconstruction (PR) nonuniform lapped transform with significantly reduced overall distortion. The main advantage of the proposed method is in reducing the length of the transition filters, which leads to a reduction in processing delay that can be useful for applications such as real-time audio coding

    Role of anticausal inverses in multirate filter-banks. I. System-theoretic fundamentals

    Get PDF
    In a maximally decimated filter bank with identical decimation ratios for all channels, the perfect reconstructibility property and the nature of reconstruction filters (causality, stability, FIR property, and so on) depend on the properties of the polyphase matrix. Various properties and capabilities of the filter bank depend on the properties of the polyphase matrix as well as the nature of its inverse. In this paper we undertake a study of the types of inverses and characterize them according to their system theoretic properties (i.e., properties of state-space descriptions, McMillan degree, degree of determinant, and so forth). We find in particular that causal polyphase matrices with anticausal inverses have an important role in filter bank theory. We study their properties both for the FIR and IIR cases. Techniques for implementing anticausal IIR inverses based on state space descriptions are outlined. It is found that causal FIR matrices with anticausal FIR inverses (cafacafi) have a key role in the characterization of FIR filter banks. In a companion paper, these results are applied for the factorization of biorthogonal FIR filter banks, and a generalization of the lapped orthogonal transform called the biorthogonal lapped transform (BOLT) developed

    VHDL modeling and synthesis of the JPEG-XR inverse transform

    Get PDF
    This work presents a pipelined VHDL implementation of the inverse lapped biorthogonal transform used in the decompression process of the soon to be released JPEG-XR still image standard format. This inverse transform involves integer only calculations using lifting operations and Kronecker products. Divisions and multiplications by small integer coefficients are implemented using a bit shift and add technique resulting in a multiplier-less implementation with 736 instances of addition. When targeted to an Altera Stratix II FPGA with a 50 MHz system clock, this design is capable of completing the inverse transform of an 8400 x 6600 pixel image in less than 70 ms

    Scalable and perceptual audio compression

    Get PDF
    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner

    Compression methods for mechanical vibration signals: Application to the plane engines

    No full text
    International audienceA novel approach for the compression of mechanical vibration signals is presented in this paper. The method relies on a simple and flexible decomposition into a large number of subbands, implemented by an orthogonal transform. Compression is achieved by a uniform adaptive quantization of each subband. The method is tested on a large number of real vibration signals issued by plane engines. High compression ratios can be achieved, while keeping a good quality of the reconstructed signal. It is also shown that compression has little impact on the detection of some commonly encountered defects of the plane engine

    Pipelined implementation of Jpeg image compression using Hdl

    Full text link
    This thesis presents the architecture and design of a JPEG compressor for color images using VHDL. The system consists of major parts like color space converter, down sampler, 2-D DCT module, quantization, zigzag scanning and entropy coDing The color space conversion transforms the RGB colors to YCbCr color coDing The down sampling operation reduces the sampling rate of the color information (Cb and Cr). The 2-D DCT transform the pixel data from the spatial domain to the frequency domain. The quantization operation eliminates the high frequency components and the small amplitude coefficients of the co-sine expansion. Finally, the entropy coding uses run-length encoding (RLE), Huffman, variable length coding (VLC) and differential coding to decrease the number of bits used to represent the image. The JPEG compression is a lossy compression, since downsampling and quantization operations are irreversible. But the losses can be controlled in order to keep the necessary image quality; Architectures for these parts were designed and described in VHDL. The results were observed using Active-HDL simulator and the code being synthesized using xilinx ise for vertex-4 FPGA. This pipelined architecture has a minimum latency of 187 clock cycles

    Transform/subband analysis and synthesis of signals

    Get PDF
    Includes bibliographical references (leaves 35-37).Work supported by the Advanced Television Research Program. Work supported by the National Science Foundation. MIP 87-14969David M. Baylon and Jae S. Lim
    • …
    corecore