152 research outputs found

    Image Compression using Discrete Cosine Transform & Discrete Wavelet Transform

    Image compression addresses the problem of reducing the amount of data required to represent a digital image. Compression is achieved by removing one or more of three basic data redundancies: (1) coding redundancy, which is present when less than optimal (i.e. not the shortest) code words are used; (2) interpixel redundancy, which results from correlations between the pixels of an image; and (3) psychovisual redundancy, which is due to data that is ignored by the human visual system (i.e. visually nonessential information). Huffman codes contain the smallest possible number of code symbols (e.g., bits) per source symbol (e.g., grey level value), subject to the constraint that the source symbols are coded one at a time. Huffman coding, when combined with a technique for reducing image redundancies such as the Discrete Cosine Transform (DCT), therefore compresses image data to a very good extent. The DCT is an example of transform coding, and the current JPEG standard uses it as its basis. The DCT relocates the highest energies to the upper left corner of the transformed image; the lesser energy or information is relocated into other areas. The DCT is fast: it can be calculated quickly and is best suited to images with smooth edges, such as photos of human subjects. Unlike those of the Fourier transform, the DCT coefficients are all real numbers. The Inverse Discrete Cosine Transform (IDCT) can be used to retrieve the image from its transform representation. The Discrete Wavelet Transform (DWT) has gained widespread acceptance in signal processing and image compression. Because of their inherent multi-resolution nature, wavelet-coding schemes are especially suitable for applications where scalability and tolerable degradation are important. Recently the JPEG committee has released its new image coding standard, JPEG-2000, which is based upon the DWT.
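
The energy-compaction claim in the abstract above can be illustrated with a small sketch. It is not taken from the listed work: the 8x8 block size, the ramp-shaped test block and all function names are assumptions chosen for illustration, and the DCT is written as a plain orthonormal DCT-II matrix product rather than a fast algorithm.

```python
import math

def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix C (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """Separable 2-D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (valid because C is orthogonal)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def low_frequency_energy_fraction(coeffs, size=2):
    """Share of the total squared magnitude held by the top-left size x size corner."""
    total = sum(v * v for row in coeffs for v in row)
    corner = sum(coeffs[r][c] ** 2 for r in range(size) for c in range(size))
    return corner / total

# Hypothetical smooth 8x8 block: a gentle brightness ramp.
smooth_block = [[100 + 2 * r + 3 * c for c in range(8)] for r in range(8)]
coeffs = dct2(smooth_block)
restored = idct2(coeffs)  # matches smooth_block up to floating-point error
```

For a smooth block like this one, nearly all of the squared coefficient magnitude sits in the top-left corner, which is what makes coarse quantization of the remaining coefficients followed by entropy coding (for example Huffman coding) effective.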

    Directional edge and texture representations for image processing

    An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the situation where the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature.
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.

    Coded Aperture Hyperspectral Image Reconstruction

    This article belongs to the Special Issue Computational Spectral Imaging. [Abstract] In this work, we study and analyze the reconstruction of hyperspectral images that are sampled with a CASSI device. The sensing procedure was modeled with the help of CS theory, which enabled efficient mechanisms for the reconstruction of the hyperspectral images from their compressive measurements. In particular, we considered and compared four different types of estimation algorithms: OMP, GPSR, LASSO, and IST. Furthermore, the large dimensions of hyperspectral images required the implementation of a practical block CASSI model to reconstruct the images with an acceptable delay and affordable computational cost. In order to account for the particularities of the block model and the dispersive effects in the CASSI-like sensing procedure, the problem was reformulated, as was the construction of the variables involved. For this practical CASSI setup, we evaluated the performance of the overall system by considering the aforementioned algorithms and the different factors that impacted the reconstruction procedure. Finally, the obtained results were analyzed and discussed from a practical perspective. This work was funded by the Xunta de Galicia (by Grant ED431C 2020/15 and Grant ED431G 2019/01 to support the Centro de Investigación de Galicia “CITIC”), the Agencia Estatal de Investigación of Spain (by Grants RED2018-102668-T and PID2019-104958RB-C42), and the ERDF funds of the EU (FEDER Galicia 2014-2020 and AEI/FEDER Programs, UE).
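
Of the four estimation algorithms named in the abstract above, iterative soft thresholding (IST) is the simplest to sketch. The code below is a generic illustration under stated assumptions, not the article's block CASSI implementation: the sensing matrix is a small dense list of lists, the step size is the conservative choice 1/||A||_F^2, and all names and parameters are hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def matvec(a, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def rmatvec(a, r):
    """Compute A^T r."""
    return [sum(a[i][j] * r[i] for i in range(len(a))) for j in range(len(a[0]))]

def objective(a, y, x, lam):
    """LASSO cost: 0.5 * ||y - A x||^2 + lam * ||x||_1."""
    residual = [yi - pi for yi, pi in zip(y, matvec(a, x))]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)

def ist(a, y, lam, iterations=200):
    """Iterative soft thresholding starting from the all-zero estimate."""
    # The squared Frobenius norm upper-bounds the squared spectral norm,
    # so this step size is safe (if slow) for any matrix A.
    step = 1.0 / sum(v * v for row in a for v in row)
    x = [0.0] * len(a[0])
    for _ in range(iterations):
        residual = [yi - pi for yi, pi in zip(y, matvec(a, x))]
        gradient_step = rmatvec(a, residual)
        x = [soft_threshold(xj + step * gj, step * lam)
             for xj, gj in zip(x, gradient_step)]
    return x
```

With an identity sensing matrix the iteration converges to the soft-thresholded measurements, which is a convenient sanity check; for a real compressive setup the matrix has fewer rows than columns and the sparsity penalty `lam` has to be tuned.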

    Biorthogonality in lapped transforms : a study in high-quality audio compression

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (leaves 76-82). By Shiufun Cheung. Ph.D

    Contributions in image and video coding

    Advisor: Max Henrique Machado Costa. Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: The image and video coding community has also been working on advances that go beyond traditional image and video coding techniques. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low-complexity video codec, named the Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementations use 16-bit integer arithmetic. Only additions and bit shifts are necessary, which lowers computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved, and the encoding time is significantly smaller (around 160 times) than that of the H.264/AVC standard.
The second contribution is the optimization of a recently proposed approach to multiview video coding for videoconferencing and other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint at good compression rates. To achieve this objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source, and an optimum mapping is derived for a typical video source. The third contribution explores several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or the bitstream definition.
The fourth and last contribution proposes a hierarchical signal-dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform, and the obtained results are discussed in this work. Doctorate. Telecommunications and Telematics. Doutor em Engenharia Elétric
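
The idea of bit-plane coding that yields a fully embedded bitstream, mentioned in the abstract above, can be sketched as follows. This is an illustrative sketch only, not the FEVC codec: it omits the entropy coders and the 3-D transform, stores each plane as a plain list of bits, and uses hypothetical function names.

```python
def encode_bit_planes(coeffs, num_planes):
    """Split integer coefficients into a sign list and magnitude bit planes.

    planes[0] holds the most significant bit of every magnitude, so a
    decoder can stop after any number of planes (an embedded stream).
    Assumes every magnitude fits in num_planes bits.
    """
    signs = [1 if c < 0 else 0 for c in coeffs]
    magnitudes = [abs(c) for c in coeffs]
    planes = [[(m >> p) & 1 for m in magnitudes]
              for p in range(num_planes - 1, -1, -1)]
    return signs, planes

def decode_bit_planes(signs, planes, num_planes):
    """Rebuild coefficients from however many leading planes were received."""
    magnitudes = [0] * len(signs)
    for index, plane in enumerate(planes):
        shift = num_planes - 1 - index
        magnitudes = [m | (bit << shift) for m, bit in zip(magnitudes, plane)]
    return [-m if s else m for m, s in zip(magnitudes, signs)]

# Hypothetical transform coefficients; every magnitude fits in 6 bits.
coefficients = [37, -12, 5, 0, -1]
signs, planes = encode_bit_planes(coefficients, 6)
coarse = decode_bit_planes(signs, planes[:3], 6)  # only the 3 top planes
exact = decode_bit_planes(signs, planes, 6)       # all planes
```

Truncating the list of planes gives a coarser approximation of every coefficient, which is the property that lets an embedded bitstream be cut at any point. Only shifts, masks and bitwise ORs are used, in keeping with the low-complexity integer arithmetic the abstract describes.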

    Signal processing for high-definition television

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 1995. Includes bibliographical references (p. 60-62). By Peter Monta. Ph.D

    Efficient compression of motion compensated residuals

    EThOS - Electronic Theses Online Service, GB, United Kingdo

    Wavelets and multirate filter banks : theory, structure, design, and applications

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004. Includes bibliographical references (p. 219-230) and index. Wavelets and filter banks have revolutionized signal processing with their ability to process data at multiple temporal and spatial resolutions. Fundamentally, continuous-time wavelets are governed by discrete-time filter banks with properties such as perfect reconstruction, linear phase and regularity. In this thesis, we study multi-channel filter bank factorization and parameterization strategies, which facilitate designs with specified properties that are enforced by the actual factorization structure. For M-channel filter banks (M ≥ 2), we develop a complete factorization, M-channel lifting factorization, using simple ladder-like structures as predictions between channels to provide robust and efficient implementation; perfect reconstruction is structurally enforced, even under finite precision arithmetic and quantization of lifting coefficients. With lifting, optimal low-complexity integer wavelet transforms can thus be designed using a simple and fast algorithm that incorporates prescribed limits on hardware operations for power-constrained environments. As filter bank regularity is important for a variety of reasons, an aspect of particular interest is the structural imposition of regularity onto factorizations based on the dyadic form uv^T. We derive the corresponding structural conditions for regularity, for which M-channel lifting factorization provides an essential parameterization. As a result, we are able to design filter banks that are exactly regular and amenable to fast implementations with perfect reconstruction, regardless of the choice of free parameters and possible finite precision effects. Further constraining u = v ensures regular orthogonal filter banks, whereas a special dyadic form is developed that guarantees linear phase.
We achieve superior coding gains within 0.1% of the optimum, and benchmarks conducted on image compression applications show clear improvements in perceptual and objective performance. We also consider the problem of completing an M-channel filter bank, given only its scaling filter. M-channel lifting factorization can efficiently complete such biorthogonal filter banks. On the other hand, an improved scheme for completing paraunitary filter banks is made possible by a novel order-one factorization which allows greater design flexibility, resulting in improved frequency selectivity and energy compaction over existing state-of-the-art methods. In a dual setting, the technique can be applied to transmultiplexer design to achieve higher-rate data transmissions. By Ying-Jui Chen. Ph.D
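
The claim that lifting enforces perfect reconstruction structurally, even with quantized coefficients and integer arithmetic, can be seen in the simplest two-channel case. The sketch below is an assumption-laden illustration and not the thesis's M-channel factorization: it uses one predict and one update step, with hypothetical integer coefficients `p` and `u` applied through right shifts.

```python
def lifting_forward(x, p=1, p_shift=0, u=1, u_shift=1):
    """Two-channel integer lifting on an even-length list of integers.

    Predict: detail = odd  - ((p * even)   >> p_shift)
    Update:  smooth = even + ((u * detail) >> u_shift)
    The defaults give the integer Haar (S) transform.
    """
    even, odd = x[0::2], x[1::2]
    detail = [o - ((p * e) >> p_shift) for e, o in zip(even, odd)]
    smooth = [e + ((u * d) >> u_shift) for e, d in zip(even, detail)]
    return smooth, detail

def lifting_inverse(smooth, detail, p=1, p_shift=0, u=1, u_shift=1):
    """Undo the ladder steps in reverse order with the signs flipped."""
    even = [s - ((u * d) >> u_shift) for s, d in zip(smooth, detail)]
    odd = [d + ((p * e) >> p_shift) for e, d in zip(even, detail)]
    x = [0] * (2 * len(even))
    x[0::2], x[1::2] = even, odd
    return x

# Hypothetical integer signal.
signal = [10, 12, 7, 3, -5, 8]
smooth, detail = lifting_forward(signal)
roundtrip = lifting_inverse(smooth, detail)
```

Each ladder step adds a rounded function of one channel to the other, so the inverse can subtract exactly the same rounded value. The round trip is therefore exact for any integer choice of `p`, `u` and the shifts, which is the two-channel analogue of the structural guarantee described in the abstract.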