41 research outputs found

    Contributions in image and video coding

    Advisor: Max Henrique Machado Costa. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: The image and video coding community has been working on advances that go beyond traditional image and video coding techniques. This work is a set of contributions to several topics that have received increasing attention from researchers in the community, namely scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low-complexity video codec, named the Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementations use 16-bit integer arithmetic. Only additions and bit shifts are necessary, which lowers the computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved, and encoding times are significantly shorter (by a factor of around 160) than those of the H.264/AVC standard. The second contribution is the optimization of a recently proposed approach to multiview video coding for videoconferencing and similar unicast-like applications. The target scenario in this approach is to provide realistic 3-D video with free viewpoint at good compression rates. To achieve this objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source, and an optimum mapping is derived for typical video sources. The third contribution explores several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order of JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders require no changes to the other coding and decoding stages or to the bitstream definition. The fourth and last contribution proposes a hierarchical, signal-dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that also exploits the structural similarities between resolution levels. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed to measure the performance of the new transform, and the results obtained are discussed in this work.
    Doctorate; Telecommunications and Telematics; Doctor of Electrical Engineering
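
    The first contribution above combines integer transforms built from additions and bit shifts with bit-plane coding of the coefficients. The abstract does not say which transforms FEVC uses, so the sketch below substitutes a well-known stand-in, the two-point integer S-transform (a Haar-like lifting step), and shows how coefficient magnitudes can be split into bit planes from most to least significant. The function names are hypothetical, and sign and entropy coding are left out.

```python
def s_transform_forward(a, b):
    """Two-point integer S-transform using only additions and shifts.

    Assumption: a stand-in for the (unspecified) FEVC transforms.
    Returns (s, d), where s == floor((a + b) / 2) and d == a - b.
    """
    d = a - b
    s = b + (d >> 1)  # arithmetic shift floors toward minus infinity
    return s, d


def s_transform_inverse(s, d):
    """Exactly undo s_transform_forward."""
    b = s - (d >> 1)
    a = b + d
    return a, b


def bit_planes(coeffs, num_planes):
    """Split coefficient magnitudes into bit planes, most significant first.

    Signs are ignored here; a real embedded coder would also signal them.
    """
    return [[(abs(c) >> p) & 1 for c in coeffs]
            for p in range(num_planes - 1, -1, -1)]
```

    Because each plane refines the previous ones, a decoder can stop after any plane and still reconstruct an approximation, which is what makes such a bitstream embedded.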

    Efficient compression of motion compensated residuals

    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Mode decision for the H.264/AVC video coding standard

    The H.264/AVC video coding standard holds great promise for video broadcasting and communication because of its high coding efficiency compared with older video coding standards. However, this coding efficiency comes at the cost of high computational complexity. Fast motion estimation and fast mode decision are two techniques that can significantly reduce this complexity. This thesis focuses on fast mode decision. Its goal is to find new fast mode decision techniques that deliver significant time savings while keeping rate-distortion (RD) performance very close to that of the H.264/AVC video coding standard. [Continues.]

    Distortion-constraint compression of three-dimensional CLSM images using image pyramid and vector quantization

    Confocal microscopy imaging techniques, which allow optical sectioning, have been successfully exploited in biomedical studies. By processing and analysing three-dimensional image data, biomedical scientists can benefit from more realistic visualization and much more accurate diagnosis. The lack of efficient image compression standards makes such large volumetric image data slow to transfer over limited-bandwidth networks. It also imposes large storage requirements and high archiving and maintenance costs. Conventional two-dimensional image coders do not take into account inter-frame correlations in three-dimensional image data. Standard multi-frame coders, such as video coders, perform well at capturing motion information but are not designed to code multiple frames that represent a stack of optical planes of a real object. A genuinely three-dimensional image compression approach should therefore be investigated. Moreover, reconstructed image quality is a very important concern when compressing medical images, because it can directly affect diagnostic accuracy. Most state-of-the-art methods are based on transform coding; for instance, JPEG is based on the discrete cosine transform (DCT) and JPEG2000 is based on the discrete wavelet transform (DWT). However, in DCT and DWT methods, controlling the reconstructed image quality is inconvenient and computationally costly, since they are fundamentally rate-parameterized rather than distortion-parameterized methods. It is therefore very desirable to develop a transform-based, distortion-parameterized compression method that has high coding performance and can conveniently and accurately control the final distortion according to a user-specified quality requirement. This thesis describes our work in developing a distortion-constraint three-dimensional image compression approach that combines vector quantization techniques with image pyramid structures. We expect our method to have: 1. High coding performance in compressing three-dimensional microscopic image data, compared with state-of-the-art three-dimensional image coders and with standardized two-dimensional image coders and video coders. 2. Distortion-control capability, a very desirable feature in medical image compression applications, that is superior to rate-parameterized methods in achieving a user-specified quality requirement. The result is a three-dimensional image compression method with outstanding compression performance, measured objectively, for volumetric microscopic images. The distortion-constraint feature, by which users target an image quality rather than a compressed file size, offers more flexible control of the reconstructed image quality than its rate-constraint counterparts in medical image applications. Additionally, it effectively reduces the artifacts present in other approaches at low bit rates and also attenuates noise in the pre-compressed images. Furthermore, its advantages in progressive transmission and fast decoding make it suitable for bandwidth-limited telecommunications and web-based image browsing applications.
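
    The difference between rate-parameterized and distortion-parameterized coding described above can be illustrated with a toy example. The sketch below is not the thesis's pyramid vector quantizer: it swaps in a uniform scalar quantizer on integer samples and simply picks the coarsest step size whose mean squared error stays within a user-specified target. All names and the candidate step list are assumptions made for illustration.

```python
def quantize(values, step):
    """Uniform scalar quantization of integers to the nearest multiple of step."""
    return [step * ((v + step // 2) // step) for v in values]


def mse(original, reconstructed):
    """Mean squared error between two equally long sequences."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def coarsest_step(values, target_mse, steps):
    """Return the largest candidate step that meets the distortion target.

    The caller states the quality it wants (target_mse); the rate follows
    from the chosen step instead of being specified directly.
    Returns None if no candidate step satisfies the target.
    """
    for step in sorted(steps, reverse=True):
        if mse(values, quantize(values, step)) <= target_mse:
            return step
    return None
```

    A rate-parameterized coder would instead fix the bit budget and accept whatever distortion results, which is why hitting a quality target with it usually takes repeated trial encodings.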

    Surveillance centric coding

    PhD
    The research work presented in this thesis focuses on the development of techniques specific to surveillance videos for efficient video compression with higher processing speed. Scalable Video Coding (SVC) techniques are explored to achieve higher compression efficiency. The SVC framework is modified to support Surveillance Centric Coding (SCC). Motion estimation techniques specific to surveillance videos are proposed in order to speed up the SCC compression process. The main contributions of the research work presented in this thesis fall into two groups: (i) Efficient Compression and (ii) Efficient Motion Estimation. The paradigm of Surveillance Centric Coding (SCC) is introduced, in which coding aims to achieve bit-rate optimisation and adaptation of surveillance videos for storage and transmission purposes. In the proposed approach, the SCC encoder communicates with the Video Content Analysis (VCA) module, which detects events of interest in video captured by the CCTV. Bit-rate optimisation and adaptation are achieved by exploiting the scalability properties of the employed codec. Time segments containing events relevant to the surveillance application are encoded at high spatio-temporal resolution and quality, while portions that are irrelevant from the surveillance standpoint are encoded at low spatio-temporal resolution and/or quality. Thanks to the scalability of the resulting compressed bit-stream, additional bit-rate adaptation is possible, for instance for transmission purposes. Experimental evaluation showed that the proposed approach achieves a significant reduction in bit-rate without loss of information relevant to surveillance applications. In addition to this more efficient compression strategy, novel approaches to efficient motion estimation specific to surveillance videos are proposed and implemented, with experimental results. A real-time background subtractor is used to detect the presence of any motion activity in the sequence. Three approaches to selective motion estimation are implemented: GOP based, frame based and block based. In the GOP based approach, motion estimation is performed for the whole group of pictures (GOP) only when a moving object is detected in any frame of the GOP. In the frame based approach, each frame is tested for motion activity, and motion estimation is performed selectively as a result. The selective motion estimation approach is further explored at a lower level as block based selective motion estimation. Experimental evaluation showed that the proposed strategy achieves a significant reduction in computational complexity. In addition to selective motion estimation, tracker based motion estimation and fast full search using multiple reference frames are proposed for surveillance videos. Extensive testing on different surveillance videos shows the benefits of applying the proposed approaches to achieve the goals of the SCC.
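
    The three levels of selective motion estimation described above (GOP, frame and block) can be sketched as a gating test applied before the motion search. The abstract does not describe the background subtractor, so the sketch assumes a static background image and a mean-absolute-difference threshold. The function names, block size and threshold are all hypothetical.

```python
def block_is_active(frame, background, top, left, size, threshold):
    """True if the block's mean absolute difference from the background
    exceeds the threshold (assumed activity test, not the thesis's own)."""
    total = 0
    count = 0
    for y in range(top, min(top + size, len(frame))):
        for x in range(left, min(left + size, len(frame[0]))):
            total += abs(frame[y][x] - background[y][x])
            count += 1
    return total > threshold * count


def active_blocks(frame, background, size=4, threshold=10):
    """Block based gating: positions of blocks that need motion estimation."""
    return [(top, left)
            for top in range(0, len(frame), size)
            for left in range(0, len(frame[0]), size)
            if block_is_active(frame, background, top, left, size, threshold)]


def frame_is_active(frame, background, size=4, threshold=10):
    """Frame based gating: run motion estimation only if any block moved."""
    return bool(active_blocks(frame, background, size, threshold))


def gop_is_active(gop, background, size=4, threshold=10):
    """GOP based gating: run motion estimation for the whole group of
    pictures if a moving object appears in any of its frames."""
    return any(frame_is_active(f, background, size, threshold) for f in gop)
```

    The finer the gating level, the more motion searches can be skipped, at the price of more activity tests per frame.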

    Transputer-based parallel system for MPEG-1 decoder

    Full text link

    Compression of Three-Dimensional Magnetic Resonance Brain Images.

    Lossless compression of a medical image set with multiple slices is paramount in radiology, since all the information within a medical image set is crucial for both diagnosis and treatment. This dissertation presents a novel and efficient diagnostically lossless compression scheme (the predicted wavelet lossless compression method) for sets of magnetic resonance (MR) brain images, which are called 3-D MR brain images. This compression scheme provides 3-D MR brain images with progressive and preliminary diagnosis capabilities. The spatial dependency in 3-D MR brain images is studied with histograms, entropy, correlation, and wavelet decomposition coefficients. This spatial dependency is used to design three kinds of predictors, i.e., intra-, inter-, and intra-and-inter-slice predictors, that use the correlation among neighboring pixels. Five integer wavelet transformations are applied to the prediction residues. The results show that intra-slice predictor 3, which uses an x-pixel and a y-pixel for prediction, combined with the 1st-level (2, 2) interpolating integer wavelet and with run-length and arithmetic coding, achieves the best compression. An automated threshold-based background noise removal technique is applied to remove the noise outside the diagnostic region. This preprocessing method improves the compression ratio of the proposed compression technique by a factor of approximately 1.61. A feature-vector-based approach is used to determine the representative slice with the most discernible brain structures. This representative slice is progressively encoded by a lossless embedded zerotree wavelet method. A rough version of this representative slice is gradually transmitted at an increasing bit rate so that the validity of the whole set can be determined early. This feature-vector-based approach is also used to detect multiple sclerosis (MS) at an early stage. Our compression technique with the progressive and preliminary diagnosis capability is tested with simulated and real 3-D MR brain image sets. The compression improvement over the best commonly used lossless compression method (lossless JPEG) is 41.83% for simulated 3-D MR brain image sets and 71.42% for real 3-D MR brain image sets. The accuracy of the preliminary MS diagnosis is 66.67%, measured against an expert radiologist's diagnosis in six studies.
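
    The (2, 2) interpolating integer wavelet mentioned above maps integers to integers through two lifting steps, so it can be inverted exactly. The sketch below shows one level of a common 1-D lifting formulation of that transform. The rounding offsets and the sample-replication boundary handling are assumptions, since the abstract does not give these details, and the function names are hypothetical.

```python
def forward_22(x):
    """One level of a (2, 2) interpolating integer wavelet on an
    even-length list of integers. Returns (approximation, detail)."""
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict: detail = odd sample minus the rounded mean of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # replicate at the edge
        d.append(odd[i] - ((even[i] + right + 1) >> 1))
    # Update: approximation = even sample plus a rounded quarter of nearby details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]  # replicate at the edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def inverse_22(s, d):
    """Exactly invert forward_22 by undoing the lifting steps in reverse."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right + 1) >> 1))
    return x
```

    Smooth regions yield detail coefficients near zero, which is what run-length and arithmetic coding can then exploit. In a pipeline like the one described, the transform would be applied to prediction residues rather than raw pixels.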

    Improved quality block-based low bit rate video coding.

    The aim of this research is to develop algorithms for enhancing the subjective quality and coding efficiency of standard block-based video coders. In the past few years, numerous video coding standards based on a motion-compensated block-transform structure have been established, in which block-based motion estimation is used to reduce the correlation between consecutive images and a block transform is used to code the resulting motion-compensated residual images. Because of the use of predictive differential coding and variable length coding techniques, the output data rate exhibits extreme fluctuations. A rate control algorithm is devised for achieving a stable output data rate. This rate control algorithm, which is essentially a bit-rate estimation algorithm, is then employed in a bit-allocation algorithm for improving the visual quality of the coded images, based on some prior knowledge of the images. Block-based hybrid coders achieve a high compression ratio mainly through the use of a motion estimation and compensation stage in the coding process. The conventional bit-allocation strategy for these coders simply assigns the bits required by the motion vectors and gives the rest to the residual image. However, at very low bit-rates this strategy is inadequate, as the motion vector bits take up a considerable portion of the total bit-rate. A rate-constrained selection algorithm is presented in which an analysis-by-synthesis approach is used to choose the best motion vectors in terms of the resulting bit rate and image quality. This selection algorithm is then implemented for mode selection. A simple algorithm based on the above-mentioned bit-rate estimation algorithm is developed for the latter to reduce the computational complexity. For very low bit-rate applications, it is well known that block-based coders suffer from blocking artifacts. A coding mode is presented that reduces these annoying artifacts by coding a down-sampled version of the residual image with a smaller quantisation step size. Its applications to adaptive source/channel coding and to coding fast-changing sequences are examined.