27 research outputs found

    Fully Scalable Video Coding Using Redundant-Wavelet Multihypothesis and Motion-Compensated Temporal Filtering

    In this dissertation, a fully scalable video coding system is proposed. This system achieves full temporal, resolution, and fidelity scalability by combining mesh-based motion-compensated temporal filtering, multihypothesis motion compensation, and an embedded 3D wavelet-coefficient coder. The first major contribution of this work is the introduction of the redundant-wavelet multihypothesis paradigm into motion-compensated temporal filtering, which is achieved by deploying temporal filtering in the domain of a spatially redundant wavelet transform. A regular triangle mesh is used to track motion between frames, and an affine transform between mesh triangles implements motion compensation within a lifting-based temporal transform. Experimental results reveal that the incorporation of redundant-wavelet multihypothesis into mesh-based motion-compensated temporal filtering significantly improves the rate-distortion performance of the scalable coder. The second major contribution is the introduction of a sliding-window implementation of motion-compensated temporal filtering such that video sequences of arbitrary length may be temporally filtered using a finite-length frame buffer without suffering from severe degradation at buffer boundaries. Finally, as a third major contribution, a novel 3D coder is designed for the coding of the 3D volume of coefficients resulting from the redundant-wavelet-based temporal filtering. This coder employs an explicit estimate of the probability of coefficient significance to drive a nonadaptive arithmetic coder, resulting in a simple software implementation. Additionally, the coder offers the possibility of a high degree of vectorization particularly well suited to the data-parallel capabilities of modern general-purpose processors or customized hardware. Results show that the proposed coder yields nearly the same rate-distortion performance as a more complicated coefficient coder considered to be state of the art.
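The lifting-based temporal transform mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible case: a Haar lifting step with no motion compensation (the dissertation warps frames with a mesh-based affine model, which is omitted here), frames stored as flat lists of pixel values, and an even number of frames. The function names are hypothetical and not taken from the dissertation.

```python
def haar_lifting_forward(frames):
    """One level of temporal Haar lifting over pairs of frames.

    Assumption: identity motion (no warping), so this only shows the
    predict/update structure, not motion-compensated filtering itself.
    """
    if len(frames) % 2:
        raise ValueError("expects an even number of frames")
    low, high = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        # Predict step: the odd frame is predicted from the even frame.
        h = [o - e for e, o in zip(even, odd)]
        # Update step: the even frame becomes the temporal average.
        l = [e + 0.5 * d for e, d in zip(even, h)]
        high.append(h)
        low.append(l)
    return low, high


def haar_lifting_inverse(low, high):
    """Undo the update step, then the predict step, and interleave frames."""
    frames = []
    for l, h in zip(low, high):
        even = [a - 0.5 * d for a, d in zip(l, h)]
        odd = [d + e for d, e in zip(h, even)]
        frames.extend([even, odd])
    return frames


frames = [[1.0, 2.0], [3.0, 6.0], [5.0, 5.0], [7.0, 1.0]]
low, high = haar_lifting_forward(frames)
restored = haar_lifting_inverse(low, high)
```

Because each lifting step is undone by subtracting exactly what was added, the inverse reconstructs the input regardless of how the prediction is formed, which is what makes it possible to place a motion-compensated predictor inside the lifting structure.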

    Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions

    PhD thesis. The research presented in this thesis aims to extend the scalability range of wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since temporal redundancy usually makes up the main portion of the overall redundancy in a video sequence, the techniques that can be generally termed motion decorrelation techniques have a central role in the overall compression performance. For this reason, scalable motion modelling and coding are of utmost importance, and this thesis identifies and analyses possible solutions. The main contributions of the presented research are grouped into two interrelated and complementary topics. First, a flexible motion model with a rate-optimised estimation technique is introduced. The proposed motion model is based on tree structures and allows the high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding at different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Second, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and on the creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal subbands, and a motion search in temporal subbands that finds the optimal motion vectors for the layered motion structure. Exhaustive tests of rate-distortion performance in demanding scalable video coding scenarios show the benefits of applying both the flexible motion model and the various solutions for scalable motion coding.
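The idea of precision limiting through layered bit-plane coding of motion vector values can be sketched as follows. This is an illustrative sketch only: it assumes motion vector components are signed integers in sub-pixel units, uses a sign-magnitude split, and omits entropy coding. The function names and the layering scheme are hypothetical, not the thesis's actual coder.

```python
def split_layers(mv, num_refinement_bits):
    """Split one motion vector component into a coarse base value plus
    refinement bits, most significant bit first (hypothetical layering)."""
    sign = -1 if mv < 0 else 1
    mag = abs(mv)
    base = mag >> num_refinement_bits
    bits = [(mag >> b) & 1 for b in range(num_refinement_bits - 1, -1, -1)]
    return sign, base, bits


def reconstruct(sign, base, bits, layers_received):
    """Rebuild the component from the base layer and the first
    `layers_received` refinement bits; missing bits are treated as zero."""
    mag = base
    for bit in bits[:layers_received]:
        mag = (mag << 1) | bit
    mag <<= len(bits) - layers_received
    return sign * mag


sign, base, bits = split_layers(-13, 2)
approximations = [reconstruct(sign, base, bits, k) for k in range(3)]
```

A decoder that receives only the base layer works with a coarser vector and spends fewer bits on motion; each additional refinement layer halves the precision step until the original value is recovered.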

    Frame-based multiple-description video coding with extended orthogonal filter banks

    We propose a frame-based multiple-description video coder. The analysis filter bank is the extension of an orthogonal filter bank which computes the spatial polyphase components of the original video frames. The output of the filter bank is a set of video sequences which can be compressed with a standard coder. The filter bank design is carried out by taking into account two important requirements for video coding, namely, that the dual synthesis filter bank is FIR, and that loss recovery does not enhance the quantization error. We give explicit results about the required properties of the redundant channel filter and the reconstruction error bounds in case of packet errors. We show that the proposed scheme has good robustness to losses and good performance, both in terms of objective and visual quality, when compared to single-description and other multiple-description video coders based on spatial subsampling. PSNR gains of 5 dB or more are typical for packet loss probability as low as 5%.
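The spatial polyphase components that the abstract above starts from can be illustrated with a small sketch. It assumes a 2x2 polyphase split of a frame stored as a list of rows with even dimensions, and it leaves out the redundant channel filter and the loss-recovery step that the paper actually designs. The function names are hypothetical.

```python
def polyphase_split(frame):
    """Return the four 2x2 spatial polyphase components of a frame,
    ordered by (row offset, column offset): (0,0), (0,1), (1,0), (1,1)."""
    return [[row[dx::2] for row in frame[dy::2]]
            for dy in (0, 1) for dx in (0, 1)]


def polyphase_merge(components, height, width):
    """Interleave the four components back into a full-resolution frame."""
    frame = [[0] * width for _ in range(height)]
    offsets = [(dy, dx) for dy in (0, 1) for dx in (0, 1)]
    for (dy, dx), comp in zip(offsets, components):
        for i, row in enumerate(comp):
            for j, value in enumerate(row):
                frame[2 * i + dy][2 * j + dx] = value
    return frame


frame = [[4 * r + c for c in range(4)] for r in range(4)]
components = polyphase_split(frame)
merged = polyphase_merge(components, 4, 4)
```

Each component is a quarter-resolution version of the frame, so each one can be compressed as an ordinary video sequence and sent as a separate description.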

    Polarization image laser line extraction methods for reflective metal surfaces

    In this work, we propose a novel pipeline method for laser line extraction from images with a polarization image sensor. The proposed method is specially developed for strong laser beam reflections from metal surfaces. For the pre-processing stage, we propose a demosaicing algorithm for color polarizer filter array (CPFA) sensors. This can be implemented by using either one quarter or the full resolution of the sensor. In addition, we propose two methods for optimizing the information available in a 12-channel color polarization image: the first method is based on the minimum linearly polarized irradiance, and the second method is based on the linear polarization intensity. These pre-processing and optimization methods are combined with laser line extraction methods. The laser line extraction is done with either the Polarized Finite Impulse Response (FIR) Center Of Gravity (COG), where the laser line coordinates are computed from the filtered laser intensity distribution, or with the Polarized FIR-Peak, where the laser line coordinates are calculated from the first derivative of the filtered laser signal. The performance of the proposed algorithms is studied experimentally using a laser line scanner assembly made of a polarization camera and a laser line projector operating in the blue wavelength range.
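The FIR-plus-center-of-gravity step described above can be sketched for a single image column. This is a generic illustration under stated assumptions: the intensity profile is a plain list of values, the FIR taps are an arbitrary symmetric smoothing kernel, and the polarization-specific pre-processing from the paper is not reproduced. The function names are hypothetical.

```python
def fir_filter(signal, taps):
    """Apply a centered FIR filter, treating samples outside the
    signal as zero (assumed boundary handling)."""
    half = len(taps) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, tap in enumerate(taps):
            j = i + k - half
            if 0 <= j < n:
                acc += tap * signal[j]
        out.append(acc)
    return out


def center_of_gravity(profile):
    """Intensity-weighted mean position of the profile, in pixels.
    Returns None when the profile carries no energy."""
    total = sum(profile)
    if total == 0:
        return None
    return sum(i * v for i, v in enumerate(profile)) / total


column = [0, 0, 1, 2, 1, 0, 0]
smoothed = fir_filter(column, [0.25, 0.5, 0.25])
line_position = center_of_gravity(smoothed)
```

Running this per column gives a sub-pixel laser line coordinate for each column; a peak-based variant would instead look for the zero crossing of the first derivative of the filtered profile.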

    Optimal prefilters for display enhancement

    Creating images from a set of discrete samples is arguably the most common operation in computer graphics and image processing, lying, for example, at the heart of rendering and image downscaling techniques. Traditional tools for this task are based on classic sampling theory and are modeled under mathematical conditions which are, in most cases, unrealistic; for example, sinc reconstruction – required by Shannon's theorem in order to recover a signal exactly – is impossible to achieve in practice because LCD displays perform a box-like interpolation of the samples. Moreover, when an image is made for a human to look at, it will necessarily undergo some modifications due to the human optical system and all the neural processes involved in vision. Finally, image processing practitioners have noticed that sinc prefiltering – also required by Shannon's theorem – often leads to visually unpleasant images. From these facts, we can deduce that we cannot guarantee, via classic sampling theory, that the signal we see on a display is the best representation of the original image we had in the first place. In this work, we propose a novel family of image prefilters based on modern sampling theory, and on a simple model of how the human visual system perceives an image on a display. The use of modern sampling theory guarantees that the perceived image, based on this model, is indeed the best representation possible, at virtually no computational overhead. We analyze the spectral properties of these prefilters, showing that they offer the possibility of trading off aliasing and ringing, while guaranteeing that images look sharper than those generated with both classic and state-of-the-art filters. 
Finally, we compare it against other solutions in a selection of applications which include Monte Carlo rendering and image downscaling, also giving directions on how to apply it in different contexts.
Displaying images from a discrete set of samples is certainly one of the most common operations in computer graphics and image processing. Traditional tools for this task are based on Shannon's theorem and are modeled under mathematical conditions that are, in most cases, unrealistic; for example, sinc reconstruction, required by Shannon's theorem in order to recover a signal exactly, is impossible in practice, since LCD displays perform a reconstruction closer to interpolation with a box kernel. Moreover, image processing practitioners have noticed that sinc prefiltering, also required by Shannon's theorem, generally leads to visually unpleasant images because of ringing: oscillations near regions of discontinuity in the images. From these facts, we deduce that it is not possible to guarantee, via traditional sampling and reconstruction tools, that the image we observe on a digital display is the best representation of the original image. In this work, we propose a family of prefilters based on generalized sampling theory and on a model of how the optical system of the human eye modifies an image. Proposed by Unser and Aldroubi (1994), generalized sampling theory is more general than the theorem proposed by Shannon, and shows how signals can be prefiltered and reconstructed using kernels other than the sinc. We model the optical system of the eye as a camera with a finite aperture and a thin lens, which, although simple, is sufficient for our purposes.
Besides guaranteeing optimal approximation when the samples are reconstructed by a display and the image is filtered by the model of the human optical system, generalized sampling theory guarantees that these operations are extremely efficient, all linear in the number of input pixels. We also analyze the spectral properties of these filters and of similar techniques in the literature, showing that a good tradeoff between aliasing and ringing (the main artifacts in image sampling and reconstruction) can be obtained, while guaranteeing that the final images are sharper than those generated by existing techniques in the literature. Finally, we show some applications of our technique in image enhancement, adaptation to different viewing distances, image downscaling, and rendering of synthetic images by Monte Carlo methods.

    Modeling the television process

    Also issued as Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1986. Includes bibliographical references. Supported in part by members of the Center for Advanced Television Studies. Michael Anthony Isnardi

    Compression of Spectral Images


    Perceptually Optimized Visualization on Autostereoscopic 3D Displays

    Displays that aim to visualize a 3D scene with realistic depth are known as "3D displays". Due to technical limitations and design decisions, such displays create visible distortions, which are interpreted by the human visual system as artefacts. In the absence of a visual reference (e.g. the original scene is not available for comparison), one can improve the perceived quality of the representations by making the distortions less visible. This thesis proposes a number of signal processing techniques for decreasing the visibility of artefacts on 3D displays. The visual perception of depth is discussed, and the properties (depth cues) of a scene which the brain uses for assessing an image in 3D are identified. Following the physiology of vision, a taxonomy of 3D artefacts is proposed. The taxonomy classifies the artefacts based on their origin and on the way they are interpreted by the human visual system. The principles of operation of the most popular types of 3D displays are explained. Based on the display operation principles, 3D displays are modelled as a signal processing channel. The model is used to explain the process of introducing distortions. It also allows one to identify which optical properties of a display are most relevant to the creation of artefacts. A set of optical properties for dual-view and multiview 3D displays is identified, and a methodology for measuring them is introduced. The measurement methodology allows one to derive the angular visibility and crosstalk of each display element without the need for precision measurement equipment. Based on the measurements, a methodology for creating a quality profile of 3D displays is proposed. The quality profile can be either simulated using the angular brightness function or directly measured from a series of photographs. A comparative study introducing the measurement results on the visual quality and position of the sweet-spots of eleven 3D displays of different types is presented. 
Knowing the sweet-spot position and the quality profile allows for easy comparison between 3D displays. The shape and size of the passband allow the depth and textures of 3D content to be optimized for a given 3D display. Based on knowledge of 3D artefact visibility and an understanding of distortions introduced by 3D displays, a number of signal processing techniques for artefact mitigation are created. A methodology for creating anti-aliasing filters for 3D displays is proposed. For multiview displays, the methodology is extended towards so-called passband optimization, which addresses Moiré, fixed-pattern-noise and ghosting artefacts, which are characteristic of such displays. Additionally, the design of tuneable anti-aliasing filters is presented, along with a framework which allows the user to select the so-called 3D sharpness parameter according to his or her preferences. Finally, a set of real-time algorithms for view-point-based optimization is presented. These algorithms require active user-tracking, which is implemented as a combination of face and eye-tracking. Once the observer position is known, the image on a stereoscopic display is optimised for the derived observation angle and distance. For multiview displays, the combination of precise light re-direction and less-precise face-tracking is used for extending the head parallax. For some user-tracking algorithms, implementation details are given regarding execution of the algorithm on a mobile device or on a desktop computer with a graphics accelerator.

    Wavelets and Subband Coding

    First published in 1995, Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding. The book developed the theory in both continuous and discrete time, and presented important applications. During the past decade, it filled a useful need in explaining a new view of signal processing based on flexible time-frequency analysis and its applications. Since 2007, the authors have retained the copyright and allowed open access to the book.

    Digital Filters and Signal Processing

    Digital filters, together with signal processing, are being employed in new technologies and information systems, and are implemented in different areas and applications. Digital filters and signal processing can be used at no cost, and they can be adapted to different cases with great flexibility and reliability. This book presents advanced developments in digital filters and signal processing methods, covering different case studies. The chapters present the essence of the subject, with the principal approaches to the most recent mathematical models that are being employed worldwide.