    Contours and contrast

    Contrast in photographic and computer-generated imagery communicates the colour and lightness differences that would be perceived when viewing the represented scene. Due to depiction constraints, the amount of displayable contrast is limited, which reduces the image's ability to represent the scene accurately. A local contrast enhancement technique called unsharp masking can overcome these constraints by adding high-frequency contours to an image that increase its apparent contrast. In three novel algorithms inspired by unsharp masking, specialized local contrast enhancements are shown to overcome the constraints of a limited dynamic range and of an achromatic palette, and to improve the rendering of 3D shapes and scenes. The Beyond Tone Mapping approach restores original HDR contrast to its tone-mapped LDR counterpart by adding high-frequency colour contours to the LDR image while preserving its luminance. Apparent Greyscale is a multi-scale two-step technique that first converts colour images and video to greyscale according to their chromatic lightness, then restores diminished colour contrast with high-frequency luminance contours. Finally, 3D Unsharp Masking performs scene-coherent enhancement by introducing 3D high-frequency luminance contours to emphasize the details, shapes, tonal range and spatial organization of a 3D scene within the rendering pipeline. As a perceptual justification, it is argued that a local contrast enhancement made with unsharp masking is related to the Cornsweet illusion, and that this may explain its effect on apparent contrast.
    For many years, the realistic creation of virtual characters has been a central part of computer graphics research. Nevertheless, several problems have remained unsolved. Among them is the creation of character animations, which is still time-consuming with traditional skeleton-based approaches. A further challenge is the passive capture of actors wearing everyday clothing. Moreover, in contrast to the numerous skeleton-based approaches, only a few methods exist for processing and modifying mesh animations. In this thesis we present algorithms that address each of these tasks. Our first approach consists of two mesh-based methods for simplifying character animation. Although the kinematic skeleton is set aside, both methods can be integrated directly into the traditional pipeline, enabling the creation of animations with lifelike body deformations. We then present three passive capture methods for body motion and performance that use a deformable 3D model to represent the scene. These methods can be used to jointly reconstruct temporally and spatially coherent geometry, motion and surface textures, which may also vary over time. Recordings of loose, everyday clothing pose no problem. In addition, the high-quality reconstructions enable the realistic rendering of 3D video sequences. Finally, two novel algorithms for processing mesh animations are described. While the first algorithm enables the fully automatic conversion of mesh animations into skeleton-based animations, the second allows the automatic conversion of mesh animations into so-called animation collages, a new artistic style for presenting animation. The methods described in this dissertation can be regarded as solutions to specific problems, but also as important building blocks of larger applications. Taken together, they form a powerful system for the accurate capture, manipulation and realistic rendering of artistic performances, whose capabilities go beyond those of many related capture techniques. In this way, we can capture an actor's motion, time-varying details and texture information and convert them into a character animation carrying complete information, which can be reused immediately but is also suitable for realistically rendering the actor from arbitrary viewpoints.
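
As an illustration of the unsharp masking operation that the Contours and contrast abstract builds on, the following is a minimal sketch on a one-dimensional luminance profile. It is not the thesis's algorithm: the function names, the Gaussian low-pass filter, and the default `sigma` and `amount` values are assumptions chosen for the example. The output adds a scaled copy of the high-frequency residual (signal minus its blurred version) back onto the signal.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian weights covering roughly +/- 3 sigma (assumed width)."""
    radius = max(1, round(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Low-pass filter the signal, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # repeat the edge samples
            acc += w * signal[j]
        out.append(acc)
    return out

def unsharp_mask(signal, sigma=2.0, amount=0.5):
    """Return signal + amount * (signal - blurred signal)."""
    low = blur(signal, sigma)
    return [s + amount * (s - l) for s, l in zip(signal, low)]

# A step edge between two flat luminance levels.
step = [0.4] * 16 + [0.6] * 16
enhanced = unsharp_mask(step)
```

On the step edge, the enhanced profile undershoots on the dark side and overshoots on the bright side of the edge, while samples far from the edge keep their original values. This local over- and undershoot is the kind of contour profile that the abstract relates to the Cornsweet illusion.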

    Video coding: prioritizing the lowest coding cost in rate-distortion optimization

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia Elétrica, Florianópolis, 2009. This work proposes two new video compression strategies based on rate-distortion (RD) optimized algorithms, aimed at typical digital video applications operating at low bit rates. The proposed strategies are implemented in a video encoder based on the H.264 standard, which has high computational complexity, mainly because of the large number of coding modes available. Two approaches are presented for reducing this complexity while keeping RD performance close to that of the RD-optimized H.264 encoder using exhaustive search. The first approach (termed rate sorting and truncation, RST) sorts both the motion vectors (MVs) and the coding modes in ascending order of bit rate. The encoding process is stopped when the rate of the new MVs and coding modes exceeds the lowest rate already obtained for a pre-established image quality level, so a large number of MVs and several coding modes are discarded before their distortion is evaluated. Besides yielding a significant complexity reduction, the process remains optimized in the RD sense. The second approach is a fast motion estimation algorithm (termed logarithmic diamond shape search, LDSS) based on the distribution profile of the MVs of the RD-optimized H.264 encoder. Combining the RST strategy with the LDSS algorithm reduces the computational burden by up to 98%, with marginal loss of RD performance.
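
The rate sorting and truncation idea described above can be sketched as follows. This is a simplified illustration, not the thesis's encoder: the function `rst_search`, its callback parameters, and the assumption that the total rate is the sum of a mode rate and an MV rate are all hypothetical. Candidates are visited in ascending rate order, and the search stops as soon as a candidate's rate can no longer beat the lowest rate already found for the required quality, so its distortion is never computed.

```python
import math

def rst_search(modes, mvs, mode_rate, mv_rate, distortion, max_distortion):
    """Find the lowest-rate (mode, mv) pair whose distortion meets the target.

    Assumes rate(mode, mv) = mode_rate(mode) + mv_rate(mv) with non-negative
    terms (an assumption made for this sketch). Returns the best pair, its
    rate, and the number of distortion evaluations performed.
    """
    best, best_rate, evaluated = None, math.inf, 0
    for mode in sorted(modes, key=mode_rate):
        if mode_rate(mode) >= best_rate:
            break  # every remaining mode costs at least as much
        for mv in sorted(mvs, key=mv_rate):
            rate = mode_rate(mode) + mv_rate(mv)
            if rate >= best_rate:
                break  # truncation: skip the costly distortion check
            evaluated += 1
            if distortion(mode, mv) <= max_distortion:
                best, best_rate = (mode, mv), rate
                break  # later MVs for this mode only cost more bits
    return best, best_rate, evaluated

# Toy data (hypothetical rates and distortions, true motion at (1, 0)).
MODE_RATE = {"skip": 1, "16x16": 3, "8x8": 6}
MODE_BASE = {"skip": 20, "16x16": 5, "8x8": 2}
MODES = list(MODE_RATE)
MVS = [(0, 0), (1, 0), (0, 1), (2, 2)]

def toy_mv_rate(mv):
    return 1 + abs(mv[0]) + abs(mv[1])

def toy_distortion(mode, mv):
    return MODE_BASE[mode] + 4 * abs(mv[0] - 1) + 4 * abs(mv[1])

result = rst_search(MODES, MVS, MODE_RATE.get, toy_mv_rate, toy_distortion, 8)
```

In this toy setting the search returns the same pair as an exhaustive scan over all twelve combinations while evaluating the distortion of only some of them. How much is saved in a real encoder depends on how the rates and distortions are actually distributed.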

    SPIHT image coding: analysis, improvements and applications.

    Image compression plays an important role in image storage and transmission. In popular Internet and mobile communication applications, image coding is required to be not only efficient but also scalable. Recent wavelet techniques provide a way to achieve efficient and scalable image coding. SPIHT (set partitioning in hierarchical trees) is one such algorithm based on the wavelet transform. This thesis analyses and improves the SPIHT algorithm. The preliminary part of the thesis investigates two-dimensional multi-resolution decomposition for image coding using the wavelet transform, which is reviewed and analysed systematically. The wavelet transform is implemented using filter banks, and z-domain proofs are given for the key implementation steps. A wavelet transform scheme for arbitrarily sized images is proposed. The statistical properties of the wavelet coefficients (the output of the wavelet transform) are explored for natural images. The energy in the transform domain is localised and highly concentrated in the low-resolution subband. The wavelet coefficients are DC-biased, and the gravity centre of most octave-segmented value sections (which correspond to the binary bit-planes) is offset by approximately one eighth of the section range from the geometrical centre. The intra-subband correlation coefficients are the largest, followed by the inter-level correlation coefficients, and then by the inter-subband correlation coefficients on the same resolution level, which are trivial. These statistical properties explain the success of the SPIHT algorithm and lead to further improvements. The subsequent parts of the thesis examine the SPIHT algorithm. The concepts of successive approximation quantisation and ordered bit-plane coding are highlighted. The procedure of SPIHT image coding is demonstrated with a simple example. A solution for arbitrarily sized images is proposed. Seven measures are proposed to improve the SPIHT algorithm. 
Three DC-level shifting schemes are discussed, and the one that subtracts the geometrical centre in the image domain is selected in the thesis. Virtual trees are introduced to hold more wavelet coefficients in each of the initial sets. A scheme is proposed to reduce the redundancy in the coding bit-stream by omitting predictable symbols. The quantisation of wavelet coefficients is offset by one eighth from the geometrical centre. A pre-processing technique is proposed to speed up the significance test of trees, and the magnitudes of the wavelet coefficients are smoothed during pre-processing for lossy image coding. The optimisation of arithmetic coding is also discussed. Experimental results show that these improvements to SPIHT yield a significant performance gain. The running time is reduced by up to a half. The PSNR (peak signal-to-noise ratio) is improved substantially at very low bit rates, by up to 12 dB in the extreme case. Moderate improvements are also obtained at high bit rates. The SPIHT algorithm is then applied to lossless image coding, and various wavelet transforms are evaluated for this purpose. Experimental results show that, among the transforms used, the interpolating transform (4, 4) and the S+P transform (2+2, 2) are the best for natural images, the interpolating transform (4, 2) is the best for CT images, and the bi-orthogonal transform (9, 7) is always the worst. Content-based lossless coding of a CT head image is presented in the thesis, using segmentation and SPIHT. Although the performance gain is limited in the experiments, it shows the potential advantage of content-based image coding.
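
The successive approximation quantisation and bit-plane ordering mentioned in this abstract can be illustrated with a small sketch. It deliberately omits SPIHT's set partitioning and hierarchical trees, and the names `bit_planes` and `reconstruct` are hypothetical. Integer coefficient magnitudes are sent most significant bit-plane first, and a decoder that stops early places each nonzero coefficient at the centre of its remaining uncertainty interval. The centre placement is an assumption of this sketch; the thesis's one-eighth offset is not reproduced here.

```python
def bit_planes(coeffs, num_planes):
    """Split a non-empty list of integer coefficients into sign flags and
    magnitude bit-planes, most significant plane first."""
    mags = [abs(c) for c in coeffs]
    top = max(max(mags).bit_length() - 1, 0)  # index of the highest set bit
    planes = []
    for shift in range(top, max(top - num_planes, -1), -1):
        planes.append([(m >> shift) & 1 for m in mags])
    signs = [c < 0 for c in coeffs]
    return top, signs, planes

def reconstruct(top, signs, planes):
    """Rebuild approximate coefficients from the planes received so far."""
    mags = [0] * len(signs)
    for i, plane in enumerate(planes):
        shift = top - i
        for j, bit in enumerate(plane):
            mags[j] |= bit << shift
    lowest = top - len(planes) + 1        # lowest plane that was decoded
    offset = ((1 << lowest) - 1) / 2      # centre of the undecoded range
    out = []
    for mag, negative in zip(mags, signs):
        value = mag + offset if mag > 0 else 0.0
        out.append(-value if negative else value)
    return out

coeffs = [34, -20, 5, 0, -3]
top, signs, planes = bit_planes(coeffs, 2)   # keep only the two top planes
approx = reconstruct(top, signs, planes)
```

Each additional bit-plane halves the uncertainty interval, which is what makes the bit-stream embedded: decoding can stop after any plane and still give a usable approximation.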

    An Overview of the Visual Optimization Tools in JPEG 2000

    The human visual system plays a key role in the final perceived quality of compressed images. It is therefore desirable to allow system designers and users to take advantage of current knowledge of visual perception and visual models in a compression system. In this paper, we review the various tools in JPEG-2000 that allow its users to exploit many properties of the human visual system, such as spatial frequency sensitivity, color sensitivity, and visual masking effects. We show that the visual tool sets in JPEG-2000 are much richer than what is achievable in JPEG, where only spatially invariant frequency weighting can be exploited. As a result, visually optimized JPEG-2000 images can usually have much better visual quality than visually optimized JPEG images at the same bit rates. Some visual comparisons between different visual optimization tools, as well as some visual comparisons between JPEG-2000 and JPEG, will be shown.