626 research outputs found

    End-to-end security for video distribution

    Get PDF

    Design and Optimization of Graph Transform for Image and Video Compression

    Get PDF
    The main contribution of this thesis is the introduction of new methods for designing adaptive transforms for image and video compression. Exploiting graph signal processing techniques, we develop new graph construction methods targeted for image and video compression applications. In this way, we obtain a graph that is, at the same time, a good representation of the image and easy to transmit to the decoder. To do so, we investigate different research directions. First, we propose a new method for graph construction that employs innovative edge metrics, quantization and edge prediction techniques. Then, we propose to use a graph learning approach and we introduce a new graph learning algorithm targeted for image compression that defines the connectivities between pixels by taking into consideration the coding of the image signal and the graph topology in rate-distortion term. Moreover, we also present a new superpixel-driven graph transform that uses clusters of superpixel as coding blocks and then computes the graph transform inside each region. In the second part of this work, we exploit graphs to design directional transforms. In fact, an efficient representation of the image directional information is extremely important in order to obtain high performance image and video coding. In this thesis, we present a new directional transform, called Steerable Discrete Cosine Transform (SDCT). This new transform can be obtained by steering the 2D-DCT basis in any chosen direction. Moreover, we can also use more complex steering patterns than a single pure rotation. In order to show the advantages of the SDCT, we present a few image and video compression methods based on this new directional transform. The obtained results show that the SDCT can be efficiently applied to image and video compression and it outperforms the classical DCT and other directional transforms. Along the same lines, we present also a new generalization of the DFT, called Steerable DFT (SDFT). Differently from the SDCT, the SDFT can be defined in one or two dimensions. The 1D-SDFT represents a rotation in the complex plane, instead the 2D-SDFT performs a rotation in the 2D Euclidean space

    Energy efficient hardware acceleration of multimedia processing tools

    Get PDF
    The world of mobile devices is experiencing an ongoing trend of feature enhancement and generalpurpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks Based on the survey that this thesis presents on modem video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work m this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high level techniques such as redundant computation elimination, parallelism and low switching computation structures. Both architectures compare favourably against the relevant pnor art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation - namely, both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions .Results show an improvement on state of the art algorithms with future potential for even greater savings

    A network transparent, retained mode multimedia processing framework for the Linux operating system environment

    Get PDF
    Die Arbeit prĂ€sentiert ein Multimedia-Framework fĂŒr Linux, das im Unterschied zu frĂŒheren Arbeiten auf den Ideen "retained-mode processing" und "lazy evaluation" basiert: Statt Transformationen unmittelbar auszufĂŒhren, wird eine abstrakte ReprĂ€sentation aller Medienelemente aufgebaut. "renderer"-Treiber fungieren als Übersetzer, die diese Darstellung zur Laufzeit in konkrete Operationen umsetzen, wobei das Datenmodell zahlreiche Optimierungen zur Reduktion der Anzahl der Schritte oder der Minimierung von Kommunikation erlaubt. Dies erlaubt ein stark vereinfachtes Programmiermodell bei gleichzeitiger Effizienzsteigerung. "renderer"-Treiber können zur AusfĂŒhrung von Transformationen den lokalen Prozessor verwenden, oder können die Operationen delegieren. In der Arbeit wird eine Erweiterung des X Window Systems um Mechanismen zur Medienverarbeitung vorgestellt, sowie ein "renderer"-Treiber, der diese zur Delegation der Verarbeitung nutzt

    Voronoi Constellations for Coherent Fiber-Optic Communication Systems

    Get PDF
    The increasing demand for higher data rates is driving the adoption of high-spectral-efficiency (SE) transmission in communication systems. The well-known 1.53 dB gap between Shannon\u27s capacity and the mutual information (MI) of uniform quadrature amplitude modulation (QAM) formats indicates the importance of power efficiency, particularly in high-SE transmission scenarios, such as fiber-optic communication systems and wireless backhaul links. Shaping techniques are the only way to close this gap, by adapting the uniform input distribution to the capacity-achieving distribution. The two categories of shaping are probabilistic shaping (PS) and geometric shaping (GS). Various methods have been proposed for performing PS and GS, each with distinct implementation complexity and performance characteristics. In general, the complexity of these methods grows dramatically with the SE and number of dimensions.Among different methods, multidimensional Voronoi constellations (VCs) provide a good trade-off between high shaping gains and low-complexity encoding/decoding algorithms due to their nice geometric structures. However, VCs with high shaping gains are usually very large and the huge cardinality makes system analysis and design cumbersome, which motives this thesis.In this thesis, we develop a set of methods to make VCs applicable to communication systems with a low complexity. The encoding and decoding, labeling, and coded modulation schemes of VCs are investigated. Various system performance metrics including uncoded/coded bit error rate, MI, and generalized mutual information (GMI) are studied and compared with QAM formats for both the additive white Gaussian noise channel and nonlinear fiber channels. We show that the proposed methods preserve high shaping gains of VCs, enabling significant improvements on system performance for high-SE transmission in both the additive white Gaussian noise channel and nonlinear fiber channel. In addition, we propose general algorithms for estimating the MI and GMI, and approximating the log-likelihood ratios in soft-decision forward error correction codes for very large constellations

    DIAS Research Report 2006

    Get PDF

    Scalable video compression with optimized visual performance and random accessibility

    Full text link
    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video
    • 

    corecore