
    Filter optimization and complexity reduction for video coding using graph-based transforms

    The basis functions of a lifting transform on graphs are completely determined by the choice of a graph bipartition and of the prediction and update filters. In this work we consider the design of prediction filters that minimize the quadratic prediction error and therefore the energy of the detail coefficients, which gives rise to higher energy compaction. Then, to determine the graph bipartition, we propose a distributed maximum-cut algorithm that significantly reduces the computational cost with respect to the centralized version used in our previous work. The proposed techniques improve both coding performance and computational cost compared to our previous work. This work was supported in part by NSF under grant CCF-1018977.
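
    A minimal sketch of the filter-design idea (assumptions: numpy, synthetic correlated data, and illustrative variable names; this is not the paper's exact formulation). Prediction weights that minimize the quadratic prediction error, and hence the energy of the detail coefficients, can be obtained by least squares:

        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic training data: rows are realizations of a smooth graph signal.
        # x_U: values on the "update" node set, x_P: values on the "prediction" set.
        n_train, n_u, n_p = 500, 6, 4
        base = rng.normal(size=(n_train, 1))
        x_U = base + 0.1 * rng.normal(size=(n_train, n_u))  # correlated neighbors
        x_P = base + 0.1 * rng.normal(size=(n_train, n_p))

        # Least-squares prediction weights W (n_u x n_p): minimize ||x_P - x_U @ W||^2,
        # i.e. the quadratic prediction error / energy of the detail coefficients.
        W, *_ = np.linalg.lstsq(x_U, x_P, rcond=None)

        detail = x_P - x_U @ W               # detail (high-pass) coefficients
        print("detail energy per sample:", np.mean(detail ** 2))
        print("raw energy per sample:   ", np.mean(x_P ** 2))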

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive study of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph and apply GSP tools to process and analyze the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image / video processing. The topics covered include image compression, image restoration, image filtering, and image segmentation.
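
    To make the "image as a graph signal" idea concrete, here is a hedged sketch (numpy only; the 4-connected grid and Gaussian intensity weights are common GSP choices, not specific to this article) that builds a pixel graph for a patch and transforms it into the graph spectral domain via the Laplacian eigenbasis:

        import numpy as np

        def patch_gft(patch, sigma=10.0):
            """Graph Fourier transform of an image patch on a 4-connected pixel
            graph with Gaussian weights w_ij = exp(-(I_i - I_j)^2 / (2 sigma^2))."""
            h, w = patch.shape
            n = h * w
            A = np.zeros((n, n))
            idx = lambda r, c: r * w + c
            for r in range(h):
                for c in range(w):
                    for dr, dc in ((0, 1), (1, 0)):      # right and down neighbors
                        rr, cc = r + dr, c + dc
                        if rr < h and cc < w:
                            wgt = np.exp(-(patch[r, c] - patch[rr, cc]) ** 2
                                         / (2 * sigma ** 2))
                            A[idx(r, c), idx(rr, cc)] = wgt
                            A[idx(rr, cc), idx(r, c)] = wgt
            L = np.diag(A.sum(1)) - A                    # combinatorial graph Laplacian
            evals, U = np.linalg.eigh(L)                 # graph frequencies / GFT basis
            return evals, U.T @ patch.ravel()            # forward GFT coefficients

        patch = np.outer(np.linspace(0, 255, 8), np.ones(8))  # smooth gradient patch
        freqs, gft = patch_gft(patch)
        print("energy in 5 lowest graph frequencies: %.1f%%"
              % (100 * np.sum(gft[:5] ** 2) / np.sum(gft ** 2)))

    For a smooth patch, the structure-adapted graph concentrates signal energy in the low graph frequencies, which is what makes graph spectral compression and filtering effective.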

    Graph Signal Processing: Overview, Challenges and Applications

    Research in Graph Signal Processing (GSP) aims to develop tools for processing data defined on irregular graph domains. In this paper we first provide an overview of core ideas in GSP and their connection to conventional digital signal processing. We then summarize recent progress in developing basic GSP tools, including methods for sampling, filtering, and graph learning. Next, we review progress in several application areas using GSP, including processing and analysis of sensor network data, biological data, and applications to image processing and machine learning. We finish by providing a brief historical perspective to highlight how concepts recently developed in GSP build on prior research in other areas. Comment: To appear, Proceedings of the IEEE.
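
    A small sketch of the graph-filtering tool mentioned above (assumptions: numpy and a toy path graph; the polynomial filter is a standard GSP construction rather than anything from this specific paper). Polynomial filters in the Laplacian are k-hop localized and avoid eigendecomposition entirely:

        import numpy as np

        # Toy path graph on 8 nodes: adjacency and combinatorial Laplacian.
        n = 8
        A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
        L = np.diag(A.sum(1)) - A

        # Degree-2 polynomial low-pass filter h(L) = c0*I + c1*L + c2*L^2;
        # h(lambda) decreases over the Laplacian spectrum [0, 4] of this graph.
        c = [1.0, -0.35, 0.03]
        H = c[0] * np.eye(n) + c[1] * L + c[2] * (L @ L)

        x = np.array([0, 0, 1, 5, 1, 0, 0, 0], dtype=float)  # spiky signal
        print("filtered:", np.round(H @ x, 2))                # smoothed output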

    On the rate-distortion performance and computational efficiency of the Karhunen-Loeve transform for lossy data compression

    We examine the rate-distortion performance and computational complexity of linear transforms for lossy data compression. The goal is to better understand the performance/complexity tradeoffs associated with using the Karhunen-Loeve transform (KLT) and its fast approximations. Since the optimal transform for transform coding is unknown in general, we investigate the performance penalties associated with using the KLT by examining cases where the KLT fails, developing a new transform that corrects the KLT's failures in those examples, and then empirically testing the performance difference between this new transform and the KLT. Experiments demonstrate that while the worst KLT can yield transform coding performance at least 3 dB worse than that of alternative block transforms, the performance penalty associated with using the KLT on real data sets seems to be significantly smaller, giving at most a 0.5 dB difference in our experiments. The KLT and its fast variations studied here range in complexity requirements from O(n^2) to O(n log n) for coding vectors of dimension n. We empirically investigate the rate-distortion performance tradeoffs associated with traversing this range of options. For example, an algorithm with complexity O(n^3/2) and memory O(n) gives a 0.4 dB performance loss relative to the full KLT in our image compression experiments.
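
    A minimal sketch of KLT-based transform coding (assumptions: numpy, a synthetic AR(1)-like source, and uniform scalar quantization; these are illustrative choices, not the paper's exact experimental setup):

        import numpy as np

        rng = np.random.default_rng(1)

        # Synthetic correlated source: Gaussian vectors with Toeplitz covariance.
        n, n_vec, rho = 8, 4000, 0.95
        C = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
        X = rng.multivariate_normal(np.zeros(n), C, size=n_vec)

        # KLT basis = eigenvectors of the sample covariance: O(n^3) to compute
        # once, O(n^2) per vector to apply, as discussed above.
        _, U = np.linalg.eigh(np.cov(X, rowvar=False))

        step = 0.5                          # uniform quantizer step size
        Y = X @ U                           # forward KLT (decorrelates coefficients)
        Yq = step * np.round(Y / step)      # quantize coefficients
        Xhat = Yq @ U.T                     # inverse KLT

        mse = np.mean((X - Xhat) ** 2)
        print("distortion (MSE): %.4f  vs. step^2/12 = %.4f" % (mse, step ** 2 / 12))

    Because the KLT is orthogonal, quantization error in the transform domain maps directly to reconstruction error, which is what makes the rate-distortion comparison across transforms meaningful.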

    Perceptually-Driven Video Coding with the Daala Video Codec

    The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media. Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 2016.
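
    For reference, "simple metrics like PSNR" reduce to a mean-squared-error computation; a quick sketch (numpy, 8-bit images assumed):

        import numpy as np

        def psnr(ref, rec, peak=255.0):
            """Peak signal-to-noise ratio in dB between reference and reconstruction."""
            mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
            return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)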

    Directional Transforms for Video Coding Based on Lifting on Graphs

    In this work we describe and optimize a general scheme based on lifting transforms on graphs for video coding. A graph is constructed to represent the video signal: each pixel becomes a node in the graph, and links between nodes represent similarity between them. Therefore, spatial neighbors and temporal motion-related pixels can be linked, while nonsimilar pixels (e.g., pixels across an edge) may not be. Then, a lifting-based transform, in which filtering operations are performed using linked nodes, is applied to this graph, leading to a 3-dimensional (spatio-temporal) directional transform which can be viewed as an extension of wavelet transforms for video. The design of the proposed scheme requires four main steps: (i) graph construction, (ii) graph splitting, (iii) filter design, and (iv) extension of the transform to different levels of decomposition. We focus on the optimization of these steps in order to obtain an effective transform for video coding. Furthermore, based on this scheme, we propose a coefficient reordering method and an entropy coder, leading to a complete video encoder that achieves better coding performance than a motion-compensated temporal filtering wavelet-based encoder and a simple encoder derived from H.264/AVC that uses similar tools as our proposed encoder (reference software JM15.1 configured to use 1 reference frame, no subpixel motion estimation, 16 × 16 inter and 4 × 4 intra modes). This work was supported in part by NSF under grant CCF-1018977 and by the Spanish Ministry of Economy and Competitiveness under grants TEC2014-53390-P and TEC2014-52289-R.
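
    A hedged sketch of one level of lifting on a graph bipartition (numpy; the simple averaging weights and toy path-graph bipartition are illustrative, not the optimized filters from the paper), showing the invertibility that makes the transform lossless:

        import numpy as np

        x = np.array([3.0, 3.2, 2.9, 3.1, 8.0, 8.1])  # signal on 6 graph nodes
        P = [0, 2, 4]                                  # "prediction" node set
        U = [1, 3, 5]                                  # "update" node set
        # Wp[i, j]: weight with which update node U[j] predicts prediction node
        # P[i]; each P-node is predicted from its linked U-neighbors, equal weights.
        Wp = np.array([[1.0, 0.0, 0.0],
                       [0.5, 0.5, 0.0],
                       [0.0, 0.0, 1.0]])
        Wu = 0.5 * Wp.T                                # a common update-weight choice

        # Forward transform: detail on P, smoothed approximation on U.
        d = x[P] - Wp @ x[U]
        s = x[U] + Wu @ d

        # Inverse transform: exactly undo the two steps, in reverse order.
        xu = s - Wu @ d
        xp = d + Wp @ xu
        print("perfect reconstruction:",
              np.allclose(xp, x[P]) and np.allclose(xu, x[U]))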

    3D high definition video coding on a GPU-based heterogeneous system

    H.264/MVC is a standard for supporting the sensation of 3D, based on coding from 2 (stereo) to N views. H.264/MVC adopts many coding options inherited from single-view H.264/AVC, and thus its complexity is even higher, mainly because the number of processed views is larger. In this manuscript, we aim at an efficient parallelization of the most computationally intensive video encoding module for stereo sequences: inter prediction and its collaborative execution on a heterogeneous platform. The proposal is based on an efficient dynamic load balancing algorithm and on breaking encoding dependencies. Experimental results demonstrate the proposed algorithm's ability to reduce the encoding time for different stereo high-definition sequences. Speed-up values of up to 90× were obtained when compared with the reference encoder on the same platform. Moreover, the proposed algorithm also provides a more energy-efficient approach and hence requires less energy than the sequential reference algorithm.
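
    A minimal sketch of the dynamic load-balancing idea (Python threads standing in for the CPU and GPU workers; encode_chunk, the device speeds, and the chunk sizes are hypothetical placeholders, not the paper's H.264/MVC kernels). Heterogeneous workers pull work from a shared queue at their own pace, so the faster device naturally takes a larger share:

        import queue
        import threading
        import time

        def encode_chunk(rows, speed):
            """Hypothetical stand-in for inter prediction over macroblock rows."""
            time.sleep(len(rows) * 0.001 / speed)  # faster device, shorter service time

        def worker(name, speed, work, done):
            while True:
                try:
                    rows = work.get_nowait()
                except queue.Empty:
                    return
                encode_chunk(rows, speed)
                done.append((name, len(rows)))

        # Split a 1080p frame (68 macroblock rows of 16 pixels) into small chunks.
        work, done = queue.Queue(), []
        for start in range(0, 68, 4):
            work.put(range(start, min(start + 4, 68)))

        threads = [threading.Thread(target=worker, args=(n, s, work, done))
                   for n, s in (("GPU", 8.0), ("CPU", 1.0))]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        for name in ("GPU", "CPU"):
            print(name, "encoded", sum(k for w, k in done if w == name), "rows")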