35 research outputs found

    DCT and DST Filtering with Sparse Graph Operators

    Graph filtering is a fundamental tool in graph signal processing. Polynomial graph filters (PGFs), defined as polynomials of a fundamental graph operator, can be implemented in the vertex domain, and usually have a lower complexity than frequency domain filter implementations. In this paper, we focus on the design of filters for graphs with graph Fourier transform (GFT) corresponding to a discrete trigonometric transform (DTT), i.e., one of 8 types of discrete cosine transforms (DCT) and 8 types of discrete sine transforms (DST). In this case, we show that multiple sparse graph operators can be identified, which allows us to propose a generalization of PGF design: the multivariate polynomial graph filter (MPGF). First, for the widely used DCT-II (type-2 DCT), we characterize a set of sparse graph operators that share the DCT-II matrix as their common eigenvector matrix. This set contains the well-known connected line graph. These sparse operators can be viewed as graph filters operating in the DCT domain, which allows us to approximate any DCT graph filter by an MPGF, leading to a design with more degrees of freedom than the conventional PGF approach. Then, we extend those results to all of the 16 DTTs as well as their 2D versions, and show how their associated sets of multiple graph operators can be determined. We demonstrate experimentally that ideal low-pass and exponential DCT/DST filters can be approximated with higher accuracy at similar runtime complexity. Finally, we apply our method to transform-type selection in a video codec, AV1, where we demonstrate significant encoding time savings with a negligible compression loss. Comment: 16 pages, 11 figures, 5 tables
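
    The link between the DCT-II and the line graph mentioned in this abstract can be checked numerically. The following is a minimal sketch, not the paper's method: it assumes an unweighted path (line) graph with combinatorial Laplacian L = D - A, and the helper names (path_laplacian, dct2_basis, polynomial_filter) are hypothetical. It verifies that the orthonormal DCT-II basis diagonalizes L, and that a single-operator polynomial filter applied in the vertex domain matches the same filter applied in the DCT domain. The multivariate (MPGF) design itself is not reproduced here.

```python
import numpy as np

def path_laplacian(n):
    """Combinatorial Laplacian L = D - A of an unweighted path graph (assumed setup)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return np.diag(adj.sum(axis=1)) - adj

def dct2_basis(n):
    """Orthonormal DCT-II basis; column k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    rows = np.cos(np.pi * k * (m + 0.5) / n)
    rows[0] *= np.sqrt(1.0 / n)
    rows[1:] *= np.sqrt(2.0 / n)
    return rows.T

def polynomial_filter(lap, coeffs, signal):
    """Apply h(L) x = sum_i coeffs[i] * L^i x using only matrix-vector products."""
    power = np.asarray(signal, dtype=float).copy()
    out = np.zeros_like(power)
    for c in coeffs:
        out += c * power
        power = lap @ power
    return out

n = 8
lap = path_laplacian(n)
basis = dct2_basis(n)
# Known eigenvalues of the path-graph Laplacian.
lam = 2.0 - 2.0 * np.cos(np.pi * np.arange(n) / n)
print(np.allclose(basis.T @ lap @ basis, np.diag(lam)))

# Vertex-domain filtering versus DCT-domain filtering with the same response.
coeffs = [1.0, -0.5, 0.05]  # arbitrary illustrative coefficients
x = np.random.default_rng(0).standard_normal(n)
response = sum(c * lam**i for i, c in enumerate(coeffs))
print(np.allclose(polynomial_filter(lap, coeffs, x), basis @ (response * (basis.T @ x))))
```

    Because the path Laplacian is tridiagonal, each matrix-vector product in polynomial_filter touches only neighboring samples, which is the kind of low-complexity vertex-domain implementation the abstract refers to.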

    Luma/Chroma Component Wise Weighted Pixel Inter Prediction

    After obtaining a prediction for a current block using inter-prediction, block adaptive local weighted prediction (BAWP) can be used to adjust the prediction based on luminance (i.e., brightness) and/or chroma (i.e., color) differences between the current block and its reference block. This improves the prediction quality and, in turn, can reduce the prediction residuals that are encoded. Techniques that reduce the signaling costs associated with signaling BAWP for the luminance and chrominance components are described.
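
    The abstract does not state how the prediction is adjusted, so the following is only an illustrative sketch of the general idea of weighted prediction. It assumes a linear model (adjusted = scale * prediction + offset) whose parameters are fitted by least squares to neighboring samples of the current and reference blocks. The function names, the least-squares fit, and the 8-bit default are all assumptions, not details from the source, and the signaling techniques the abstract describes are not modeled.

```python
import numpy as np

def fit_linear_weights(ref_neighbors, cur_neighbors):
    """Least-squares fit of cur ~ scale * ref + offset (assumed model, hypothetical helper)."""
    ref = np.asarray(ref_neighbors, dtype=float).ravel()
    cur = np.asarray(cur_neighbors, dtype=float).ravel()
    var = ref.var()
    if var == 0.0:
        # Flat reference neighborhood: fall back to a pure brightness offset.
        return 1.0, float(cur.mean() - ref.mean())
    cov = ((ref - ref.mean()) * (cur - cur.mean())).mean()
    scale = float(cov / var)
    offset = float(cur.mean() - scale * ref.mean())
    return scale, offset

def adjust_prediction(pred, scale, offset, bit_depth=8):
    """Apply the linear adjustment, then round and clip to the valid sample range."""
    adjusted = scale * np.asarray(pred, dtype=float) + offset
    return np.clip(np.rint(adjusted), 0, (1 << bit_depth) - 1).astype(np.int64)

# Illustrative values only: the current neighborhood is brighter than the reference.
ref_nb = [10, 20, 30, 40]
cur_nb = [19, 34, 49, 64]
scale, offset = fit_linear_weights(ref_nb, cur_nb)
adjusted_block = adjust_prediction([[100, 120], [140, 200]], scale, offset)
```

    In this sketch the same routine would be run separately for each component (luma and each chroma plane) when per-component adjustment is wanted.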

    Side-information generation for temporally and spatially scalable Wyner-Ziv codecs

    The distributed video coding paradigm enables video codecs to operate with reversed complexity, in which the complexity is shifted from the encoder toward the decoder. Its performance is heavily dependent on the quality of the side information generated by motion estimation at the decoder. We compare the rate-distortion performance of different side-information estimators, for both temporally and spatially scalable Wyner-Ziv codecs. For the temporally scalable codec, we compare an established method with a new algorithm that uses a linear-motion model to produce side information. As a continuation of previous works, in this paper we propose to use a super-resolution method to upsample the non-key frame, for the spatially scalable codec, using the key frames as reference. We verify the performance of the spatially scalable WZ coding using the state-of-the-art video coding standard H.264/AVC.

    Mediabeads: An architecture for Path-Enhanced Media applications

    Tagging digital media, such as photos and videos, with capture time and location information has previously been proposed to enhance its organization and presentation. We believe that the full path traveled during media capture, rather than just the media capture locations, provides a much richer context for understanding and "re-living" a trip experience, and offers many possibilities for novel applications. We introduce the concept of path-enhanced media, in which media is associated and stored together with a densely sampled path in time and space, and we present the MediaBeads architecture for capturing, representing, browsing, editing, presenting, and searching this data. The architecture includes, among other things, novel data representations, new algorithms for automatically building movie-like presentations of trips, and novel search applications.