31 research outputs found

    Steerable Discrete Cosine Transform

    In image compression, classical block-based separable transforms tend to be inefficient when image blocks contain arbitrarily shaped discontinuities. For this reason, transforms incorporating directional information are an appealing alternative. In this paper, we propose a new approach to this problem, namely a discrete cosine transform (DCT) that can be steered in any chosen direction. Such a transform, called the steerable DCT (SDCT), allows pairs of basis vectors to be rotated flexibly and enables precise matching of directionality in each image block, achieving improved coding efficiency. The optimal rotation angles for the SDCT can be represented as the solution of a suitable rate-distortion (RD) problem. We propose iterative methods to search for such a solution, and we develop a fully fledged image encoder to compare our techniques in practice with other competing transforms. Analytical and numerical results show that the SDCT outperforms both the DCT and state-of-the-art directional transforms.
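To make the idea of rotating pairs of basis vectors concrete, here is a minimal sketch in plain Python. It is not the authors' encoder. It assumes that the rotated pairs are the 2-D DCT basis vectors with transposed indices, (k, l) and (l, k), and that one angle `theta` is shared by all pairs; the abstract itself chooses the angles by solving a rate-distortion problem. The function names `dct_matrix`, `dct2` and `steer` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th 1-D basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct2(block):
    """Separable 2-D DCT of a square block: C = D * X * D^T."""
    n = len(block)
    d = dct_matrix(n)
    tmp = [[sum(d[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][j] * d[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]


def steer(coeffs, theta):
    """Rotate each coefficient pair (c[k][l], c[l][k]), k < l, by theta.

    Assumption: rotating the basis pair by theta is equivalent to applying
    the same 2x2 rotation to the corresponding pair of DCT coefficients.
    Diagonal coefficients (k == l) have no partner and are left unchanged.
    """
    n = len(coeffs)
    out = [row[:] for row in coeffs]
    c, s = math.cos(theta), math.sin(theta)
    for k in range(n):
        for l in range(k + 1, n):
            a, b = coeffs[k][l], coeffs[l][k]
            out[k][l] = c * a + s * b
            out[l][k] = -s * a + c * b
    return out


# Hypothetical 4x4 test block.
block = [[float((3 * i + 5 * j) % 7) for j in range(4)] for i in range(4)]
plain = dct2(block)
steered = steer(plain, 0.3)
```

Because each step is a rotation, the steered transform stays orthonormal: it preserves the energy of the block, and applying `steer(steered, -0.3)` recovers the plain DCT coefficients.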

    Non-Predictive Multistage Lattice Vector Quantization Video Coding

    A case study in identifying acceptable bitrates for human face recognition tasks

    Face recognition from images or video footage requires a certain level of recorded image quality. This paper derives acceptable bitrates (relating to levels of compression and consequently quality) of footage with human faces, using an industry implementation of the standard H.264/MPEG-4 AVC and the Closed-Circuit Television (CCTV) recording systems on London buses. The London buses application is utilized as a case study for setting up a methodology and implementing suitable data analysis for face recognition from recorded footage, which has been degraded by compression. The majority of CCTV recorders on buses use a proprietary format based on the H.264/MPEG-4 AVC video coding standard, exploiting both spatial and temporal redundancy. Low bitrates are favored in the CCTV industry for saving storage and transmission bandwidth, but they compromise the usefulness of the recorded imagery. In this context, usefulness is determined by the presence of enough facial information remaining in the compressed image to allow a specialist to recognize a person. The investigation includes four steps: (1) Development of a video dataset representative of typical CCTV bus scenarios. (2) Selection and grouping of video scenes based on local (facial) and global (entire scene) content properties. (3) Psychophysical investigations to identify the key scenes that are most affected by compression, using an industry implementation of H.264/MPEG-4 AVC. (4) Testing of CCTV recording systems on buses with the key scenes and further psychophysical investigations. The results showed a dependency upon scene content properties. Very dark scenes and scenes with high levels of spatial–temporal busyness were the most challenging to compress, requiring higher bitrates to maintain useful information.

    Enhanced error-resilient video transport over MIMO systems using multiple descriptions

    Expectation Propagation (Minka, 2001) is a widely successful algorithm for variational inference. EP is an iterative algorithm that can be used to approximate complicated distributions, most often posterior distributions arising in Bayesian settings. Its most typical use is to find a Gaussian approximation to posterior distributions, and in many applications of this type, EP performs extremely well. Surprisingly, despite its widespread use, there are very few theoretical guarantees on Gaussian EP. A basic requirement of statistical inference methods is that they should perform well in the limit of infinite data, and here we show that it is indeed the case for EP. In the classical large data limit, where the Bernstein-von Mises theorem applies, we prove that EP is exact, meaning that it recovers the correct Gaussian posterior. We prove further that in the same limit EP behaves like a simpler algorithm we call averaged-EP (aEP), and in turn aEP behaves similarly to the Newton algorithm. This correspondence yields interesting insights into the dynamic behavior of EP, for example that it may diverge under poor initialization, just like the Newton algorithm. EP is a simple algorithm to state, but a difficult one to study. Our results should facilitate further research into the theoretical properties of this important method.

    HD-VideoBench: A benchmark for evaluating high definition digital video applications

    HD-VideoBench is a benchmark devoted to high definition (HD) digital video processing. It includes a set of video encoders and decoders (Codecs) for the MPEG-2, MPEG-4 and H.264 video standards. The applications were carefully selected taking into account the quality and portability of the code, the representativeness of the video application domain, the availability of high performance optimizations and the distribution under a free license. Additionally, HD-VideoBench defines a set of input sequences and configuration parameters of the video Codecs which are appropriate for the HD video domain.

    VideoWall Bench: A Benchmark for Evaluating Hardware Accelerated Video Decoding on Linux

    VideoWall Bench is a benchmark script for measuring video decoding capabilities using hardware acceleration on Linux. Intel introduced the Video Acceleration API (VA-API), which provides access to graphics hardware for accelerated video processing. VA-API provides a set of video decoders (Codecs) for the H.264 video standard. The script implements a video wall methodology, in which multiple videos are decoded at the same time. Using this method, users can stress the multiple-video decoding capabilities of a platform and at the same time measure processor usage for the decoding process. VideoWall Bench benchmarks video decoding performance by measuring processor utilization, memory utilization, total frames per second (FPS) and time fluctuation in the video decoding process. Additionally, VideoWall Bench also includes set

    Towards Hybrid-Optimization Video Coding

    Video coding is essentially a mathematical optimization problem of rate and distortion. To solve this complex optimization problem, two popular video coding frameworks have been developed: block-based hybrid video coding and end-to-end learned video coding. If we rethink video coding from the perspective of optimization, we find that the two existing frameworks represent two directions of optimization solutions. Block-based hybrid coding represents the discrete optimization solution because its coding modes are unrelated to one another and mathematically discrete. It searches for the best one among multiple starting points (i.e. modes). However, the search is not efficient enough. On the other hand, end-to-end learned coding represents the continuous optimization solution because gradient descent operates on a continuous function. It optimizes a group of model parameters efficiently with a numerical algorithm. However, limited to only one starting point, it easily falls into a local optimum. To better solve the optimization problem, we propose to regard video coding as a hybrid of the discrete and continuous optimization problems, and to use both search and a numerical algorithm to solve it. Our idea is to provide multiple discrete starting points in the global space and efficiently find the local optimum around each point with a numerical algorithm. Finally, we search for the global optimum among those local optima. Guided by the hybrid optimization idea, we design a hybrid optimization video coding framework, which is built entirely on continuous deep networks and also contains some discrete modes. We conduct a comprehensive set of experiments. Compared to the continuous optimization framework, our method outperforms pure learned video coding methods. Meanwhile, compared to the discrete optimization framework, our method achieves comparable performance to HEVC reference software HM16.10 in PSNR.
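The hybrid idea described in this abstract (several discrete starting points, a numerical refinement around each, then a search among the refined results) can be illustrated on a toy problem. The sketch below is not the paper's framework: the one-dimensional objective, the starting points, the learning rate and the step count are all hypothetical stand-ins for a rate-distortion cost, a set of coding modes and network training.

```python
def objective(x):
    """Toy non-convex cost with two local minima (stand-in for an RD cost)."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytic derivative of the toy objective."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def descend(start, lr=0.05, steps=200):
    """Continuous part: plain gradient descent from one starting point."""
    x = start
    for _ in range(steps):
        x -= lr * gradient(x)
    return x


def hybrid_search(starts):
    """Discrete part: refine every start, then keep the best local optimum."""
    local_optima = [descend(s) for s in starts]
    return min(local_optima, key=objective)


# A single start on the right-hand side settles in the worse local minimum.
single = descend(1.5)

# Several discrete starts let the search find the better one.
best = hybrid_search([-1.5, 0.5, 1.5])
```

With these hypothetical settings, `single` ends in the basin at positive `x`, while `best` lands in the lower basin at negative `x`, which is the behavior the abstract attributes to combining search with numerical optimization.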