16 research outputs found

    Performance engineering for HEVC transform and quantization kernel on GPUs

    The continuous growth of video traffic and video services, especially high-resolution and high-quality video content, places heavy demands on video coding and its implementations. The High Efficiency Video Coding (HEVC) standard doubles the compression efficiency of its predecessor, H.264/AVC, at the cost of high computational complexity. To address this computational load, high-performance video processing takes advantage of heterogeneous multiprocessor platforms. In this paper, we present a highly performance-optimized HEVC transform and quantization kernel with all-zero-block (AZB) identification, designed for execution on a Graphics Processing Unit (GPU). The performance optimization strategy covers all three aspects of parallel design: exposing as much of the application’s intrinsic parallelism as possible, exploiting high-throughput memory, and using instructions efficiently. It combines an efficient mapping of transform blocks to thread blocks with efficient vectorized access patterns to shared memory for all transform sizes supported in the standard. Two different GPUs of the same architecture were used to evaluate the proposed implementation. The achieved processing times are 6.03 and 23.94 ms for DCI 4K and 8K Full Format, respectively. Speedup factors compared to CPU, cuBLAS and AVX2 implementations are up to 80, 19 and 4 times, respectively. The proposed implementation outperforms previous work by a factor of 1.22
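To make the terms in this abstract concrete, here is a minimal CPU-side sketch of a 4x4 HEVC-style forward transform, scalar quantization, and an all-zero-block check. It is not the paper's GPU kernel. The function names are hypothetical, and the shift, scale and rounding constants follow the conventions of the HM reference encoder as I recall them; treat them as assumptions.

```python
# HEVC 4x4 core transform matrix (integer approximation of the DCT).
C4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]
# Forward quantiser scales indexed by QP % 6 (assumed, HM-style).
QUANT_SCALE = [26214, 23302, 20560, 18396, 16384, 14564]


def matmul_shift(a, b, shift):
    """Integer matrix product followed by a rounding right shift."""
    n = len(a)
    rnd = 1 << (shift - 1)
    return [[(sum(a[i][k] * b[k][j] for k in range(n)) + rnd) >> shift
             for j in range(n)] for i in range(n)]


def transform_quantize_4x4(residual, qp, bit_depth=8, intra=True):
    """Return (levels, is_azb) for one 4x4 residual block (bit_depth >= 8)."""
    log2_n = 2
    c4_t = [list(col) for col in zip(*C4)]
    # Two-stage separable transform: C * X * C^T with intermediate scaling.
    tmp = matmul_shift(C4, residual, log2_n + bit_depth - 9)
    coeff = matmul_shift(tmp, c4_t, log2_n + 6)
    # Scalar quantisation with a dead-zone rounding offset (assumed values).
    qbits = 14 + qp // 6 + (15 - bit_depth - log2_n)
    offset = (171 if intra else 85) << (qbits - 9)
    scale = QUANT_SCALE[qp % 6]
    levels = [[(1 if c >= 0 else -1) * ((abs(c) * scale + offset) >> qbits)
               for c in row] for row in coeff]
    # An all-zero block has no non-zero quantised level left to code.
    is_azb = all(v == 0 for row in levels for v in row)
    return levels, is_azb
```

In this sketch, a flat residual of 1 at QP 22 quantises to all zeros and is flagged as an AZB, while a flat residual of 100 keeps a non-zero DC level. A GPU version would assign many such blocks to thread blocks and stage them in shared memory, as the abstract describes.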

    Rotate Intra Block Copy for Still Image Coding

    This paper proposes a method called rotate intra block copy, which extends the intra block copy technique by making the block matching process invariant to rotation. HEVC intra prediction plus rotate intra block copy gives an average of 20% reduction in residual energy (i.e. prediction error) compared to HEVC intra prediction plus intra block copy. As the motion vector correlation in rotate intra block copy is different from the intra block copy, a new method of motion vector coding is presented. The impact of angular resolution on residual energy reduction is also evaluated. In a full codec pipeline, this reduction in residual energy translates into a coding gain in BD-rate of 3.4% over HEVC intra prediction plus intra block copy for both screen content and camera-captured grayscale images. Samsung (Firm). Global Research Outreach Progra
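The idea of rotation-aware block matching can be illustrated with a small sketch. It assumes only quarter-turn rotations and a sum-of-squared-differences cost; the paper evaluates angular resolution itself, so this restriction and all function names are illustrative assumptions, not the authors' method.

```python
def rotate90(block, times):
    """Rotate a square block clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        block = [list(row) for row in zip(*block[::-1])]
    return block


def ssd(a, b):
    """Sum of squared differences, i.e. the residual energy of a prediction."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_rotated_match(image, target, size, candidates):
    """Return (cost, (row, col), quarter_turns) of the lowest-energy predictor.

    `candidates` lists top-left positions of reference blocks; a real codec
    would restrict them to the already reconstructed area of the picture.
    """
    best = None
    for (r, c) in candidates:
        ref = [row[c:c + size] for row in image[r:r + size]]
        for k in range(4):
            cost = ssd(target, rotate90(ref, k))
            if best is None or cost < best[0]:
                best = (cost, (r, c), k)
    return best
```

Plain intra block copy corresponds to allowing only `quarter_turns == 0`; adding the rotation index enlarges the search and gives the encoder one more parameter to signal alongside the block vector.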

    Frequency-dependent perceptual quantisation for visually lossless compression applications

    The default quantisation algorithms in the state-of-the-art High Efficiency Video Coding (HEVC) standard, namely Uniform Reconstruction Quantisation (URQ) and Rate-Distortion Optimised Quantisation (RDOQ), do not take into account the perceptual relevance of individual transform coefficients. In this paper, a Frequency-Dependent Perceptual Quantisation (FDPQ) technique for HEVC is proposed. FDPQ exploits the well-established Modulation Transfer Function (MTF) characteristics of the linear transformation basis functions by taking into account the Euclidean distance of an AC transform coefficient from the DC coefficient. As such, in luma and chroma Cb and Cr Transform Blocks (TBs), FDPQ quantises more coarsely the least perceptually relevant transform coefficients (i.e., the high frequency AC coefficients). Conversely, FDPQ preserves the integrity of the DC coefficient and the very low frequency AC coefficients. Compared with RDOQ, which is the most widely used transform coefficient-level quantisation technique in video coding, FDPQ successfully achieves bitrate reductions of up to 41%. Furthermore, the subjective evaluations confirm that the FDPQ-coded video data is perceptually indistinguishable (i.e., visually lossless) from the raw video data for a given Quantisation Parameter (QP)
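As a rough illustration of frequency-dependent quantisation, the sketch below makes the quantisation step grow with the Euclidean distance of a coefficient from the DC position. The linear weighting, the `alpha` parameter and the function names are assumptions chosen for illustration; the abstract does not give the actual FDPQ weighting.

```python
import math


def fdpq_step_matrix(n, base_step, alpha=1.0):
    """Per-coefficient step sizes for an n x n transform block.

    The step at position (v, u) is scaled by the normalised Euclidean
    distance from the DC coefficient at (0, 0); alpha sets the strength.
    """
    d_max = math.hypot(n - 1, n - 1)
    return [[base_step * (1.0 + alpha * math.hypot(u, v) / d_max)
             for u in range(n)] for v in range(n)]


def quantise(coeffs, steps):
    """Uniform quantisation with rounding half away from zero."""
    return [[int(c / s + (0.5 if c >= 0 else -0.5))
             for c, s in zip(row_c, row_s)]
            for row_c, row_s in zip(coeffs, steps)]
```

With `alpha=1.0`, the DC coefficient keeps the base step while the highest-frequency coefficient is quantised with twice that step, so the same coefficient magnitude yields a smaller level at high frequencies.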

    Visually lossless coding in HEVC : a high bit depth and 4:4:4 capable JND-based perceptual quantisation technique for HEVC

    Due to the increasing prevalence of high bit depth and YCbCr 4:4:4 video data, it is desirable to develop a JND-based visually lossless coding technique which can account for high bit depth 4:4:4 data in addition to standard 8-bit precision chroma subsampled data. In this paper, we propose a Coding Block (CB)-level JND-based luma and chroma perceptual quantisation technique for HEVC named Pixel-PAQ. Pixel-PAQ exploits both luminance masking and chrominance masking to achieve JND-based visually lossless coding; the proposed method is compatible with high bit depth YCbCr 4:4:4 video data of any resolution. When applied to YCbCr 4:4:4 high bit depth video data, Pixel-PAQ can achieve vast bitrate reductions – of up to 75% (68.6% over four QP data points) – compared with a state-of-the-art luma-based JND method for HEVC named IDSQ. Moreover, the participants in the subjective evaluations confirm that visually lossless coding is successfully achieved by Pixel-PAQ (at a PSNR value of 28.04 dB in one test)

    Spectral-PQ : a novel spectral sensitivity-orientated perceptual compression technique for RGB 4:4:4 video data

    There exists an intrinsic relationship between the spectral sensitivity of the Human Visual System (HVS) and color perception; these intertwined phenomena are often overlooked in perceptual compression research. In general, most previously proposed visually lossless compression techniques exploit luminance (luma) masking including luma spatiotemporal masking, luma contrast masking and luma texture/edge masking. The perceptual relevance of color in a picture is often overlooked, which constitutes a gap in the literature. With regard to the spectral sensitivity phenomenon of the HVS, the color channels of raw RGB 4:4:4 data contain significant color-based psychovisual redundancies. These perceptual redundancies can be quantized via color channel-level perceptual quantization. In this paper, we propose a novel spatiotemporal visually lossless coding method named Spectral Perceptual Quantization (Spectral-PQ). With application for RGB 4:4:4 video data, Spectral-PQ exploits HVS spectral sensitivity-related color masking in addition to spatial masking and temporal masking; the proposed method operates at the Coding Block (CB) level and the Prediction Unit (PU) level in the HEVC standard. Spectral-PQ perceptually adjusts the Quantization Step Size (QStep) at the CB level if high variance spatial data in G, B and R CBs is detected and also if high motion vector magnitudes in PUs are detected. Compared with anchor 1 (HEVC HM 16.17 RExt), Spectral-PQ considerably reduces bitrates with a maximum reduction of approximately 81%. The Mean Opinion Score (MOS) results in the subjective evaluations show that Spectral-PQ successfully achieves perceptually lossless quality

    Katseenseurannan sovellukset mielenkiintoisen alueen HEVC-pakkaukselle (Eye tracking applications for region-of-interest HEVC compression)

    The increase in video streaming services and video resolutions has caused the volume of Internet video traffic to explode. New video coding standards, such as High Efficiency Video Coding (HEVC), have been developed to mitigate this inevitable video data explosion with better compression. The aim of video coding is to reduce the video size while maintaining the best possible perceived quality. Region of Interest (ROI) encoding particularly addresses this objective by focusing on the areas that humans would pay the most attention to and encoding them with higher quality than the non-ROI areas. Methods for finding the ROI, and video encoding in general, take advantage of the Human Visual System (HVS). Computational HVS models can be used for ROI detection, but all current state-of-the-art models are designed for still images. Eye tracking data can be used for creating and verifying these models, including models suitable for video, which in turn calls for a reliable way to collect eye tracking data. Eye tracking glasses allow the widest range of possible scenarios out of all eye tracking equipment. Therefore, the glasses are used in this work to collect eye tracking data from 41 different videos. The main contribution of this work is to present a real-time system using eye tracking data to enhance the perceived quality of the video. The proposed system makes use of video recorded from the scene camera of the eye tracking glasses and the Kvazaar open-source HEVC encoder for video compression. The system was shown to provide better subjective quality than the native rate control algorithm of Kvazaar. The obtained results were evaluated with Eye tracking Weighted PSNR (EWPSNR), which represents the HVS better than traditional PSNR. The system is shown to achieve up to 33% bit rate reduction for the same EWPSNR and on average 5-10% reduction depending on the parameter set. Additionally, the encoding time is improved by 8-20%
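A gaze-weighted PSNR of the kind mentioned above can be sketched as follows. This version assumes a single gaze point per frame and an isotropic Gaussian weight around it; the exact EWPSNR definition used in the thesis may differ, and the function name and parameters are illustrative.

```python
import math


def ewpsnr(ref, dist, gaze, sigma, peak=255):
    """PSNR computed from a gaze-weighted mean squared error.

    ref, dist: 2-D lists of luma samples with identical dimensions.
    gaze: (row, col) of the fixation point; sigma: Gaussian spread in pixels.
    """
    gy, gx = gaze
    num = den = 0.0
    for y, (row_r, row_d) in enumerate(zip(ref, dist)):
        for x, (a, b) in enumerate(zip(row_r, row_d)):
            # Errors close to the gaze point count more than distant ones.
            w = math.exp(-((x - gx) ** 2 + (y - gy) ** 2) / (2.0 * sigma ** 2))
            num += w * (a - b) ** 2
            den += w
    wmse = num / den
    return math.inf if wmse == 0 else 10.0 * math.log10(peak ** 2 / wmse)
```

Under this weighting, the same pixel error lowers the score far more when it falls at the fixation point than when it falls in the periphery, which is the property that makes such a metric suitable for evaluating ROI-based encoding.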

    Towards visualization and searching: a dual-purpose video coding approach

    In modern video applications, the role of the decoded video is much more than filling a screen for visualization. To offer powerful video-enabled applications, it is increasingly critical not only to visualize the decoded video but also to provide efficient searching capabilities for similar content. Video surveillance and personal communication applications are critical examples of these dual visualization and searching requirements. However, current video coding solutions are strongly biased towards the visualization needs. In this context, the goal of this work is to propose a dual-purpose video coding solution targeting both visualization and searching needs by adopting a hybrid coding framework where the usual pixel-based coding approach is combined with a novel feature-based coding approach. In this novel dual-purpose video coding solution, some frames are coded using a set of keypoint matches, which not only allow decoding for visualization, but also provide the decoder with valuable feature-related information, extracted at the encoder from the original frames, instrumental for efficient searching. The proposed solution is based on a flexible joint Lagrangian optimization framework where pixel-based and feature-based processing are combined to find the most appropriate trade-off between the visualization and searching performances. Extensive experimental results for the assessment of the proposed dual-purpose video coding solution under meaningful test conditions are presented. The results show the flexibility of the proposed coding solution to achieve different optimization trade-offs, notably competitive performance regarding the state-of-the-art HEVC standard both in terms of visualization and searching performance.