30 research outputs found

    Visually lossless coding in HEVC : a high bit depth and 4:4:4 capable JND-based perceptual quantisation technique for HEVC

    Get PDF
    Due to the increasing prevalence of high bit depth and YCbCr 4:4:4 video data, it is desirable to develop a JND-based visually lossless coding technique which can account for high bit depth 4:4:4 data in addition to standard 8-bit precision chroma subsampled data. In this paper, we propose a Coding Block (CB)-level JND-based luma and chroma perceptual quantisation technique for HEVC named Pixel-PAQ. Pixel-PAQ exploits both luminance masking and chrominance masking to achieve JND-based visually lossless coding; the proposed method is compatible with high bit depth YCbCr 4:4:4 video data of any resolution. When applied to YCbCr 4:4:4 high bit depth video data, Pixel-PAQ can achieve vast bitrate reductions – of up to 75% (68.6% over four QP data points) – compared with a state-of-the-art luma-based JND method for HEVC named IDSQ. Moreover, the participants in the subjective evaluations confirm that visually lossless coding is successfully achieved by Pixel-PAQ (at a PSNR value of 28.04 dB in one test)

    Low complexity in-loop perceptual video coding

    Get PDF
    The tradition of broadcast video is today complemented with user generated content, as portable devices support video coding. Similarly, computing is becoming ubiquitous, where Internet of Things (IoT) incorporate heterogeneous networks to communicate with personal and/or infrastructure devices. Irrespective, the emphasises is on bandwidth and processor efficiencies, meaning increasing the signalling options in video encoding. Consequently, assessment for pixel differences applies uniform cost to be processor efficient, in contrast the Human Visual System (HVS) has non-uniform sensitivity based upon lighting, edges and textures. Existing perceptual assessments, are natively incompatible and processor demanding, making perceptual video coding (PVC) unsuitable for these environments. This research allows existing perceptual assessment at the native level using low complexity techniques, before producing new pixel-base image quality assessments (IQAs). To manage these IQAs a framework was developed and implemented in the high efficiency video coding (HEVC) encoder. This resulted in bit-redistribution, where greater bits and smaller partitioning were allocated to perceptually significant regions. Using a HEVC optimised processor the timing increase was < +4% and < +6% for video streaming and recording applications respectively, 1/3 of an existing low complexity PVC solution. Future work should be directed towards perceptual quantisation which offers the potential for perceptual coding gain

    Video compression algorithms for HEVC and beyond

    Get PDF
    PhDDue to the increasing number of new services and devices that allow the creation, distribution and consumption of video content, the amount of video information being transmitted all over the world is constantly growing. Video compression technology is essential to cope with the ever increasing volume of digital video data being distributed in today's networks, as more e cient video compression techniques allow support for higher volumes of video data under the same memory/bandwidth constraints. This is especially relevant with the introduction of new and more immersive video formats associated with signi cantly higher amounts of data. In this thesis, novel techniques for improving the e ciency of current and future video coding technologies are investigated. Several aspects that in uence the way conventional video coding methods work are considered. In particular, the properties and limitations of the Human Visual System are exploited to tune the performance of video encoders towards better subjective quality. Additionally, it is shown how the visibility of speci c types of visual artefacts can be prevented during the video encoding process, in order to avoid subjective quality degradations in the compressed content. Techniques for higher video compression e ciency are also explored, targeting to improve the compression capabilities of state-of-the-art video coding standards. Finally, the application of video coding technologies to practical use-cases is considered. Accurate estimation models are devised to control the encoding time and bit rate associated with compressed video signals, in order to meet speci c encoding time and transmission time restrictions

    Saliency-Enabled Coding Unit Partitioning and Quantization Control for Versatile Video Coding

    Get PDF
    The latest video coding standard, versatile video coding (VVC), has greatly improved coding efficiency over its predecessor standard high efficiency video coding (HEVC), but at the expense of sharply increased complexity. In the context of perceptual video coding (PVC), the visual saliency model that utilizes the characteristics of the human visual system to improve coding efficiency has become a reliable method due to advances in computer performance and visual algorithms. In this paper, a novel VVC optimization scheme compliant PVC framework is proposed, which consists of fast coding unit (CU) partition algorithm and quantization control algorithm. Firstly, based on the visual saliency model, we proposed a fast CU division scheme, including the redetermination of the CU division depth by calculating Scharr operator and variance, as well as the executive decision for intra sub-partitions (ISP), to reduce the coding complexity. Secondly, a quantization control algorithm is proposed by adjusting the quantization parameter based on multi-level classification of saliency values at the CU level to reduce the bitrate. In comparison with the reference model, experimental results indicate that the proposed method can reduce about 47.19% computational complexity and achieve a bitrate saving of 3.68% on average. Meanwhile, the proposed algorithm has reasonable peak signal-to-noise ratio losses and nearly the same subjective perceptual quality

    Analysis of the perceptual quality performance of different HEVC coding tools

    Get PDF
    Each new video encoding standard includes encoding techniques that aim to improve the performance and quality of the previous standards. During the development of these techniques, PSNR was used as the main distortion metric. However, the PSNR metric does not consider the subjectivity of the human visual system, so that the performance of some coding tools is questionable from the perceptual point of view. To further explore this point, we have developed a detailed study about the perceptual sensibility of different HEVC video coding tools. In order to perform this study, we used some popular objective quality assessment metrics to measure the perceptual response of every single coding tool. The conclusion of this work will help to determine the set of HEVC coding tools that provides, in general, the best perceptual response

    Non-disruptive use of light fields in image and video processing

    Get PDF
    In the age of computational imaging, cameras capture not only an image but also data. This captured additional data can be best used for photo-realistic renderings facilitating numerous post-processing possibilities such as perspective shift, depth scaling, digital refocus, 3D reconstruction, and much more. In computational photography, the light field imaging technology captures the complete volumetric information of a scene. This technology has the highest potential to accelerate immersive experiences towards close-toreality. It has gained significance in both commercial and research domains. However, due to lack of coding and storage formats and also the incompatibility of the tools to process and enable the data, light fields are not exploited to its full potential. This dissertation approaches the integration of light field data to image and video processing. Towards this goal, the representation of light fields using advanced file formats designed for 2D image assemblies to facilitate asset re-usability and interoperability between applications and devices is addressed. The novel 5D light field acquisition and the on-going research on coding frameworks are presented. Multiple techniques for optimised sequencing of light field data are also proposed. As light fields contain complete 3D information of a scene, large amounts of data is captured and is highly redundant in nature. Hence, by pre-processing the data using the proposed approaches, excellent coding performance can be achieved.Im Zeitalter der computergestützten Bildgebung erfassen Kameras nicht mehr nur ein Bild, sondern vielmehr auch Daten. Diese erfassten Zusatzdaten lassen sich optimal für fotorealistische Renderings nutzen und erlauben zahlreiche Nachbearbeitungsmöglichkeiten, wie Perspektivwechsel, Tiefenskalierung, digitale Nachfokussierung, 3D-Rekonstruktion und vieles mehr. In der computergestützten Fotografie erfasst die Lichtfeld-Abbildungstechnologie die vollständige volumetrische Information einer Szene. Diese Technologie bietet dabei das größte Potenzial, immersive Erlebnisse zu mehr Realitätsnähe zu beschleunigen. Deshalb gewinnt sie sowohl im kommerziellen Sektor als auch im Forschungsbereich zunehmend an Bedeutung. Aufgrund fehlender Kompressions- und Speicherformate sowie der Inkompatibilität derWerkzeuge zur Verarbeitung und Freigabe der Daten, wird das Potenzial der Lichtfelder nicht voll ausgeschöpft. Diese Dissertation ermöglicht die Integration von Lichtfelddaten in die Bild- und Videoverarbeitung. Hierzu wird die Darstellung von Lichtfeldern mit Hilfe von fortschrittlichen für 2D-Bilder entwickelten Dateiformaten erarbeitet, um die Wiederverwendbarkeit von Assets- Dateien und die Kompatibilität zwischen Anwendungen und Geräten zu erleichtern. Die neuartige 5D-Lichtfeldaufnahme und die aktuelle Forschung an Kompressions-Rahmenbedingungen werden vorgestellt. Es werden zudem verschiedene Techniken für eine optimierte Sequenzierung von Lichtfelddaten vorgeschlagen. Da Lichtfelder die vollständige 3D-Information einer Szene beinhalten, wird eine große Menge an Daten, die in hohem Maße redundant sind, erfasst. Die hier vorgeschlagenen Ansätze zur Datenvorverarbeitung erreichen dabei eine ausgezeichnete Komprimierleistung

    Bitrate Ladder Prediction Methods for Adaptive Video Streaming: A Review and Benchmark

    Full text link
    HTTP adaptive streaming (HAS) has emerged as a widely adopted approach for over-the-top (OTT) video streaming services, due to its ability to deliver a seamless streaming experience. A key component of HAS is the bitrate ladder, which provides the encoding parameters (e.g., bitrate-resolution pairs) to encode the source video. The representations in the bitrate ladder allow the client's player to dynamically adjust the quality of the video stream based on network conditions by selecting the most appropriate representation from the bitrate ladder. The most straightforward and lowest complexity approach involves using a fixed bitrate ladder for all videos, consisting of pre-determined bitrate-resolution pairs known as one-size-fits-all. Conversely, the most reliable technique relies on intensively encoding all resolutions over a wide range of bitrates to build the convex hull, thereby optimizing the bitrate ladder for each specific video. Several techniques have been proposed to predict content-based ladders without performing a costly exhaustive search encoding. This paper provides a comprehensive review of various methods, including both conventional and learning-based approaches. Furthermore, we conduct a benchmark study focusing exclusively on various learning-based approaches for predicting content-optimized bitrate ladders across multiple codec settings. The considered methods are evaluated on our proposed large-scale dataset, which includes 300 UHD video shots encoded with software and hardware encoders using three state-of-the-art encoders, including AVC/H.264, HEVC/H.265, and VVC/H.266, at various bitrate points. Our analysis provides baseline methods and insights, which will be valuable for future research in the field of bitrate ladder prediction. The source code of the proposed benchmark and the dataset will be made publicly available upon acceptance of the paper

    Frequency-dependent perceptual quantisation for visually lossless compression applications

    Get PDF
    The default quantisation algorithms in the state-of-the-art High Efficiency Video Coding (HEVC) standard, namely Uniform Reconstruction Quantisation (URQ) and Rate-Distortion Optimised Quantisation (RDOQ), do not take into account the perceptual relevance of individual transform coefficients. In this paper, a Frequency-Dependent Perceptual Quantisation (FDPQ) technique for HEVC is proposed. FDPQ exploits the well-established Modulation Transfer Function (MTF) characteristics of the linear transformation basis functions by taking into account the Euclidean distance of an AC transform coefficient from the DC coefficient. As such, in luma and chroma Cb and Cr Transform Blocks (TBs), FDPQ quantises more coarsely the least perceptually relevant transform coefficients (i.e., the high frequency AC coefficients). Conversely, FDPQ preserves the integrity of the DC coefficient and the very low frequency AC coefficients. Compared with RDOQ, which is the most widely used transform coefficient-level quantisation technique in video coding, FDPQ successfully achieves bitrate reductions of up to 41%. Furthermore, the subjective evaluations confirm that the FDPQ-coded video data is perceptually indistinguishable (i.e., visually lossless) from the raw video data for a given Quantisation Parameter (QP)
    corecore