27 research outputs found

    HDR-ChipQA: No-Reference Quality Assessment on High Dynamic Range Videos

    Full text link
    We present a no-reference video quality model and algorithm that delivers standout performance on High Dynamic Range (HDR) videos, which we call HDR-ChipQA. HDR videos represent wider ranges of luminances, details, and colors than Standard Dynamic Range (SDR) videos. The growing adoption of HDR in massively scaled video networks has driven the need for video quality assessment (VQA) algorithms that better account for distortions on HDR content. In particular, standard VQA models may fail to capture conspicuous distortions at the extreme ends of the dynamic range, because the features that drive them may be dominated by distortions that pervade the mid-ranges of the signal. We introduce a new approach whereby a local expansive nonlinearity emphasizes distortions occurring at the higher and lower ends of the local luma range, allowing for the definition of additional quality-aware features that are computed along a separate path. These features are not HDR-specific, and also improve VQA on SDR video content, albeit to a lesser degree. We show that this preprocessing step significantly boosts the power of distortion-sensitive natural video statistics (NVS) features when used to predict the quality of HDR content. In a similar manner, we separately compute novel wide-gamut color features using the same nonlinear processing steps. We have found that our model significantly outperforms SDR VQA algorithms on the only publicly available, comprehensive HDR database, while also attaining state-of-the-art performance on SDR content.
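    The local expansive nonlinearity described above can be sketched roughly as follows: rescale each luma patch to a local [-1, 1] range, then apply an expansive mapping so that values near the extremes are magnified relative to the mid-range. The exponential form and the gain constant `delta` here are illustrative assumptions, not the published HDR-ChipQA parameters.

```python
import numpy as np

def expansive_nonlinearity(luma_patch, delta=4.0):
    """Sketch of a local expansive nonlinearity.

    Rescales a luma patch to [-1, 1] using its local min/max, then applies
    an odd exponential mapping normalized so the extremes map to +/-1.
    Mid-range values are compressed toward 0, so distortions near the
    brightest and darkest ends dominate the output statistics.
    """
    p = luma_patch.astype(np.float64)
    lo, hi = p.min(), p.max()
    if hi == lo:
        return np.zeros_like(p)          # flat patch: nothing to emphasize
    x = 2.0 * (p - lo) / (hi - lo) - 1.0  # local rescale to [-1, 1]
    return np.sign(x) * (np.exp(delta * np.abs(x)) - 1.0) / (np.exp(delta) - 1.0)
```

    For example, with `delta=4` a mid-range value at x = 0.5 maps to about 0.12, while the endpoints map exactly to +/-1, which is the sense in which the extremes are emphasized.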

    Making Video Quality Assessment Models Robust to Bit Depth

    Full text link
    We introduce a novel feature set, which we call HDRMAX features, that, when included in Video Quality Assessment (VQA) algorithms designed for Standard Dynamic Range (SDR) videos, sensitizes them to distortions of High Dynamic Range (HDR) videos that are inadequately accounted for by these algorithms. While these features are not specific to HDR, and also augment the quality prediction performance of VQA models on SDR content, they are especially effective on HDR. HDRMAX features modify powerful priors drawn from Natural Video Statistics (NVS) models by enhancing their measurability where they visually impact the brightest and darkest local portions of videos, thereby capturing distortions that are often poorly accounted for by existing VQA models. As a demonstration of the efficacy of our approach, we show that, while current state-of-the-art VQA models perform poorly on 10-bit HDR databases, their performances are greatly improved by the inclusion of HDRMAX features when tested on HDR and 10-bit distorted videos. Comment: Published in IEEE Signal Processing Letters 202

    HDR or SDR? A Subjective and Objective Study of Scaled and Compressed Videos

    Full text link
    We conducted a large-scale study of human perceptual quality judgments of High Dynamic Range (HDR) and Standard Dynamic Range (SDR) videos subjected to various scaling and compression levels and viewed on three different display devices. HDR videos are able to present wider color gamuts, better contrasts, and brighter whites and darker blacks than SDR videos. While conventional expectations are that HDR quality is better than SDR quality, we have found that subject preference for HDR versus SDR depends heavily on the display device, as well as on resolution scaling and bitrate. To study this question, we collected more than 23,000 quality ratings from 67 volunteers who watched 356 videos on OLED, QLED, and LCD televisions. Since it is of interest to be able to measure the quality of videos under these scenarios, e.g., to inform decisions regarding scaling, compression, and SDR vs. HDR, we tested several well-known full-reference and no-reference video quality models on the new database. Towards advancing progress on this problem, we also developed a novel no-reference model called HDRPatchMAX, which exploits both classical and bit-depth-sensitive distortion statistics more effectively than existing metrics.

    Networking for Immersive Telepresence: Architectures and Protocols - A Case Study

    No full text
    Immersive telepresence allows an observer to view a remote scene from any viewpoint of choice and thus gives a sense of presence, as opposed to conventional video viewing where the user can only view content from the viewpoint of a single camera. We specifically consider a depth-based rendering approach for enabling telepresence. In this paper, we explore various networking architectures in which telepresence can be enabled over the Internet using this approach. From these architectures, we derive a set of requirements for describing and conducting a telepresence session. Based on these requirements, we present a session protocol that draws upon the features of RTSP and SDP and extends them. Various media streams from a single camera viewpoint are aggregated as a view group, and the concept of this view group is used to support the various architectures. Both unicast and multicast configurations, with central or distributed servers, are supported by this protocol. Finally, we present the overall end-to-end architecture used in our current implementation.

    The Video Z-buffer: A Concept for Facilitating Monoscopic Image Compression by Exploiting the 3-D Stereoscopic Depth Map

    No full text
    Compression can be achieved by exploiting knowledge both internal and external to a given image or video source. In this paper, we present means for generating and exploiting the specific external knowledge of a 3D stereoscopic depth map of the given scene to compress the given monoscopic source. Several instances in which the depth map can potentially increase compression or provide improved functionality are presented to motivate further work along this line of reasoning. 1. INTRODUCTION The primary goal of any image (2D) or video (2D+t) compression algorithm is to generate a representation of the source that is smaller than the source's raw bitmap. The ability to compactly represent the source implies knowledge about the source content that is in some sense deeper than the raw pixel intensity arrays. In this paper we distinguish between internal and external knowledge about the source. Both of them facilitate description, and thus enable compression. Internal knowledge is kn..
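    As a rough illustration of how external depth knowledge could guide a monoscopic codec, the sketch below splits a depth map into near/far layers, which a codec might then use, for example, to assign coarser quantization to distant regions. The median-threshold policy and the two-layer split are assumptions for illustration, not the paper's method.

```python
import numpy as np

def depth_layer_masks(depth, threshold=None):
    """Partition a scene into foreground/background masks from a depth map.

    Smaller depth values are treated as nearer to the camera. If no
    threshold is given, the median depth is used, splitting the scene
    into two roughly balanced layers. Returns (near_mask, far_mask),
    which always partition the image.
    """
    d = depth.astype(np.float64)
    if threshold is None:
        threshold = np.median(d)        # assumed policy: median split
    near = d <= threshold               # closer objects: smaller depth
    return near, ~near
```

    A codec could then, for instance, spend more bits on blocks where the `near` mask dominates and fewer on purely `far` blocks.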

    Multiresolution Based Hierarchical Disparity Estimation for Stereo Image Pair Compression

    No full text
    In this paper a multiresolution-based approach is proposed for compressing 'still' stereo image pairs. In Section II the task at hand is contrasted with the stereo disparity estimation problem in the machine vision community; a block-based scheme along the lines of a motion estimation scheme is suggested as a possible approach. In Section III, the suitability of hierarchical techniques for disparity estimation is outlined. Section IV provides an overview of wavelet decomposition. Section V details the multiresolution approach taken. In Section VI, the typical computational gains and compression ratios possible with this scheme are computed. Subjective and objective evaluations of several different compressed stereo image pairs highlight the efficacy of the proposed compression scheme. Possible extensions of this approach to stereo image sequence compression are discussed in the last section.
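    Hierarchical, block-based disparity estimation of the kind outlined above can be sketched as a two-level coarse-to-fine search: estimate disparity on a decimated image pair, then refine each block at full resolution around twice the coarse estimate. The block size, search ranges, and sum-of-absolute-differences (SAD) cost below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(a.astype(int) - b.astype(int)).sum()

def block_search(left, right, y, x, block, d_lo, d_hi):
    """Find the horizontal shift d in [d_lo, d_hi] minimizing the SAD
    between the left block at (y, x) and the right block at (y, x - d)."""
    ref = left[y:y+block, x:x+block]
    best_d, best_cost = d_lo, None
    for d in range(d_lo, d_hi + 1):
        if x - d < 0 or x - d + block > right.shape[1]:
            continue                     # candidate block out of bounds
        cost = sad(ref, right[y:y+block, x-d:x-d+block])
        if best_cost is None or cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def hierarchical_disparity(left, right, block=8, coarse_range=8, refine=1):
    """Two-level coarse-to-fine block disparity estimation.

    A full search over a halved range is done on a 2x-decimated pair;
    at full resolution each block is only searched within +/-refine of
    twice its coarse estimate, which is the computational saving that
    motivates the hierarchical approach.
    """
    cl, cr = left[::2, ::2], right[::2, ::2]   # half-resolution pair
    h, w = left.shape
    disp = np.zeros((h // block, w // block), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            cy, cx = (by * block) // 2, (bx * block) // 2
            d0 = block_search(cl, cr, cy, cx, block // 2, 0, coarse_range // 2)
            disp[by, bx] = block_search(left, right, by * block, bx * block,
                                        block, max(0, 2 * d0 - refine),
                                        2 * d0 + refine)
    return disp
```

    The saving is the usual pyramid argument: the coarse pass covers the full range on a quarter of the pixels, and the fine pass examines only a constant number of candidates per block instead of the full range.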

    Segmentation Based Coding of Stereoscopic Image Sequences

    No full text
    A binocular disparity based segmentation scheme to compactly represent one image of a stereoscopic image pair given the other image was proposed earlier by us. That scheme adapted the excess bit count, needed to code the additional image, to the binocular disparity detail present in the image pair. This paper addresses the issue of extending such a segmentation in the temporal dimension to achieve efficient stereoscopic sequence compression. The easiest conceivable temporal extension would be to code one of the sequences using an MPEG-type scheme while the frames of the other stream are coded based on the segmentation. However, such independent compression of one of the streams fails to take advantage of the segmentation or the additional disparity information available. To achieve better compression by exploiting this additional information, we propose the following scheme. Each frame in one of the streams is segmented based on disparity. An MPEG-type frame structure is used for motion..