676 research outputs found
Perceptual quality of 4K-resolution video content compared to HD
With the introduction of 4K UHD video and display resolution, questions arise on the perceptual differences between 4K UHD and upsampled HD video content. In this paper, a striped pair comparison has been performed on a diverse set of 4K UHD video sources. The goal was to subjectively assess the perceived sharpness of 4K UHD and downscaled/upscaled HD video. A striped pair comparison has been applied in order to make the test as straightforward as possible for a non-expert participant population. Under these conditions and over this set of sequences, on average, on 54.8% of the sequences (17 out of 31), 4K UHD resolution content could be identified as being sharper compared to its HD down and upsampled alternative. The probabilities in which 4K UHD could be differentiated from downscaled/upscaled HD range from 83.3% for the easiest to assess sequence down to 39.7% for the most difficult sequence. Although significance tests demonstrate there is a positive sharpness difference from camera quality 4K UHD content compared to the HD downscaled/upscaled variations, it is very content dependent and all circumstances have been chosen in favor of the 4K UHD representation. The results of this test can contribute to the research process of developing metrics indicating visibility of high resolution features within specific content
Adaptive Quantization Matrices for HD and UHD Display Resolutions in Scalable HEVC
HEVC contains an option to enable custom quantization matrices, which are
designed based on the Human Visual System and a 2D Contrast Sensitivity
Function. Visual Display Units, capable of displaying video data at High
Definition and Ultra HD display resolutions, are frequently utilized on a
global scale. Video compression artifacts that are present due to high levels
of quantization, which are typically inconspicuous in low display resolution
environments, are clearly visible on HD and UHD video data and VDUs. The
default QM technique in HEVC does not take into account the video data
resolution, nor does it take into consideration the associated display
resolution of a VDU to determine the appropriate levels of quantization
required to reduce unwanted video compression artifacts. Based on this fact, we
propose a novel, adaptive quantization matrix technique for the HEVC standard,
including Scalable HEVC. Our technique, which is based on a refinement of the
current HVS-CSF QM approach in HEVC, takes into consideration the display
resolution of the target VDU for the purpose of minimizing video compression
artifacts. In SHVC SHM 9.0, and compared with anchors, the proposed technique
yields important quality and coding improvements for the Random Access
configuration, with a maximum of 56.5% luma BD-Rate reductions in the
enhancement layer. Furthermore, compared with the default QMs and the Sony QMs,
our method yields encoding time reductions of 0.75% and 1.19%, respectively.Comment: Data Compression Conference 201
Quality of Experience in UHD-1 Phase 2 television: the contribution of UHD+ HFR technology
A key factor to determine the quality of experience (QoE) of a video is its capability to convey the large spectrum of perceptual phenomena that our eyes can sense in real life. In order to meet this demand, the recent DVB UHD-1 Phase 2 specification employs new video features, such as higher spatial resolutions (4K/8K) and High Frame Rate (HFR). The first enables larger field of view and level of details, while the second offers sharper images of moving objects going well beyond the current frame rates. While the contribution of each of these technologies to QoE has been investigated individually, in this paper we are interested to study their interaction, and in quantifying the benefits to users from their combination. To this end, we conduct a subjective test on compressed UHD+HFR content on a recent display capable of reproducing 100 pictures per second at 2160p resolution, with the goal to assess the increase in QoE of UHD and HFR with respect to conventional video, both individually and in combination. The results indicate that for content with fast motion, at higher bitrates the combination of UHD and HFR significantly improves the QoE compared to that obtained when these features are used individually
The quality of experience of emerging display technologies
As new display technologies emerge and become part of everyday life, the understanding of the visual experience they provide becomes more relevant. The cognition of perception is the most vital component of visual experience; however, it is not the only cognition that contributes to the complex overall experience of the end-user. Expectations can create significant cognitive bias that may even override what the user
genuinely perceives. Even if a visualization technology is somewhat novel, expectations can be fuelled by prior experiences gained from using similar displays and, more importantly, even a single word or an acronym may induce serious preconceptions, especially if such word suggests excellence in quality. In this interdisciplinary Ph.D. thesis, the effect of minimal, one-word labels on the Quality of Experience (QoE) is investigated in a series of subjective tests. In the studies carried out on an ultra-high-definition (UHD) display, UHD video contents
were directly compared to their HD counterparts, with and without labels explicitly informing the test participants about the resolution of each stimulus. The experiments on High Dynamic Range (HDR) visualization addressed the effect of the word “premium” on the quality aspects of HDR video, and also how this may affect the perceived duration of stalling events. In order to support the findings,
additional tests were carried out comparing the stalling detection thresholds of HDR video with conventional Low Dynamic Range (LDR) video. The third emerging technology addressed by this thesis is light field visualization. Due to its novel nature and the lack of comprehensive, exhaustive research on the QoE of light field displays and content parameters at the time of this thesis, instead
of investigating the labeling effect, four phases of subjective studies were performed on light field QoE. The first phases started with fundamental research, and the experiments progressed towards the concept and evaluation of the dynamic adaptive streaming of light field video, introduced in the final phase
PEA265: Perceptual Assessment of Video Compression Artifacts
The most widely used video encoders share a common hybrid coding framework
that includes block-based motion estimation/compensation and block-based
transform coding. Despite their high coding efficiency, the encoded videos
often exhibit visually annoying artifacts, denoted as Perceivable Encoding
Artifacts (PEAs), which significantly degrade the visual Qualityof- Experience
(QoE) of end users. To monitor and improve visual QoE, it is crucial to develop
subjective and objective measures that can identify and quantify various types
of PEAs. In this work, we make the first attempt to build a large-scale
subjectlabelled database composed of H.265/HEVC compressed videos containing
various PEAs. The database, namely the PEA265 database, includes 4 types of
spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types
of temporal PEAs (i.e. flickering and floating). Each containing at least
60,000 image or video patches with positive and negative labels. To objectively
identify these PEAs, we train Convolutional Neural Networks (CNNs) using the
PEA265 database. It appears that state-of-theart ResNeXt is capable of
identifying each type of PEAs with high accuracy. Furthermore, we define PEA
pattern and PEA intensity measures to quantify PEA levels of compressed video
sequence. We believe that the PEA265 database and our findings will benefit the
future development of video quality assessment methods and perceptually
motivated video encoders.Comment: 10 pages,15 figures,4 table
Data-driven visual quality estimation using machine learning
Heutzutage werden viele visuelle Inhalte erstellt und sind zugänglich, was auf Verbesserungen der Technologie wie Smartphones und das Internet zurückzuführen ist. Es ist daher notwendig, die von den Nutzern wahrgenommene Qualität zu bewerten, um das Erlebnis weiter zu verbessern. Allerdings sind nur wenige der aktuellen Qualitätsmodelle speziell für höhere Auflösungen konzipiert, sagen mehr als nur den Mean Opinion Score vorher oder nutzen maschinelles Lernen. Ein Ziel dieser Arbeit ist es, solche maschinellen Modelle für höhere Auflösungen mit verschiedenen Datensätzen zu trainieren und zu evaluieren. Als Erstes wird eine objektive Analyse der Bildqualität bei höheren Auflösungen durchgeführt. Die Bilder wurden mit Video-Encodern komprimiert, hierbei weist AV1 die beste Qualität und Kompression auf. Anschließend werden die Ergebnisse eines Crowd-Sourcing-Tests mit einem Labortest bezüglich Bildqualität verglichen. Weiterhin werden auf Deep Learning basierende Modelle für die Vorhersage von Bild- und Videoqualität beschrieben. Das auf Deep Learning basierende Modell ist aufgrund der benötigten Ressourcen für die Vorhersage der Videoqualität in der Praxis nicht anwendbar. Aus diesem Grund werden pixelbasierte Videoqualitätsmodelle vorgeschlagen und ausgewertet, die aussagekräftige Features verwenden, welche Bild- und Bewegungsaspekte abdecken. Diese Modelle können zur Vorhersage von Mean Opinion Scores für Videos oder sogar für anderer Werte im Zusammenhang mit der Videoqualität verwendet werden, wie z.B. einer Bewertungsverteilung. Die vorgestellte Modellarchitektur kann auf andere Videoprobleme angewandt werden, wie z.B. Videoklassifizierung, Vorhersage der Qualität von Spielevideos, Klassifikation von Spielegenres oder der Klassifikation von Kodierungsparametern. Ein wichtiger Aspekt ist auch die Verarbeitungszeit solcher Modelle. Daher wird ein allgemeiner Ansatz zur Beschleunigung von State-of-the-Art-Videoqualitätsmodellen vorgestellt, der zeigt, dass ein erheblicher Teil der Verarbeitungszeit eingespart werden kann, während eine ähnliche Vorhersagegenauigkeit erhalten bleibt. Die Modelle sind als Open Source veröffentlicht, so dass die entwickelten Frameworks für weitere Forschungsarbeiten genutzt werden können. Außerdem können die vorgestellten Ansätze als Bausteine für neuere Medienformate verwendet werden.Today a lot of visual content is accessible and produced, due to improvements in technology such as smartphones and the internet. This results in a need to assess the quality perceived by users to further improve the experience. However, only a few of the state-of-the-art quality models are specifically designed for higher resolutions, predict more than mean opinion score, or use machine learning. One goal of the thesis is to train and evaluate such machine learning models of higher resolutions with several datasets. At first, an objective evaluation of image quality in case of higher resolutions is performed. The images are compressed using video encoders, and it is shown that AV1 is best considering quality and compression. This evaluation is followed by the analysis of a crowdsourcing test in comparison with a lab test investigating image quality. Afterward, deep learning-based models for image quality prediction and an extension for video quality are proposed. However, the deep learning-based video quality model is not practically usable because of performance constrains. For this reason, pixel-based video quality models using well-motivated features covering image and motion aspects are proposed and evaluated. These models can be used to predict mean opinion scores for videos, or even to predict other video quality-related information, such as a rating distributions. The introduced model architecture can be applied to other video problems, such as video classification, gaming video quality prediction, gaming genre classification or encoding parameter estimation. Furthermore, one important aspect is the processing time of such models. Hence, a generic approach to speed up state-of-the-art video quality models is introduced, which shows that a significant amount of processing time can be saved, while achieving similar prediction accuracy. The models have been made publicly available as open source so that the developed frameworks can be used for further research. Moreover, the presented approaches may be usable as building blocks for newer media formats
- …