301 research outputs found
Study of saliency in objective video quality assessment
Reliably predicting video quality as perceived by humans remains challenging and is of high practical relevance. A significant research trend is to investigate visual saliency and its implications for video quality assessment. Fundamental problems regarding how to acquire reliable eye-tracking data for the purpose of video quality research and how saliency should be incorporated in objective video quality metrics (VQMs) are largely unsolved. In this paper, we propose a refined methodology for reliably collecting eye-tracking data, which essentially eliminates bias induced by each subject having to view multiple variations of the same scene in a conventional experiment. We performed a large-scale eye-tracking experiment that involved 160 human observers and 160 video stimuli distorted with different distortion types at various degradation levels. The measured saliency was integrated into several best known VQMs in the literature. With the assurance of the reliability of the saliency data, we thoroughly assessed the capabilities of saliency in improving the performance of VQMs, and devised a novel approach for optimal use of saliency in VQMs. We also evaluated to what extent the state-of-the-art computational saliency models can improve VQMs in comparison to the improvement achieved by using “ground truth” eye-tracking data. The eye-tracking database is made publicly available to the research community
Congestion Control for Network-Aware Telehaptic Communication
Telehaptic applications involve delay-sensitive multimedia communication
between remote locations with distinct Quality of Service (QoS) requirements
for different media components. These QoS constraints pose a variety of
challenges, especially when the communication occurs over a shared network,
with unknown and time-varying cross-traffic. In this work, we propose a
transport layer congestion control protocol for telehaptic applications
operating over shared networks, termed as dynamic packetization module (DPM).
DPM is a lossless, network-aware protocol which tunes the telehaptic
packetization rate based on the level of congestion in the network. To monitor
the network congestion, we devise a novel network feedback module, which
communicates the end-to-end delays encountered by the telehaptic packets to the
respective transmitters with negligible overhead. Via extensive simulations, we
show that DPM meets the QoS requirements of telehaptic applications over a wide
range of network cross-traffic conditions. We also report qualitative results
of a real-time telepottery experiment with several human subjects, which reveal
that DPM preserves the quality of telehaptic activity even under heavily
congested network scenarios. Finally, we compare the performance of DPM with
several previously proposed telehaptic communication protocols and demonstrate
that DPM outperforms these protocols.Comment: 25 pages, 19 figure
SpatioTemporal Feature Integration and Model Fusion for Full Reference Video Quality Assessment
Perceptual video quality assessment models are either frame-based or
video-based, i.e., they apply spatiotemporal filtering or motion estimation to
capture temporal video distortions. Despite their good performance on video
quality databases, video-based approaches are time-consuming and harder to
efficiently deploy. To balance between high performance and computational
efficiency, Netflix developed the Video Multi-method Assessment Fusion (VMAF)
framework, which integrates multiple quality-aware features to predict video
quality. Nevertheless, this fusion framework does not fully exploit temporal
video quality measurements which are relevant to temporal video distortions. To
this end, we propose two improvements to the VMAF framework: SpatioTemporal
VMAF and Ensemble VMAF. Both algorithms exploit efficient temporal video
features which are fed into a single or multiple regression models. To train
our models, we designed a large subjective database and evaluated the proposed
models against state-of-the-art approaches. The compared algorithms will be
made available as part of the open source package in
https://github.com/Netflix/vmaf
Video Packet Priority Assignment Based On Spatio-Temporal Perceptual Importance
A novel perceptually motivated two-stage algorithm for assigning priority to video packet data to
be transmitted over the internet is proposed. Priority assignment is based on temporal and spatial
features that are derived from low-level vision concepts. The motivation for a two-stage design is to
be able to handle different application settings. The first stage of the algorithm is computationally
very efficient and can be directly used in low-delay applications with limited computational resources.
The two-stage method performs exceedingly well across a variety of content and can be used in less
restrictive operating settings. The efficacy of the proposed algorithm (both stages) is demonstrated
using an intelligent packet drop application where it is compared with cumulative mean squared
error (cMSE) based priority assignment and random packet dropping. The proposed prioritization
algorithm allows for packet drops that result in significantly lower perceptual annoyance at the
receiver relative to the other methods considered
Scalable image quality assessment with 2D mel-cepstrum and machine learning approach
Cataloged from PDF version of article.Measurement of image quality is of fundamental importance to numerous image and video processing applications. Objective image quality assessment (IQA) is a two-stage process comprising of the following: (a) extraction of important information and discarding the redundant one, (b) pooling the detected features using appropriate weights. These two stages are not easy to tackle due to the complex nature of the human visual system (HVS). In this paper, we first investigate image features based on two-dimensional (20) mel-cepstrum for the purpose of IQA. It is shown that these features are effective since they can represent the structural information, which is crucial for IQA. Moreover, they are also beneficial in a reduced-reference scenario where only partial reference image information is used for quality assessment. We address the second issue by exploiting machine learning. In our opinion, the well established methodology of machine learning/pattern recognition has not been adequately used for IQA so far; we believe that it will be an effective tool for feature pooling since the required weights/parameters can be determined in a more convincing way via training with the ground truth obtained according to subjective scores. This helps to overcome the limitations of the existing pooling methods, which tend to be over simplistic and lack theoretical justification. Therefore, we propose a new metric by formulating IQA as a pattern recognition problem. Extensive experiments conducted using six publicly available image databases (totally 3211 images with diverse distortions) and one video database (with 78 video sequences) demonstrate the effectiveness and efficiency of the proposed metric, in comparison with seven relevant existing metrics. (C) 2011 Elsevier Ltd. All rights reserved
Texture Structure Analysis
abstract: Texture analysis plays an important role in applications like automated pattern inspection, image and video compression, content-based image retrieval, remote-sensing, medical imaging and document processing, to name a few. Texture Structure Analysis is the process of studying the structure present in the textures. This structure can be expressed in terms of perceived regularity. Our human visual system (HVS) uses the perceived regularity as one of the important pre-attentive cues in low-level image understanding. Similar to the HVS, image processing and computer vision systems can make fast and efficient decisions if they can quantify this regularity automatically. In this work, the problem of quantifying the degree of perceived regularity when looking at an arbitrary texture is introduced and addressed. One key contribution of this work is in proposing an objective no-reference perceptual texture regularity metric based on visual saliency. Other key contributions include an adaptive texture synthesis method based on texture regularity, and a low-complexity reduced-reference visual quality metric for assessing the quality of synthesized textures. In order to use the best performing visual attention model on textures, the performance of the most popular visual attention models to predict the visual saliency on textures is evaluated. Since there is no publicly available database with ground-truth saliency maps on images with exclusive texture content, a new eye-tracking database is systematically built. Using the Visual Saliency Map (VSM) generated by the best visual attention model, the proposed texture regularity metric is computed. The proposed metric is based on the observation that VSM characteristics differ between textures of differing regularity. The proposed texture regularity metric is based on two texture regularity scores, namely a textural similarity score and a spatial distribution score. In order to evaluate the performance of the proposed regularity metric, a texture regularity database called RegTEX, is built as a part of this work. It is shown through subjective testing that the proposed metric has a strong correlation with the Mean Opinion Score (MOS) for the perceived regularity of textures. The proposed method is also shown to be robust to geometric and photometric transformations and outperforms some of the popular texture regularity metrics in predicting the perceived regularity. The impact of the proposed metric to improve the performance of many image-processing applications is also presented. The influence of the perceived texture regularity on the perceptual quality of synthesized textures is demonstrated through building a synthesized textures database named SynTEX. It is shown through subjective testing that textures with different degrees of perceived regularities exhibit different degrees of vulnerability to artifacts resulting from different texture synthesis approaches. This work also proposes an algorithm for adaptively selecting the appropriate texture synthesis method based on the perceived regularity of the original texture. A reduced-reference texture quality metric for texture synthesis is also proposed as part of this work. The metric is based on the change in perceived regularity and the change in perceived granularity between the original and the synthesized textures. The perceived granularity is quantified through a new granularity metric that is proposed in this work. It is shown through subjective testing that the proposed quality metric, using just 2 parameters, has a strong correlation with the MOS for the fidelity of synthesized textures and outperforms the state-of-the-art full-reference quality metrics on 3 different texture databases. Finally, the ability of the proposed regularity metric in predicting the perceived degradation of textures due to compression and blur artifacts is also established.Dissertation/ThesisPh.D. Electrical Engineering 201
- …