Survey of Web-based Crowdsourcing Frameworks for Subjective Quality Assessment
The popularity of crowdsourcing for performing various tasks online has increased significantly in the past few years. The low cost and flexibility of crowdsourcing, in particular, have attracted researchers in the field of subjective multimedia evaluation and Quality of Experience (QoE). Since online assessment of multimedia content is challenging, several dedicated frameworks have been created to aid in designing the tests, including support for testing methodologies such as Absolute Category Rating (ACR), Degradation Category Rating (DCR), and Pair Comparison (PC), setting up the tasks, training sessions, screening of the subjects, and storage of the resulting data. In this paper, we focus on web-based frameworks for multimedia quality assessment that support commonly used crowdsourcing platforms such as Amazon Mechanical Turk and Microworkers. We provide a detailed overview of the crowdsourcing frameworks and evaluate them to help researchers in the field of QoE assessment select frameworks and crowdsourcing platforms that are adequate for their experiments.
Best Practices and Recommendations for Crowdsourced QoE - Lessons learned from the Qualinet Task Force Crowdsourcing
Crowdsourcing is a popular approach that outsources tasks via the Internet to a large number of users. Commercial crowdsourcing platforms provide a global pool of users employed for performing short and simple online tasks. For quality assessment of multimedia services and applications, crowdsourcing opens up new possibilities by moving the subjective test into the crowd, resulting in a larger diversity of test subjects, faster turnover of test campaigns, and reduced costs due to the low reimbursement of participants. Furthermore, crowdsourcing makes it easy to address additional aspects such as real-life environments. This white paper summarizes the recommendations and best practices for crowdsourced quality assessment of multimedia applications from the Qualinet Task Force on “Crowdsourcing”. The European Network on Quality of Experience in Multimedia Systems and Services, Qualinet (COST Action IC 1003, see www.qualinet.eu), established this task force, which has more than 30 members, in 2012. The recommendation paper resulted from the experience in designing, implementing, and conducting crowdsourcing experiments as well as the analysis of the crowdsourced user ratings and context data.
Hybrid No-Reference Video Quality Metric Based on Multiway PLSR
In real-life applications, no-reference metrics are more useful than full-reference metrics. To design such metrics, we apply data analysis methods to objectively measurable features and to data originating from subjective testing. Unfortunately, the information about the temporal variation of quality is often lost due to temporal pooling over all frames. Instead of using temporal pooling, we have recently designed an H.264/AVC bitstream no-reference video quality metric employing multiway Partial Least Squares Regression (PLSR), which leads to improved prediction performance. In this contribution, we utilize multiway PLSR to design a hybrid metric that combines bitstream-based features with pixel-based features. Our results show that the additional inclusion of the pixel-based features improves the quality prediction even further.
Influence of viewing experience and stabilization phase in subjective video testing
In this contribution, we examine in detail two important aspects of subjective video quality assessment and their overall influence on the test results: the participants' viewing experience and the quality range in the stabilization phase. Firstly, we examined whether the previous viewing experience of participants in subjective tests influences the results. We performed a number of single- and double-stimulus tests assessing the visual quality of video material compressed with both H.264/AVC and MPEG2, not only at different quality levels and with different content, but also in different video formats from 576i up to 1080i. During these tests, we collected additional statistical data on the test participants. Overall, we were able to collect data from over 70 different subjects and analyse the influence of the subjects' viewing experience on the results of the tests. Secondly, we examined whether the visual quality range presented in the stabilization phase of a subjective test has a significant influence on the test results. Due to time constraints, it is sometimes necessary to split a test into multiple sessions representing subsets of the overall quality range. Consequently, we examine the influence of the quality range presented in the stabilization phase on the overall results, depending on the quality subsets included in the stabilization phase.
Video Quality Evaluation in the Cloud
Video quality evaluation with subjective testing is both time-consuming and expensive. An interesting new alternative to traditional testing is so-called crowdsourcing, which moves the testing effort onto the Internet. The QualityCrowd framework allows codec-independent, crowd-based video quality assessment with a simple web interface, usable with common web browsers. However, due to its codec-independent approach, the framework can impose high bandwidth requirements on the coordinating server. In this contribution, we therefore propose a cloud-based extension of the QualityCrowd framework in order to perform subjective quality evaluation as a cloud application. Moreover, this allows us to access an even larger pool of potential participants due to the improved connectivity. We compare the results from an online subjective test using this framework with the results from a test in a standardized environment. This comparison shows that QualityCrowd delivers equivalent results within the acceptable inter-lab correlation.
Improving the prediction accuracy of PSNR by simple temporal pooling
PSNR is still one of the most frequently and universally used visual quality metrics. Although it is not very well suited to describing the human perception of visual quality, its simplicity and familiarity lead to its extensive use in many applications. We propose to improve the prediction accuracy of PSNR by simple temporal pooling, thus using not only the mean PSNR but also other statistical properties. To support this approach, we conducted extensive subjective testing of HDTV video sequences at typical bit rates for consumer and broadcasting applications. Using temporal pooling, we were able to achieve an improvement of nearly 10% in the prediction accuracy of PSNR for visual quality without significantly increasing the computational complexity. This approach may also be extensible to other frame-based metrics.
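As a rough illustration of the temporal pooling idea, the following Python sketch computes a per-frame PSNR and then pools the per-frame values into several statistics instead of only the mean. The abstract does not name the statistics used, so the choice here (mean, minimum, maximum, standard deviation, and a low percentile) is an assumption, as are the function names `frame_psnr` and `pool_psnr` and the representation of frames as flat lists of 8-bit luma samples.

```python
import math
import statistics


def frame_psnr(ref, dist, peak=255.0):
    """PSNR in dB between two frames given as flat sequences of luma samples.

    Assumes 8-bit samples by default (peak=255); this is an illustrative choice.
    """
    if len(ref) != len(dist) or not ref:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames; callers may want to cap this value
    return 10.0 * math.log10(peak ** 2 / mse)


def pool_psnr(per_frame_psnr):
    """Pool per-frame PSNR values into a set of statistics.

    The selection of statistics is hypothetical and not taken from the paper.
    """
    values = sorted(per_frame_psnr)
    if not values:
        raise ValueError("need at least one frame")
    n = len(values)
    return {
        "mean": statistics.fmean(values),
        "min": values[0],
        "max": values[-1],
        "std": statistics.pstdev(values),
        # simple low percentile: element at 10% of the sorted index range
        "p10": values[int(0.1 * (n - 1))],
    }


if __name__ == "__main__":
    # Made-up per-frame PSNR values for one short sequence.
    print(pool_psnr([38.0, 36.0, 30.0, 37.0, 39.0]))
```

Pooled values such as the minimum or a low percentile react to short quality drops that the mean smooths over, which is one plausible reason why statistics beyond the mean could help; which of them actually improve prediction would have to be determined against subjective test data.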
Improving the prediction accuracy of video quality metrics
To improve the prediction accuracy of visual quality metrics for video, we propose two simple steps: temporal pooling in order to gain a set of parameters from one measured feature, and a correction step using videos of known visual quality. We demonstrate this approach on the well-known PSNR. Firstly, we achieve a more accurate quality prediction by replacing the mean luma PSNR with alternative PSNR-based parameters. Secondly, we exploit the almost linear relationship between the output of a quality metric and the subjectively perceived visual quality for individual video sequences. We do this by estimating the parameters of this linear relationship with the help of additionally generated videos of known visual quality. Moreover, we show that this relationship also holds for very different coding technologies, and we use cross-validation to verify our results. Combining these two steps, we achieve for a set of four different high definition videos an increase of the Pearson correlation coefficient from 0.69 to 0.88 for PSNR, outperforming other, more sophisticated full-reference video quality metrics.
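The correction step can be sketched as a per-sequence least-squares line fit followed by a Pearson correlation check. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `fit_line`, `correct`, and `pearson` are hypothetical, ordinary least squares is assumed as the estimator, and all numbers in the example are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(metric_values, known_quality):
    """Least-squares slope and intercept mapping metric output to quality.

    The inputs are meant to come from additionally generated videos of
    known visual quality for ONE source sequence (assumed workflow).
    """
    n = len(metric_values)
    if n != len(known_quality) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(metric_values) / n
    my = sum(known_quality) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(metric_values, known_quality))
    sxx = sum((a - mx) ** 2 for a in metric_values)
    slope = sxy / sxx
    return slope, my - slope * mx


def correct(metric_value, slope, intercept):
    """Map a raw metric value onto the quality scale using the fitted line."""
    return slope * metric_value + intercept


if __name__ == "__main__":
    # Invented anchor points: PSNR in dB versus quality on a 1..5 scale.
    slope, intercept = fit_line([30.0, 35.0, 40.0], [2.0, 3.0, 4.0])
    print(correct(37.0, slope, intercept))
```

Because the line is fitted separately for each source sequence, content-dependent offsets between metric output and perceived quality are removed before predictions from different sequences are compared, which is the effect the correlation coefficient would then reflect.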
Visual Quality of Current Coding Technologies at High Definition IPTV Bitrates
High definition video over IP-based networks (IPTV) has become a mainstay in today’s consumer environment. In most applications, encoders conforming to the H.264/AVC standard are used. But even within one standard, a wide range of coding tools is often available that can deliver vastly different visual quality. In this contribution, we therefore evaluate different coding technologies, using different encoder settings of H.264/AVC, but also a completely different encoder, Dirac. We cover a wide range of bitrates from ADSL to VDSL and different content, with low and high demands on the encoders. As PSNR is not well suited to describing the perceived visual quality, we conducted extensive subjective tests to determine the visual quality. Our results show that for currently common bitrates, the visual quality can be more than doubled if the same coding technology is used with different coding tools.
Entwurf von Videoqualitätsmetriken mit Multi-way Datenanalyse (Design of Video Quality Metrics with Multi-way Data Analysis)
In the conventional design approach to video quality metrics, the temporal nature of video is often considered only inadequately, and knowledge about the human visual system is required that is not readily available. In this thesis, I therefore propose a data-driven design methodology using multi-way data analysis for the design of video quality metrics that not only allows for appropriate consideration of the temporal nature of video, but also leads to increased prediction performance.