Crowdsourcing evaluation of high dynamic range compression
Crowdsourcing is becoming a popular, cost-effective alternative to lab-based evaluations for subjective quality assessment. However, crowd-based evaluations are constrained by the limited availability of display devices used by typical online workers, which makes the evaluation of high dynamic range (HDR) content a challenging task. In this paper, we investigate the feasibility of using low dynamic range versions of original HDR content, obtained with tone mapping operators (TMOs), in crowdsourcing evaluations. We conducted two crowdsourcing experiments by employing workers from the Microworkers platform. In the first experiment, we evaluated five HDR images encoded at different bit rates with the upcoming JPEG XT coding standard. To find the most suitable TMO, we created eleven tone-mapped versions of these five HDR images using eleven different TMOs. The crowdsourcing results are compared to a reference ground truth obtained via a subjective assessment of the same HDR images on a Dolby `Pulsar' HDR monitor in a laboratory environment. The second crowdsourcing evaluation uses semantic differentiators to better understand the characteristics of the eleven TMOs. The crowdsourcing evaluations show that some TMOs are more suitable than others for the evaluation of HDR image compression.
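For readers unfamiliar with tone mapping, a minimal sketch of one classic global TMO, the Reinhard operator, is given below. This is an illustration only: the abstract does not list which eleven TMOs were compared, and the sketch assumes linear Rec. 709 RGB input.

```python
import numpy as np

def reinhard_tonemap(hdr, key=0.18, eps=1e-6):
    """Map an HDR radiance image (H x W x 3, linear RGB) to LDR [0, 1]
    using the global Reinhard operator: L' = L / (1 + L) after scaling
    the scene to a target log-average luminance (the "key")."""
    # Luminance from linear RGB (Rec. 709 weights).
    lum = 0.2126 * hdr[..., 0] + 0.7152 * hdr[..., 1] + 0.0722 * hdr[..., 2]
    # Log-average luminance of the scene.
    log_avg = np.exp(np.mean(np.log(lum + eps)))
    # Scale so the log-average maps to the chosen key value.
    scaled = key * lum / log_avg
    # Global compression: bright values saturate toward 1.
    lum_ldr = scaled / (1.0 + scaled)
    # Re-apply color by scaling RGB with the luminance ratio.
    ratio = lum_ldr / (lum + eps)
    return np.clip(hdr * ratio[..., None], 0.0, 1.0)
```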
Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs
Crowdsourcing platforms are now extensively used for conducting subjective
pairwise comparison studies. In this setting, a pairwise comparison dataset is
typically gathered via random sampling, either \emph{with} or \emph{without}
replacement. In this paper, we use tools from random graph theory to analyze
these two random sampling methods for the HodgeRank estimator. Using the
Fiedler value of the graph as a measure of estimator stability
(informativeness), we provide a new estimate of the Fiedler value for these two
random graph models. In the asymptotic limit as the number of vertices tends to
infinity, we prove the validity of the estimate. Based on our findings, for a
small number of items to be compared, we recommend a two-stage sampling
strategy where a greedy sampling method is used initially and random sampling
\emph{without} replacement is used in the second stage. When a large number of
items is to be compared, we recommend random sampling with replacement as this
is computationally inexpensive and trivially parallelizable. Experiments on
synthetic and real-world datasets support our analysis.
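As a concrete illustration of the stability measure used above, the sketch below computes the Fiedler value (the second-smallest eigenvalue of the graph Laplacian) of comparison graphs sampled with and without replacement. The uniform-sampling setup and the fixed budget are assumptions for illustration, not the paper's exact experimental protocol.

```python
import numpy as np
from itertools import combinations

def fiedler_value(n_items, sampled_pairs):
    """Second-smallest eigenvalue of the (weighted) graph Laplacian of the
    comparison graph; larger values indicate a more stable HodgeRank estimate."""
    L = np.zeros((n_items, n_items))
    for i, j in sampled_pairs:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    return np.sort(np.linalg.eigvalsh(L))[1]

rng = np.random.default_rng(0)
n, budget = 20, 60
pairs = list(combinations(range(n), 2))
# With replacement: repeated draws of the same pair only add edge weight.
with_rep = [pairs[k] for k in rng.integers(len(pairs), size=budget)]
# Without replacement: every draw is a distinct pair (a denser graph).
without_rep = [pairs[k] for k in rng.choice(len(pairs), size=budget, replace=False)]
print(fiedler_value(n, with_rep), fiedler_value(n, without_rep))
```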
Exploring Outliers in Crowdsourced Ranking for QoE
Outlier detection is a crucial part of robust evaluation for crowdsourceable
assessment of Quality of Experience (QoE) and has attracted much attention in
recent years. In this paper, we propose simple and fast algorithms for
outlier detection and robust QoE evaluation based on principles of nonconvex
optimization. Several iterative procedures are designed, with and without
knowledge of the number of outliers in the samples. Theoretical analysis shows
that such procedures reach statistically good estimates under mild conditions.
Finally, experimental results with simulated and real-world crowdsourcing
datasets show that the proposed algorithms produce performance similar to the
Huber-LASSO approach in robust ranking, yet with a speed-up of nearly 8 times
without prior knowledge of the outlier sparsity, or nearly 90 times with it.
The proposed methodology therefore provides a set of helpful tools for robust
QoE evaluation with crowdsourcing data.
Comment: accepted by ACM Multimedia 2017 (oral presentation). arXiv admin note: text overlap with arXiv:1407.763
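A minimal sketch of the kind of iterative procedure described, in the setting where the number of outliers is known: the model y_k = s_i - s_j + gamma_k + noise and the alternating least-squares/hard-thresholding scheme are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def robust_hodge_scores(pairs, y, n_items, n_outliers, iters=50):
    """Alternating estimation of global scores s and a sparse outlier
    vector gamma for the model y_k = s_i - s_j + gamma_k + noise.
    Illustrative iterative hard thresholding, assuming the number of
    outliers is known (the 'known sparsity' setting above)."""
    m = len(pairs)
    X = np.zeros((m, n_items))
    for k, (i, j) in enumerate(pairs):
        X[k, i], X[k, j] = 1.0, -1.0
    gamma = np.zeros(m)
    for _ in range(iters):
        # Least-squares score estimate with current outlier guesses removed.
        s, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # Hard threshold: keep only the n_outliers largest residuals.
        r = y - X @ s
        gamma = np.zeros(m)
        top = np.argsort(np.abs(r))[-n_outliers:]
        gamma[top] = r[top]
    return s - s.mean(), gamma
```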
Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification
Current benchmarks for optical flow algorithms evaluate the estimation either
directly by comparing the predicted flow fields with the ground truth or
indirectly by using the predicted flow fields for frame interpolation and then
comparing the interpolated frames with the actual frames. In the latter case,
objective quality measures such as the mean squared error are typically
employed. However, it is well known that for image quality assessment, the
actual quality experienced by the user cannot be fully deduced from such simple
measures. Hence, we conducted a subjective quality assessment crowdsourcing
study for the interpolated frames provided by one of the optical flow
benchmarks, the Middlebury benchmark. We collected forced-choice paired
comparisons between interpolated images and the corresponding ground truth. To
increase the sensitivity of observers when judging minute differences in paired
comparisons, we introduced a new method to the field of full-reference quality
assessment, called artefact amplification. From the crowdsourcing data, we
reconstructed absolute quality scale values according to Thurstone's model. As
a result, we obtained a re-ranking of the 155 participating algorithms w.r.t.
the visual quality of the interpolated frames. This re-ranking not only shows
the necessity of visual quality assessment as another evaluation metric for
optical flow and frame interpolation benchmarks; the results also provide the
ground truth for designing novel image quality assessment (IQA) methods
dedicated to perceptual quality of interpolated images. As a first step, we
proposed such a new full-reference method, called WAE-IQA. By weighting the
local differences between an interpolated image and its ground truth, WAE-IQA
performed slightly better than the currently best FR-IQA approach from the
literature.
Comment: arXiv admin note: text overlap with arXiv:1901.0536
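For reference, reconstructing absolute scale values from forced-choice paired comparisons under Thurstone's Case V model can be sketched as follows; this is a standard least-squares variant with simple smoothing, and the paper's exact fitting procedure may differ.

```python
import numpy as np
from scipy.stats import norm

def thurstone_case5(wins, n_trials, eps=0.5):
    """Scale values from a forced-choice win-count matrix under Thurstone
    Case V, where P(i preferred over j) = Phi(s_i - s_j).
    wins[i, j] = times item i beat item j; n_trials = votes per pair."""
    # Smoothed win proportions avoid infinite probits at 0 and 1.
    p = (wins + eps) / (n_trials + 2 * eps)
    z = norm.ppf(p)           # estimated pairwise differences s_i - s_j
    np.fill_diagonal(z, 0.0)  # s_i - s_i = 0 by definition
    return z.mean(axis=1)     # classic least-squares Case V solution

# Example: 3 items, 30 votes per pair.
wins = np.array([[0, 22, 27],
                 [8,  0, 18],
                 [3, 12,  0]])
print(thurstone_case5(wins, 30))
```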
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta-review of the field; Foundations and Trends in
Computer Graphics and Vision, 201
Geometric reasoning via internet crowdsourcing
The ability to interpret and reason about shapes is a peculiarly human capability that has proven difficult to reproduce algorithmically. So although geometric modeling technology has made significant advances in the representation, display and modification of shapes, there have only been incremental advances in geometric reasoning. For example, although today's CAD systems can confidently identify isolated cylindrical holes, they struggle with more ambiguous tasks such as the identification of partial symmetries or similarities in arbitrary geometries. Even well-defined problems such as 2D shape nesting or 3D packing generally resist elegant solution and rely instead on brute-force exploration of a subset of the many possible solutions. Identifying economic ways of solving such problems would result in significant productivity gains across a wide range of industrial applications. The authors hypothesize that Internet Crowdsourcing might provide a pragmatic way of removing many geometric reasoning bottlenecks.

This paper reports the results of experiments conducted with Amazon's mTurk site, designed to determine the feasibility of using Internet Crowdsourcing to carry out geometric reasoning tasks, as well as to establish some benchmark data for the quality, speed and cost of this approach.

After describing the general architecture and terminology of the mTurk Crowdsourcing system, the paper details the implementation and results of the following three investigations: 1) the identification of "canonical" viewpoints for individual shapes, 2) the quantification of "similarity" relationships within collections of 3D models, and 3) the efficient packing of 2D strips into rectangular areas. The paper concludes with a discussion of the possibilities and limitations of the approach.
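As an aside on the 2D packing task, a typical machine baseline would be a greedy heuristic such as first-fit-decreasing-height shelf packing, sketched below. This baseline is hypothetical; the paper does not specify which algorithmic solutions the crowd results were compared against.

```python
def shelf_pack(rects, strip_width):
    """Greedy decreasing-height shelf packing: sort rectangles by height,
    fill shelves left to right, and open a new shelf when a rectangle no
    longer fits. Assumes every rectangle width <= strip_width.
    Returns (placements, used_height). A hypothetical machine baseline."""
    rects = sorted(rects, key=lambda wh: -wh[1])  # tallest first
    placements, x, y, shelf_h = [], 0, 0, 0
    for w, h in rects:
        if x + w > strip_width:          # shelf full: start a new one
            y += shelf_h
            x, shelf_h = 0, 0
        placements.append((x, y, w, h))  # bottom-left corner of this rect
        x += w
        shelf_h = max(shelf_h, h)
    return placements, y + shelf_h

# Example: pack four rectangles into a strip of width 10.
print(shelf_pack([(4, 3), (5, 2), (3, 3), (6, 1)], 10))
```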