2,741 research outputs found
A practical guide and software for analysing pairwise comparison experiments
Most popular strategies to capture subjective judgments from humans involve
the construction of a unidimensional relative measurement scale, representing
order preferences or judgments about a set of objects or conditions. This
information is generally captured by means of direct scoring, either in the
form of a Likert or cardinal scale, or by comparative judgments in pairs or
sets. In this sense, the use of pairwise comparisons is becoming increasingly
popular because of the simplicity of this experimental procedure. However, this
strategy requires non-trivial data analysis to aggregate the comparison ranks
into a quality scale and analyse the results, in order to take full advantage
of the collected data. This paper explains the process of translating pairwise
comparison data into a measurement scale, discusses the benefits and
limitations of such scaling methods and introduces a publicly available
software in Matlab. We improve on existing scaling methods by introducing
outlier analysis, providing methods for computing confidence intervals and
statistical testing and introducing a prior, which reduces estimation error
when the number of observers is low. Most of our examples focus on image
quality assessment.Comment: Code available at https://github.com/mantiuk/pwcm
When is it Better to Compare than to Score?
When eliciting judgements from humans for an unknown quantity, one often has
the choice of making direct-scoring (cardinal) or comparative (ordinal)
measurements. In this paper we study the relative merits of either choice,
providing empirical and theoretical guidelines for the selection of a
measurement scheme. We provide empirical evidence based on experiments on
Amazon Mechanical Turk that in a variety of tasks, (pairwise-comparative)
ordinal measurements have lower per sample noise and are typically faster to
elicit than cardinal ones. Ordinal measurements however typically provide less
information. We then consider the popular Thurstone and Bradley-Terry-Luce
(BTL) models for ordinal measurements and characterize the minimax error rates
for estimating the unknown quantity. We compare these minimax error rates to
those under cardinal measurement models and quantify for what noise levels
ordinal measurements are better. Finally, we revisit the data collected from
our experiments and show that fitting these models confirms this prediction:
for tasks where the noise in ordinal measurements is sufficiently low, the
ordinal approach results in smaller errors in the estimation
Ranking Models in Conjoint Analysis
In this paper we consider the estimation of probabilisticranking models in the context of conjoint experiments. By usingapproximate rather than exact ranking probabilities, we do notneed to compute high-dimensional integrals. We extend theapproximation technique proposed by \\citet{Henery1981} in theThurstone-Mosteller-Daniels model for any Thurstone orderstatistics model and we show that our approach allows for aunified approach. Moreover, our approach also allows for theanalysis of any partial ranking. Partial rankings are essentialin practical conjoint analysis to collect data efficiently torelieve respondents' task burden.conjoint experiments;partial rankings;thurstone order statistics model
Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence
Data in the form of pairwise comparisons arises in many domains, including
preference elicitation, sporting competitions, and peer grading among others.
We consider parametric ordinal models for such pairwise comparison data
involving a latent vector that represents the
"qualities" of the items being compared; this class of models includes the
two most widely used parametric models--the Bradley-Terry-Luce (BTL) and the
Thurstone models. Working within a standard minimax framework, we provide tight
upper and lower bounds on the optimal error in estimating the quality score
vector under this class of models. The bounds depend on the topology of
the comparison graph induced by the subset of pairs being compared via its
Laplacian spectrum. Thus, in settings where the subset of pairs may be chosen,
our results provide principled guidelines for making this choice. Finally, we
compare these error rates to those under cardinal measurement models and show
that the error rates in the ordinal and cardinal settings have identical
scalings apart from constant pre-factors.Comment: 39 pages, 5 figures. Significant extension of arXiv:1406.661
Thurstonian Scaling of Compositional Questionnaire Data
To prevent response biases, personality questionnaires may use comparative response formats. These include forced choice, where respondents choose among a number of items, and quantitative comparisons, where respondents indicate the extent to which items are preferred to each other. The present article extends Thurstonian modeling of binary choice data (Brown & Maydeu-Olivares, 2011a) to “proportion-of-total” (compositional) formats. Following Aitchison (1982), compositional item data are transformed into log-ratios, conceptualized as differences of latent item utilities. The mean and covariance structure of the log-ratios is modelled using Confirmatory Factor Analysis (CFA), where the item utilities are first-order factors, and personal attributes measured by a questionnaire are second-order factors. A simulation study with two sample sizes, N=300 and N=1000, shows that the method provides very good recovery of true parameters and near-nominal rejection rates. The approach is illustrated with empirical data from N=317 students, comparing model parameters obtained with compositional and Likert scale versions of a Big Five measure. The results show that the proposed model successfully captures the latent structures and person scores on the measured traits
Models for Paired Comparison Data: A Review with Emphasis on Dependent Data
Thurstonian and Bradley-Terry models are the most commonly applied models in
the analysis of paired comparison data. Since their introduction, numerous
developments have been proposed in different areas. This paper provides an
updated overview of these extensions, including how to account for object- and
subject-specific covariates and how to deal with ordinal paired comparison
data. Special emphasis is given to models for dependent comparisons. Although
these models are more realistic, their use is complicated by numerical
difficulties. We therefore concentrate on implementation issues. In particular,
a pairwise likelihood approach is explored for models for dependent paired
comparison data, and a simulation study is carried out to compare the
performance of maximum pairwise likelihood with other limited information
estimation methods. The methodology is illustrated throughout using a real data
set about university paired comparisons performed by students.Comment: Published in at http://dx.doi.org/10.1214/12-STS396 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Priorities in Thurstone Scaling and Steady-State Probabilities in Markov Stochastic Modeling
Thurstone scaling is widely used in marketing and advertising research where various methods of applied psychology are utilized. This article considers several analytical tools useful for positioning a set of items on a Thurstone scale via regression modeling and Markov stochastic processing in the form of Chapman-Kolmogorov equations. These approaches produce interval and ratio scales of preferences and enrich the possibilities of paired comparison estimation applied for solving practical problems of prioritization and probability of choice modeling
- …