Models for Paired Comparison Data: A Review with Emphasis on Dependent Data
Thurstonian and Bradley-Terry models are the most commonly applied models in
the analysis of paired comparison data. Since their introduction, numerous
developments have been proposed in different areas. This paper provides an
updated overview of these extensions, including how to account for object- and
subject-specific covariates and how to deal with ordinal paired comparison
data. Special emphasis is given to models for dependent comparisons. Although
these models are more realistic, their use is complicated by numerical
difficulties. We therefore concentrate on implementation issues. In particular,
a pairwise likelihood approach is explored for models for dependent paired
comparison data, and a simulation study is carried out to compare the
performance of maximum pairwise likelihood with other limited information
estimation methods. The methodology is illustrated throughout using a real data
set about university paired comparisons performed by students.
Comment: Published at http://dx.doi.org/10.1214/12-STS396 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
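The Bradley-Terry model mentioned in this abstract can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the worth parameters, the toy comparison data, and the function names are all assumptions for demonstration; the paper's focus is the harder dependent-data case fitted by maximum pairwise likelihood.

```python
import math

def bt_prob(theta_i, theta_j):
    # Bradley-Terry: probability that object i beats object j,
    # a logistic function of the difference in log-worths.
    return 1.0 / (1.0 + math.exp(-(theta_i - theta_j)))

def neg_log_likelihood(theta, wins):
    # Likelihood under the standard independence assumption;
    # wins is a list of (winner_index, loser_index) pairs.
    return -sum(math.log(bt_prob(theta[w], theta[l])) for w, l in wins)

# Hypothetical data: four recorded comparisons among three objects.
wins = [(0, 1), (0, 1), (1, 2), (0, 2)]
theta = [1.0, 0.0, -1.0]  # assumed worth parameters (log scale)
nll = neg_log_likelihood(theta, wins)
```

Minimizing this negative log-likelihood over theta (e.g. with a numerical optimizer) gives the maximum likelihood fit; the pairwise likelihood explored in the paper replaces the full likelihood with a product of bivariate margins to keep dependent models tractable.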
Thurstonian Scaling of Compositional Questionnaire Data
To prevent response biases, personality questionnaires may use comparative response formats. These include forced choice, where respondents choose among a number of items, and quantitative comparisons, where respondents indicate the extent to which items are preferred to each other. The present article extends Thurstonian modeling of binary choice data (Brown & Maydeu-Olivares, 2011a) to "proportion-of-total" (compositional) formats. Following Aitchison (1982), compositional item data are transformed into log-ratios, conceptualized as differences of latent item utilities. The mean and covariance structure of the log-ratios is modelled using Confirmatory Factor Analysis (CFA), where the item utilities are first-order factors, and personal attributes measured by a questionnaire are second-order factors. A simulation study with two sample sizes, N=300 and N=1000, shows that the method provides very good recovery of true parameters and near-nominal rejection rates. The approach is illustrated with empirical data from N=317 students, comparing model parameters obtained with compositional and Likert scale versions of a Big Five measure. The results show that the proposed model successfully captures the latent structures and person scores on the measured traits.
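The Aitchison (1982) log-ratio transform at the core of this abstract is simple to state in code. This is a minimal sketch of the additive log-ratio against the last item; the function name and the toy "proportion-of-total" responses are assumptions, and the paper's CFA modeling of the resulting mean and covariance structure is not shown.

```python
import math

def additive_log_ratio(composition):
    # Additive log-ratio transform: a D-part composition
    # (positive shares summing to 1) maps to D-1 log-ratios
    # against a reference part (here, the last item). In the
    # Thurstonian framing, these are interpreted as differences
    # of latent item utilities.
    ref = composition[-1]
    return [math.log(x / ref) for x in composition[:-1]]

# Hypothetical responses: a respondent splits 100% across 3 items.
shares = [0.5, 0.25, 0.25]
log_ratios = additive_log_ratio(shares)  # two values: log(2), log(1)
```

Because the transform drops one dimension, the D-1 log-ratios are free of the unit-sum constraint and can be modelled with standard multivariate methods such as CFA.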
Beyond Classification: Latent User Interests Profiling from Visual Contents Analysis
User preference profiling is an important task in modern online social
networks (OSN). With the proliferation of image-centric social platforms, such
as Pinterest, visual contents have become one of the most informative data
streams for understanding user preferences. Traditional approaches usually
treat visual content analysis as a general classification problem where one or
more labels are assigned to each image. Although such an approach simplifies
the process of image analysis, it misses the rich context and visual cues that
play an important role in people's perception of images. In this paper, we
explore the possibilities of learning a user's latent visual preferences
directly from image contents. We propose a distance metric learning method
based on Deep Convolutional Neural Networks (CNN) to directly extract
similarity information from visual contents and use the derived distance metric
to mine individual users' fine-grained visual preferences. Through our
preliminary experiments using data from 5,790 Pinterest users, we show that
even for the images within the same category, each user possesses distinct and
individually-identifiable visual preferences that are consistent over their
lifetime. Our results underscore the untapped potential of finer-grained visual
preference profiling in understanding users' preferences.
Comment: 2015 IEEE 15th International Conference on Data Mining Workshop
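The distance-metric idea in this abstract reduces to comparing images in a learned embedding space. The sketch below assumes CNN embeddings are already available as plain vectors; the embedding values, function names, and the nearest-liked-image heuristic are illustrative assumptions, not the authors' actual metric learning procedure.

```python
import math

def euclidean(u, v):
    # Distance in an (assumed) learned embedding space, where a
    # metric-learning objective has placed visually similar
    # images close together.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_liked(query, liked_embeddings):
    # Rank a user's liked-image embeddings by distance to a new
    # image; a small distance suggests the image matches the
    # user's fine-grained visual taste, even within one category.
    return min(range(len(liked_embeddings)),
               key=lambda i: euclidean(query, liked_embeddings[i]))
```

In the paper's setting the metric itself is learned by a deep CNN rather than fixed to Euclidean distance on raw features; the point of the sketch is only how a distance, once learned, turns embeddings into per-user preference signals.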
How to reduce the number of rating scale items without predictability loss?
Rating scales are used to elicit data about qualitative entities (e.g.,
research collaboration). This study presents an innovative method for reducing
the number of rating scale items without predictability loss. The "area
under the receiver operating characteristic curve" (AUC ROC) method is used.
The presented method reduced the number of rating scale items (variables) to
28.57% (from 21 to 6), making over 70% of the collected data unnecessary.
Results were verified by two methods of analysis: the Graded Response Model
(GRM) and Confirmatory Factor Analysis (CFA). GRM revealed that the new method
differentiates observations with high and middle scores. CFA showed that the
reliability of the rating scale did not deteriorate after the item reduction.
Both statistical analyses evidenced the usefulness of the AUC ROC reduction
method.
Comment: 14 pages, 5 figures
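The AUC-based reduction described above can be sketched as "score each item by how well it predicts a binary outcome, keep the top k". This is a minimal illustration under assumptions: the rank-sum formula for AUC is standard, but the column layout, labels, and the simple top-k selection rule are hypothetical; the paper's exact procedure may differ.

```python
def auc(scores, labels):
    # Area under the ROC curve via the Mann-Whitney identity:
    # the probability that a randomly chosen positive case
    # outscores a randomly chosen negative case (ties count half).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_items(item_columns, labels, k):
    # Keep the k scale items whose response columns best predict
    # the binary outcome, ranked by AUC; the remaining items
    # (and their collected data) become unnecessary.
    aucs = [(auc(col, labels), j) for j, col in enumerate(item_columns)]
    return [j for _, j in sorted(aucs, reverse=True)[:k]]
```

With 21 items and k=6, this selection retains 28.57% of the items, matching the reduction reported in the abstract.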