Semi-parametric analysis of multi-rater data
Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation, where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels, and typically a pre-processing step (normally assigning the majority class) is performed. We propose Bayesian models for classification and ordinal regression that naturally incorporate multiple expert opinions in defining predictive distributions. The models make use of Gaussian process priors, resulting in great flexibility and particular suitability to text-based problems where the number of covariates can be far greater than the number of data instances. We show that using all labels rather than just the majority improves performance on a recent biological dataset.
Bayesian Item Response Modeling in R with brms and Stan
Item Response Theory (IRT) is widely applied in the human sciences to model
persons' responses on a set of items measuring one or more latent constructs.
While several R packages have been developed that implement IRT models, they
tend to be restricted to respective prespecified classes of models. Further,
most implementations are frequentist while the availability of Bayesian methods
remains comparably limited. We demonstrate how to use the R package brms
together with the probabilistic programming language Stan to specify and fit a
wide range of Bayesian IRT models using flexible and intuitive multilevel
formula syntax. Further, item and person parameters can be related in either a
linear or non-linear manner. Various distributions for categorical, ordinal,
and continuous responses are supported. Users may even define their own custom
response distribution for use in the presented framework. Common IRT model
classes that can be specified natively in the presented framework include 1PL
and 2PL logistic models optionally also containing guessing parameters, graded
response and partial credit ordinal models, as well as drift diffusion models
of response times coupled with binary decisions. Posterior distributions of
item and person parameters can be conveniently extracted and post-processed.
Model fit can be evaluated and compared using Bayes factors and efficient
cross-validation procedures. Comment: 54 pages, 16 figures, 3 tables
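
For readers unfamiliar with the 1PL and 2PL logistic models with optional guessing parameters named in the abstract above, the following is a minimal sketch of the textbook item response function. It does not use brms or Stan and is not taken from the paper. The function name `irt_probability` and its parameter names are hypothetical, and the parameterization (discrimination multiplying ability minus difficulty) is one common convention that may differ from the one the package uses.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def irt_probability(theta, difficulty, discrimination=1.0, guessing=0.0):
    """Probability of a correct response under a logistic IRT model.

    Assumed textbook form (not the brms parameterization):
        P = guessing + (1 - guessing) * sigmoid(discrimination * (theta - difficulty))

    discrimination=1, guessing=0  -> 1PL (Rasch-type) model
    free discrimination           -> 2PL model
    guessing > 0                  -> model with a guessing parameter
    """
    if not 0.0 <= guessing < 1.0:
        raise ValueError("guessing must lie in [0, 1)")
    return guessing + (1.0 - guessing) * _sigmoid(discrimination * (theta - difficulty))


# A person whose ability equals the item difficulty answers correctly
# half the time under the 1PL model.
p_1pl = irt_probability(theta=0.0, difficulty=0.0)

# With a guessing parameter of 0.25 the same person succeeds more often.
p_guess = irt_probability(theta=0.0, difficulty=0.0, guessing=0.25)
```

In this sketch the guessing parameter acts as a lower bound on the success probability: even a person with very low ability answers correctly with probability close to `guessing`.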