553 research outputs found

    Bayesian Item Response Modeling in R with brms and Stan

    Item Response Theory (IRT) is widely applied in the human sciences to model persons' responses on a set of items measuring one or more latent constructs. While several R packages have been developed that implement IRT models, they tend to be restricted to respective prespecified classes of models. Further, most implementations are frequentist while the availability of Bayesian methods remains comparably limited. We demonstrate how to use the R package brms together with the probabilistic programming language Stan to specify and fit a wide range of Bayesian IRT models using flexible and intuitive multilevel formula syntax. Further, item and person parameters can be related in either a linear or non-linear manner. Various distributions for categorical, ordinal, and continuous responses are supported. Users may even define their own custom response distribution for use in the presented framework. Common IRT model classes that can be specified natively in the presented framework include 1PL and 2PL logistic models optionally also containing guessing parameters, graded response and partial credit ordinal models, as well as drift diffusion models of response times coupled with binary decisions. Posterior distributions of item and person parameters can be conveniently extracted and post-processed. Model fit can be evaluated and compared using Bayes factors and efficient cross-validation procedures. Comment: 54 pages, 16 figures, 3 tables

    Biaxial Dynamic Fatigue Tests of Wind Turbine Blades

    Testing rotor blades of wind turbines is essential to mitigate financial risks caused by serial damage. Present-day uniaxial dynamic tests are time-consuming and often inaccurate regarding the applied loading. This thesis proposes a faster fatigue test method by loading the two primary directions at the same time. In addition, a more realistic test, compared to uniaxial tests, is accomplished by loading larger areas of the blade cross-sections. To achieve this, an elliptical biaxial dynamic excitation is used. To fulfill the industry requirement for cost-effective tests, a relatively simple test setup was developed that still achieves an elliptical dynamic excitation of the rotor blade. Two methods for an accurate determination of the applied loadings for dynamic fatigue tests are described. These calibration tests use easily measured values and simple analysis to achieve accurate test load measurements in a cost-effective way. German Federal Ministry of Nature Conservation and Nuclear Safety (BMU)/Better Blade/FKZ 0325169/E

    Rank-normalization, folding, and localization: An improved $\widehat{R}$ for assessing convergence of MCMC

    Markov chain Monte Carlo is a key computational tool in Bayesian statistics, but it can be challenging to monitor the convergence of an iterative stochastic algorithm. In this paper we show that the convergence diagnostic $\widehat{R}$ of Gelman and Rubin (1992) has serious flaws. Traditional $\widehat{R}$ will fail to correctly diagnose convergence failures when the chain has a heavy tail or when the variance varies across the chains. In this paper we propose an alternative rank-based diagnostic that fixes these problems. We also introduce a collection of quantile-based local efficiency measures, along with a practical approach for computing Monte Carlo error estimates for quantiles. We suggest that common trace plots should be replaced with rank plots from multiple chains. Finally, we give recommendations for how these methods should be used in practice. Comment: Minor revision for improved clarity

    Posterior accuracy and calibration under misspecification in Bayesian generalized linear models

    Generalized linear models (GLMs) are popular for data analysis in almost all quantitative sciences, but the choice of likelihood family and link function is often difficult. This motivates the search for likelihoods and links that minimize the impact of potential misspecification. We perform a large-scale simulation study on double-bounded and lower-bounded response data where we systematically vary both true and assumed likelihoods and links. In contrast to previous studies, we also study posterior calibration and uncertainty metrics in addition to point-estimate accuracy. Our results indicate that certain likelihoods and links can be remarkably robust to misspecification, performing almost on par with their respective true counterparts. Additionally, normal likelihood models with identity link (i.e., linear regression) often achieve calibration comparable to the more structurally faithful alternatives, at least in the studied scenarios. On the basis of our findings, we provide practical suggestions for robust likelihood and link choices in GLMs.

    Ambiguous avant-gardes and their geographies: on blank spots of the postgrowth debate

    This article focuses on the transformative potentials created by so-called persistence avant-gardes and prevention innovators. The text extends Blühdorn’s guiding concept of narratives of hope (Blühdorn 2017; Blühdorn and Butzlaff 2019) by considering those groups that are marginalized within debates on socio-ecological transformation. With a closer look at the narratives of prevention and blockade that these actors engage, the ambiguous nature of postgrowth avant-gardes is carved out. Their discursive, argumentative, and effective inhibition of transitory policies is interpreted as a pro-active potential, rather than a mere obstacle to socio-ecological transformation. Adding a geographical perspective, the paper pleads for a more precise theoretical penetration of the ambivalent figure of avant-gardes when analyzing processes of local and regional postgrowth.

    Testing for Publication Bias in Diagnostic Meta-Analysis: A Simulation Study

    The present study investigates the performance of several statistical tests to detect publication bias in diagnostic meta-analysis by means of simulation. While bivariate models should be used to pool data from primary studies in diagnostic meta-analysis, univariate measures of diagnostic accuracy are preferable for the purpose of detecting publication bias. In contrast to earlier research, which focused solely on the diagnostic odds ratio or its logarithm ($\ln\omega$), the tests are combined with four different univariate measures of diagnostic accuracy. For each combination of test and univariate measure, both type I error rate and statistical power are examined under diverse conditions. The results indicate that tests based on linear regression or rank correlation cannot be recommended in diagnostic meta-analysis, because type I error rates are either inflated or power is too low, irrespective of the applied univariate measure. In contrast, the combination of trim and fill and $\ln\omega$ has non-inflated or only slightly inflated type I error rates and medium to high power, even under extreme circumstances (at least when the number of studies per meta-analysis is large enough). Therefore, we recommend the application of trim and fill combined with $\ln\omega$ to detect funnel plot asymmetry in diagnostic meta-analysis. Please cite this paper as published in Statistics in Medicine (https://doi.org/10.1002/sim.6177). Comment: arXiv admin note: text overlap with arXiv:2002.04775 by other authors

    Prediction can be safely used as a proxy for explanation in causally consistent Bayesian generalized linear models

    Bayesian modeling provides a principled approach to quantifying uncertainty in model parameters and model structure and has seen a surge of applications in recent years. Within the context of a Bayesian workflow, we are concerned with model selection for the purpose of finding models that best explain the data, that is, help us understand the underlying data generating process. Since we rarely have access to the true process, all we are left with during real-world analyses is incomplete causal knowledge from sources outside of the current data and model predictions of said data. This leads to the important question of when the use of prediction as a proxy for explanation for the purpose of model selection is valid. We approach this question by means of large-scale simulations of Bayesian generalized linear models where we investigate various causal and statistical misspecifications. Our results indicate that the use of prediction as a proxy for explanation is valid and safe only when the models under consideration are sufficiently consistent with the underlying causal structure of the true data generating process.

    Graphical Test for Discrete Uniformity and its Applications in Goodness of Fit Evaluation and Multiple Sample Comparison

    Assessing goodness of fit to a given distribution plays an important role in computational statistics. The probability integral transformation (PIT) can be used to convert the question of whether a given sample originates from a reference distribution into a problem of testing for uniformity. We present new simulation and optimization based methods to obtain simultaneous confidence bands for the whole empirical cumulative distribution function (ECDF) of the PIT values under the assumption of uniformity. Simultaneous confidence bands correspond to such confidence intervals at each point that jointly satisfy a desired coverage. These methods can also be applied in cases where the reference distribution is represented only by a finite sample. The confidence bands provide an intuitive ECDF-based graphical test for uniformity, which also provides useful information on the quality of the discrepancy. We further extend the simulation and optimization methods to determine simultaneous confidence bands for testing whether multiple samples come from the same underlying distribution. This multiple sample comparison test is especially useful in Markov chain Monte Carlo convergence diagnostics. We provide numerical experiments to assess the properties of the tests using both simulated and real-world data and give recommendations on their practical application in computational statistics workflows.