A Potential Tale of Two by Two Tables from Completely Randomized Experiments
Causal inference in completely randomized treatment-control studies with
binary outcomes is discussed from Fisherian, Neymanian and Bayesian
perspectives, using the potential outcomes framework. A randomization-based
justification of Fisher's exact test is provided. Arguing that the crucial
assumption of constant causal effect is often unrealistic, and holds only for
extreme cases, some new asymptotic and Bayesian inferential procedures are
proposed. The proposed procedures exploit the intrinsic non-additivity of
unit-level causal effects, can be applied to linear and non-linear estimands,
and dominate the existing methods, as verified theoretically and also through
simulation studies.
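The randomization-based reading of Fisher's exact test mentioned in this abstract can be sketched roughly as follows. Under the sharp null hypothesis that treatment changes no unit's outcome, the outcomes are fixed and only the assignment is random, so the number of successes in the treated group follows a hypergeometric distribution. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: treatment / control. Columns: success / failure.
    (Hypothetical helper, written for illustration only.)
    """
    n1 = a + b            # number of treated units
    n0 = c + d            # number of control units
    s = a + c             # total successes, fixed under the sharp null
    n = n1 + n0
    total = comb(n, n1)   # number of equally likely assignments

    def prob(x):
        # Probability that exactly x of the s successes land in the treated group
        return comb(s, x) * comb(n - s, n1 - x) / total

    lo = max(0, n1 - (n - s))
    hi = min(n1, s)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Illustrative counts: 3 of 4 treated units succeed, 1 of 4 controls succeed.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

This sketch only illustrates the test under the sharp null; it does not implement the asymptotic or Bayesian procedures the abstract proposes.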
Boosted Beta regression.
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established yet unstable approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Inference for binomial probability based on dependent Bernoulli random variables with applications to meta-analysis and group level studies
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability $\hat{p}$, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in the intracluster correlation coefficient ρ for small values of ρ. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose bias-correction for the arcsine transformation. Our simulations demonstrate that this bias-correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence.
Standard survey methods for estimating colony losses and explanatory risk factors in Apis mellifera
This chapter addresses survey methodology and questionnaire design for the collection of data pertaining to estimation of honey bee colony loss rates and identification of risk factors for colony loss. Sources of error in surveys are described. Advantages and disadvantages of different random and non-random sampling strategies and different modes of data collection are presented to enable the researcher to make an informed choice. We discuss survey and questionnaire methodology in some detail, for the purpose of raising awareness of issues to be considered during the survey design stage in order to minimise error and bias in the results. Aspects of survey design are illustrated using surveys in Scotland. Part of a standardized questionnaire is given as a further example, developed by the COLOSS working group for Monitoring and Diagnosis. Approaches to data analysis are described, focussing on estimation of loss rates. Dutch monitoring data from 2012 were used for an example of a statistical analysis with the public domain R software. We demonstrate the estimation of the overall proportion of losses and corresponding confidence interval using a quasi-binomial model to account for extra-binomial variation. We also illustrate generalized linear model fitting when incorporating a single risk factor, and derivation of relevant confidence intervals.
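The quasi-binomial estimate of an overall loss proportion described in this abstract can be sketched roughly as below. The chapter itself uses R; Python is used here only for illustration. The sketch assumes an intercept-only model with a logit link, a Pearson-based dispersion estimate, and a normal-quantile interval computed on the logit scale. The function name and the per-beekeeper counts are hypothetical and are not the Dutch monitoring data.

```python
import math

def quasibinomial_overall(losses, colonies, z=1.96):
    """Overall loss proportion with a quasi-binomial confidence interval.

    losses[i]   : colonies lost by beekeeper i
    colonies[i] : colonies at risk for beekeeper i
    z           : normal quantile (assumption; a t quantile could be used instead)
    """
    total = sum(colonies)
    p = sum(losses) / total                  # pooled proportion (intercept-only fit)
    m = len(losses)
    # Pearson chi-square statistic divided by residual degrees of freedom
    pearson = sum((y - n * p) ** 2 / (n * p * (1 - p))
                  for y, n in zip(losses, colonies))
    phi = pearson / (m - 1)                  # dispersion; values > 1 suggest overdispersion
    eta = math.log(p / (1 - p))              # estimate on the logit scale
    se = math.sqrt(phi / (total * p * (1 - p)))  # binomial SE inflated by sqrt(phi)

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return p, phi, (expit(eta - z * se), expit(eta + z * se))

# Hypothetical counts for four beekeepers (illustration only).
p_hat, phi_hat, (ci_low, ci_high) = quasibinomial_overall([2, 0, 5, 1], [10, 8, 12, 10])
```

Building the interval on the logit scale keeps both limits inside (0, 1), which matters when the estimated loss rate is close to 0 or 1.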