    Second-Order Accurate Inference on Simple, Partial, and Multiple Correlations

    This article develops confidence interval procedures for functions of simple, partial, and squared multiple correlation coefficients. It is assumed that the observed multivariate data represent a random sample from a distribution that possesses finite moments, but there is no requirement that the distribution be normal. The coverage error of conventional one-sided large sample intervals decreases at rate 1/√n as n increases, where n is an index of sample size. The coverage error of the proposed intervals decreases at rate 1/n as n increases. The results of a simulation study that evaluates the performance of the proposed intervals are reported, and the intervals are illustrated on a real data set.

    Empirical study of correlated survival times for recurrent events with proportional hazards margins and the effect of correlation and censoring

    Background: In longitudinal studies where subjects experience recurrent incidents over a period of time, such as respiratory infections, fever or diarrhea, statistical methods are required that take the within-subject correlation into account. Methods: For repeated events data with censored failure, the independent increment (AG), marginal (WLW) and conditional (PWP) models are three multiple failure models that generalize Cox's proportional hazards model. In this paper, we examine the efficiency, accuracy and robustness of all three models under simulated scenarios with varying degrees of within-subject correlation, censoring levels, maximum number of possible recurrences and sample size. We also study the methods' performance on a real dataset from a cohort study of bronchial obstruction. Results: We find substantial differences between the methods, and no single method is optimal. AG and PWP seem to be preferable to WLW for low correlation levels, but the situation reverses for high correlations. Conclusions: All methods are stable under censoring, worsen with increasing recurrence levels, and share a bias problem which, among other consequences, makes asymptotic normal confidence intervals not fully reliable, even though they are well developed theoretically.

    Multiple imputation for continuous variables using a Bayesian principal component analysis

    We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the PCA model. Using a simulation study and real data sets, the method is compared to two classical approaches: multiple imputation based on joint modelling and on fully conditional modelling. Unlike the others, the proposed method can be easily used on data sets where the number of individuals is less than the number of variables and when the variables are highly correlated. In addition, it provides unbiased point estimates of quantities of interest, such as an expectation, a regression coefficient or a correlation coefficient, with a smaller mean squared error. Furthermore, the widths of the confidence intervals built for the quantities of interest are often smaller whilst ensuring a valid coverage.

    Simultaneous Inference in General Parametric Models

    Simultaneous inference is a common problem in many areas of application. If multiple null hypotheses are tested simultaneously, the probability of erroneously rejecting at least one of them increases beyond the pre-specified significance level. Simultaneous inference procedures that adjust for multiplicity, and thus control the overall type I error rate, have to be used. In this paper we describe simultaneous inference procedures in general parametric models, where the experimental questions are specified through a linear combination of elemental model parameters. The framework described here is quite general and extends the canonical theory of multiple comparison procedures in ANOVA models to linear regression problems, generalized linear models, linear mixed effects models, the Cox model, robust linear models, etc. Several examples using a variety of different statistical models illustrate the breadth of the results. For the analyses we use the R add-on package multcomp, which provides a convenient interface to the general approach adopted here.
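
    The inflation of the type I error rate mentioned above can be made concrete with a small calculation. The sketch below is not the procedure implemented in multcomp; it assumes m independent tests of true null hypotheses and uses the Bonferroni correction as a deliberately simple stand-in for a multiplicity adjustment. The function names are hypothetical and chosen for illustration only.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false rejection among m independent
    tests of true null hypotheses, each carried out at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_level(alpha: float, m: int) -> float:
    """Per-test level under the Bonferroni correction (illustrative
    stand-in; not the adjustment described in the abstract)."""
    return alpha / m


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 5, 20):
        unadjusted = familywise_error_rate(alpha, m)
        adjusted = familywise_error_rate(bonferroni_level(alpha, m), m)
        print(f"m={m:2d}  unadjusted={unadjusted:.4f}  bonferroni={adjusted:.4f}")
```

    Under the independence assumption, the unadjusted rate grows quickly with m while the Bonferroni-adjusted rate stays at or below the nominal level. Procedures that use the joint distribution of the test statistics are generally less conservative than Bonferroni when the statistics are correlated.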

    A theoretical framework for estimation of AUCs in complete and incomplete sampling designs

    Nonclinical in vivo animal studies have to be completed before starting clinical studies of the pharmacokinetic behavior of a drug in humans. The drug exposure in animal studies is often measured by the area under the concentration versus time curve (AUC). The classic complete data design, where each animal is sampled for analysis once per time point, is usually only applicable for large animals. In the case of rats and mice, where blood sampling is restricted, the batch design or the serial sampling design needs to be considered. In batch designs, samples are taken more than once from each animal, but not at all time points. In serial sampling designs, only one sample is taken from each animal. In this paper we present an estimator for the AUC from 0 to the last time point that is applicable to all three designs. The variance and asymptotic distribution of the estimator are derived, and confidence intervals based upon the asymptotic results are discussed and evaluated in a simulation study. Further, we define an estimator for linear combinations of AUCs and investigate its asymptotic properties mathematically as well as in simulation.
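
    As a rough illustration of the serial sampling case, the sketch below estimates the AUC as a weighted sum of per-time-point mean concentrations using linear trapezoidal weights. This is an assumption about the general form of such an estimator, not the estimator derived in the paper. The variance formula assumes that every animal contributes a single sample, so that time points are independent; it would not be appropriate for batch designs, where repeated samples from the same animal are correlated. All function names and the example data are hypothetical.

```python
from statistics import mean, variance


def trapezoid_weights(times):
    """Linear trapezoidal rule weights: w_j is half the distance between
    the neighbouring time points of t_j (one-sided at both ends)."""
    n = len(times)
    weights = []
    for j in range(n):
        left = times[j] - times[j - 1] if j > 0 else 0.0
        right = times[j + 1] - times[j] if j < n - 1 else 0.0
        weights.append((left + right) / 2.0)
    return weights


def auc_serial(times, samples):
    """AUC estimate and its estimated variance for a serial sampling design.

    times   -- increasing time points, starting at 0
    samples -- samples[j] holds the concentrations measured at times[j],
               each from a different animal (at least two per time point)
    """
    w = trapezoid_weights(times)
    estimate = sum(wj * mean(c) for wj, c in zip(w, samples))
    # Independent animals at every time point, so the variances add up.
    var = sum(wj ** 2 * variance(c) / len(c) for wj, c in zip(w, samples))
    return estimate, var


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 4.0]
    samples = [[0.0, 0.0], [2.0, 4.0], [3.0, 5.0], [1.0, 1.0]]
    est, var = auc_serial(times, samples)
    print(f"AUC estimate = {est:.2f}, estimated variance = {var:.2f}")
```

    An asymptotic confidence interval could then be formed as the estimate plus or minus a normal quantile times the square root of the estimated variance, again under the independence assumption stated above.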

    Multiple Imputation Using Gaussian Copulas

    Missing observations are pervasive throughout empirical research, especially in the social sciences. Despite multiple approaches to dealing adequately with missing data, many scholars still fail to address this vital issue. In this paper, we present a simple-to-use method for generating multiple imputations using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff, 2007) allows scholars to attain estimation results that have good coverage and small bias. The use of copulas to model the dependence among variables enables researchers to construct valid joint distributions of the data, even without knowledge of the actual underlying marginal distributions. Multiple imputations are then generated by drawing observations from the resulting posterior joint distribution and replacing the missing values. Using simulated and observational data from published social science research, we compare imputation via Gaussian copulas with two other widely used imputation methods: MICE and Amelia II. Our results suggest that the Gaussian copula approach has a slightly smaller bias, higher coverage rates, and narrower confidence intervals compared to the other methods. This is especially true when the variables with missing data are not normally distributed. These results, combined with theoretical guarantees and ease of use, suggest that the approach examined provides an attractive alternative for applied researchers undertaking multiple imputations.

    Individual analysis of laterality data

    Graphical and statistical analyses are presented that allow one to check, for an individual subject, whether the performance during a session is stable, whether the difference between the left and the right visual half-field is significant, and whether the performance is uniform over different sessions. Analyses are given for accuracy data and for latency data. Though the analyses are described for a visual half-field experiment, they can easily be adapted for other laterality tasks.
