Robust Modeling Using Non-Elliptically Contoured Multivariate t Distributions

Authors: Ding Peng, Jiang Zhichao
Publication date: 01/01/2016

Models based on multivariate t distributions are widely applied to analyze data with heavy tails. However, all the marginal distributions of the multivariate t distributions are restricted to have the same degrees of freedom, making these models unable to describe different marginal heavy-tailedness. We generalize the traditional multivariate t distributions to non-elliptically contoured multivariate t distributions, allowing for different marginal degrees of freedom. We apply the non-elliptically contoured multivariate t distributions to three widely used models: the Heckman selection model with different degrees of freedom for selection and outcome equations, the multivariate Robit model with different degrees of freedom for marginal responses, and the linear mixed-effects model with different degrees of freedom for random effects and within-subject errors. Based on the normal mixture representation of our t distribution, we propose efficient Bayesian inferential procedures for the model parameters based on data augmentation and parameter expansion. We show via simulation studies and real examples that the conclusions are sensitive to the existence of different marginal heavy-tailedness.

arXiv.org e-Print Archive
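
The normal mixture representation mentioned in the abstract above can be illustrated with a small sampler. This is a minimal sketch, not the authors' procedure: it assumes each coordinate j is a correlated normal draw divided by the square root of its own independent chi-square(nu_j)/nu_j mixing variable, so that coordinate j has a marginal t distribution with nu_j degrees of freedom. The function name `sample_t_mixture` and its parameters are hypothetical.

```python
import numpy as np


def sample_t_mixture(n, mu, sigma, dfs, seed=0):
    """Draw n vectors whose j-th margin is t with dfs[j] degrees of freedom.

    Assumption (not taken from the abstract): one independent mixing
    variable per coordinate, q_j ~ chi2(dfs[j]) / dfs[j].
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    dfs = np.asarray(dfs, dtype=float)
    # Correlated normal core shared by all coordinates.
    z = rng.multivariate_normal(np.zeros(len(mu)), sigma, size=n)
    # Coordinate-specific scale mixing; dfs broadcasts across columns.
    q = rng.chisquare(dfs, size=(n, len(dfs))) / dfs
    return mu + z / np.sqrt(q)


if __name__ == "__main__":
    x = sample_t_mixture(100_000, [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], [3.0, 30.0])
    # Share of draws beyond 4 in absolute value, per coordinate.
    print((np.abs(x) > 4.0).mean(axis=0))
```

With degrees of freedom 3 and 30, the first coordinate should show visibly more extreme draws than the second, which is the kind of differing marginal heavy-tailedness the abstract describes.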

Combining multiple observational data sources to estimate causal effects

Authors: Ding Peng, Yang Shu
Publication date: 20/04/2019

The era of big data has witnessed an increasing availability of multiple data sources for statistical analyses. We consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators for causal effects, but the big main data can only give error-prone estimators in general. However, by leveraging the information in the big main data in a principled way, we can improve the estimation efficiencies yet preserve the consistencies of the initial estimators based solely on the validation data. Our framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. We also propose appropriate bootstrap procedures, which make our method straightforward to implement using software routines for existing estimators.

arXiv.org e-Print Archive

eScholarship - University of California

FigShare
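
One way to picture how the main data could sharpen a validation-only estimate is a control-variate adjustment. The abstract above does not spell out the estimator, so the following is only a sketch under assumptions: the same error-prone estimator is computed on both datasets, the difference between the two versions is centered near zero, and bootstrap replicates supply the covariance needed for the adjustment. The function `control_variate_combine` and its arguments are hypothetical names.

```python
import numpy as np


def control_variate_combine(tau_val, delta_val, delta_main, tau_boot, diff_boot):
    """Adjust a validation-only estimate using an error-prone estimator.

    tau_val    : consistent estimate from the validation data
    delta_val  : error-prone estimate computed on the validation data
    delta_main : the same error-prone estimate computed on the main data
    tau_boot   : bootstrap replicates of tau_val
    diff_boot  : bootstrap replicates of (delta_val - delta_main)
    """
    tau_boot = np.asarray(tau_boot, dtype=float)
    diff_boot = np.asarray(diff_boot, dtype=float)
    cov = np.cov(tau_boot, diff_boot)  # 2x2 sample covariance matrix
    coef = cov[0, 1] / cov[1, 1]       # regression of tau on the difference
    adjusted = tau_val - coef * (delta_val - delta_main)
    return adjusted, coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    diff = rng.standard_normal(2000)  # synthetic replicates
    tau = 1.0 + 0.8 * diff + 0.5 * rng.standard_normal(2000)
    print(control_variate_combine(1.1, 0.3, 0.2, tau, diff))
```

The stronger the correlation between the validation estimate and the difference term, the larger the variance reduction; with zero correlation the adjustment leaves the validation estimate unchanged.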

To Adjust or Not to Adjust? Sensitivity Analysis of M-Bias and Butterfly-Bias

Authors: Ding Peng, Miratrix Luke
Publication date: 01/08/2014

"M-Bias," as it is called in the epidemiologic literature, is the bias introduced by conditioning on a pretreatment covariate due to a particular "M-Structure" between two latent factors, an observed treatment, an outcome, and a "collider." This potential source of bias, which can occur even when the treatment and the outcome are not confounded, has been a source of considerable controversy. We here present formulae for identifying under which circumstances biases are inflated or reduced. In particular, we show that the magnitude of M-Bias in linear structural equation models tends to be relatively small compared to confounding bias, suggesting that it is generally not a serious concern in many applied settings. These theoretical results are consistent with recent empirical findings from simulation studies. We also generalize the M-Bias setting (1) to allow for the correlation between the latent factors to be nonzero, and (2) to allow for the collider to be a confounder between the treatment and the outcome. These results demonstrate that mild deviations from the M-Structure tend to increase confounding bias more rapidly than M-Bias, suggesting that choosing to condition on any given covariate is generally the superior choice. As an application, we re-examine a controversial example between Professors Donald Rubin and Judea Pearl. Comment: Journal of Causal Inference 201

arXiv.org e-Print Archive

CiteSeerX
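
The M-Structure described in the abstract above can be simulated in a linear structural equation model. This is a minimal sketch under assumed values, not the authors' analysis: all path coefficients are set to 1, all noise terms are standard normal, and the true effect of treatment on outcome is zero. A simple confounded model with the same coefficients is included for comparison. The names `ols_coefs` and `simulate_biases` are hypothetical.

```python
import numpy as np


def ols_coefs(y, *regressors):
    """Least-squares slopes of y on the given regressors (intercept dropped)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


def simulate_biases(n=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # M-Structure: latent u1 -> t, latent u2 -> y, and both -> collider m.
    u1, u2 = rng.standard_normal(n), rng.standard_normal(n)
    m = u1 + u2 + rng.standard_normal(n)
    t = u1 + rng.standard_normal(n)
    y = u2 + rng.standard_normal(n)  # true effect of t on y is zero
    # Comparison: a confounder c drives both treatment and outcome.
    c = rng.standard_normal(n)
    t_c = c + rng.standard_normal(n)
    y_c = c + rng.standard_normal(n)  # true effect of t_c on y_c is zero
    return {
        "m_unadjusted": ols_coefs(y, t)[0],      # no conditioning on m
        "m_adjusted": ols_coefs(y, t, m)[0],     # conditioning on the collider
        "confounded": ols_coefs(y_c, t_c)[0],    # confounder left unadjusted
    }


if __name__ == "__main__":
    for name, value in simulate_biases().items():
        print(f"{name}: {value:+.3f}")
```

Under these unit coefficients the population slope on the treatment is 0 without adjustment, -0.2 after conditioning on the collider, and 0.5 in the confounded comparison, so the M-Bias is the smaller of the two biases in magnitude in this particular setup.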