Bayesian Methods in High-Dimensional Sparse Mediation Analysis

Abstract

Causal mediation analysis aims to examine the role of a mediator or a group of mediators that lie in the pathway between an exposure and an outcome. Recent biomedical studies often involve a large number of potential mediators, typically a large ensemble of biomarkers that are measured via high-throughput technologies. The goal of this dissertation is to develop novel statistical methods that can accommodate and leverage high-dimensional mediators in mediation analysis. We provide an overview of mediation analysis and an outline of our work in Chapter I. We elaborate on our methodological developments in the following chapters. In Chapter II, we develop a Bayesian inference method using continuous shrinkage priors to simultaneously analyze high-dimensional mediators. Simulations demonstrate that our method improves the power of global mediation analysis compared to simpler alternatives and performs well in identifying true non-null contributions to the mediation effects of the pathway. The Bayesian method also helps us to understand the structure of the composite null cases for inactive mediators in the pathway. We applied our method to the Multi-Ethnic Study of Atherosclerosis (MESA) and identified DNA methylation regions that may actively mediate the effect of socioeconomic status (SES) on cardiometabolic outcomes. In Chapter III, we develop methods to directly perform targeted penalization of the natural indirect effect (NIE) in a Bayesian paradigm. Specifically, we develop two novel prior models for identification of the NIEs in high-dimensional mediation analysis, both with a joint distribution on the coefficients of the exposure-mediator and mediator-outcome models: (a) a four-component Gaussian mixture prior and (b) a product threshold Gaussian prior. By jointly modeling the two parameters that contribute to the NIE, the proposed methods enable penalization on their product in a targeted way.
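As a minimal illustration of the product structure mentioned above, and not of the Bayesian priors developed in the dissertation, the following Python sketch simulates a small linear mediation model and estimates each mediator's indirect effect as the product of two ordinary least squares coefficients. The symbols `alpha` (exposure-mediator coefficients), `beta` (mediator-outcome coefficients), and `gamma` (direct effect), the function name, and all numeric values are assumptions introduced here for illustration. The four simulated mediators are chosen to cover the four composite cases: both coefficients non-zero, only `alpha` non-zero, only `beta` non-zero, and both zero.

```python
import numpy as np


def estimate_product_nie(n=2000, seed=0):
    """Simulate a toy linear mediation model and estimate alpha_j * beta_j.

    All parameter values are hypothetical and chosen only to cover the
    four composite cases for (alpha_j, beta_j).
    """
    rng = np.random.default_rng(seed)

    # Hypothetical true coefficients, one entry per mediator.
    alpha = np.array([0.8, 0.8, 0.0, 0.0])  # exposure -> mediator
    beta = np.array([0.5, 0.0, 0.5, 0.0])   # mediator -> outcome
    gamma = 0.3                             # direct effect of exposure

    # Exposure, mediators, and outcome (all mean zero by construction,
    # so the regressions below omit intercepts).
    A = rng.normal(size=n)
    M = np.outer(A, alpha) + rng.normal(size=(n, alpha.size))
    Y = M @ beta + gamma * A + rng.normal(size=n)

    # Exposure-mediator model: regress each mediator on the exposure.
    alpha_hat = (A @ M) / (A @ A)

    # Mediator-outcome model: regress the outcome on all mediators
    # and the exposure jointly.
    X = np.column_stack([M, A])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    beta_hat = coef[: alpha.size]

    # Only the first mediator has a non-zero true product.
    return alpha * beta, alpha_hat * beta_hat


true_nie, est_nie = estimate_product_nie()
print("true products:     ", true_nie)
print("estimated products:", np.round(est_nie, 3))
```

In this toy setting only the first mediator carries a non-zero indirect effect; the other three are inactive for different reasons, which is the composite null structure that a prior placed jointly on the two coefficients is meant to capture.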
The resulting inference can take into account the four-component composite structure underlying the indirect effect. We show through extensive simulations that the proposed methods improve both selection and estimation accuracy compared to other existing or alternative shrinkage/penalization based methods. We applied our methods to two ongoing epidemiological studies: the MESA and the LIFECODES birth cohort. The identified active mediators reveal important biological pathways that may be useful for understanding disease mechanisms. In Chapter IV, we further extend the Gaussian mixture method in Chapter III to explicitly incorporate useful information on the correlation structure among mediators in the model building process. Instead of assuming an independent prior for each mediator as in our previous methods, we propose to (a) jointly model the mixing probabilities for correlated mediator selection, or (b) jointly model the group indicators by a Potts distribution, both adding the possible grouping effect across mediators through another layer in the Bayesian hierarchy. We develop efficient sampling algorithms under non-conjugate priors and a large state space. Various simulations demonstrate that our methods enable effective identification of active mediators with high correlations, which could be missed using independent priors. The proposed methods also suggest new mediation findings in the LIFECODES and MESA data applications.

PhD
Biostatistics
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163093/1/yanys_1.pd
