    MedZIM: Mediation analysis for Zero-Inflated Mediators with applications to microbiome data

    The human microbiome can contribute to the pathogenesis of many complex diseases such as cancer and Alzheimer's disease by mediating disease-leading causal pathways. However, standard mediation analysis is not adequate in the context of microbiome data due to the excessive number of zero values in the data. Zero-valued sequencing reads, commonly observed in microbiome studies, arise for technical and/or biological reasons. Mediation analysis approaches for analyzing zero-inflated mediators are still lacking largely because of challenges raised by the zero-inflated data structure: (a) disentangling the mediation effect induced by the point mass at zero; and (b) identifying the observed zero-valued data points that are actually not zero (i.e., false zeros). We develop a novel mediation analysis method under the potential-outcomes framework to fill this gap. We show that the mediation effect of the microbiome can be decomposed into two components that are inherent to the two-part nature of zero-inflated distributions. The first component corresponds to the mediation effect attributable to a unit change over the positive relative abundance and the second component corresponds to the mediation effect attributable to discrete binary change of the mediator from zero to a non-zero state. With probabilistic models to account for observing zeros, we also address the challenge with false zeros. A comprehensive simulation study and the applications in two real microbiome studies demonstrate that our approach outperforms existing mediation analysis approaches. Comment: Corresponding: Zhigang L

    A Versatile and Efficient Novel Approach for Mendelian Randomization Analysis with Application to Assess the Causal Effect of Fetal Hemoglobin on Anemia in Sickle Cell Anemia

    Mendelian randomization (MR) is increasingly employed as a technique to assess the causal effect of a risk factor on an outcome using observational data. The two-stage least-squares (2SLS) procedure is commonly used to estimate this effect using genetic variants as the instrumental variables. The validity of 2SLS relies on a representative sample randomly selected from a study cohort or a population for genome-wide association study (GWAS), which is not always the case in practice. For example, the extreme phenotype sequencing (EPS) design is widely used to investigate genetic determinants of an outcome in GWAS as it offers many advantages such as efficiency, low sequencing or genotyping cost, and high power in detecting the involvement of rare genetic variants in disease etiology. In this paper, we develop a novel, versatile, and efficient approach, namely MR analysis under Extreme or random Phenotype Sampling (MREPS), for one-sample MR analysis based on samples drawn through either the random sampling design or the nonrandom EPS design. In simulations, MREPS provides unbiased estimates of causal effects and correct type I error rates for causal effect testing. Furthermore, it is robust under different study designs and has high power. These results demonstrate the superiority of MREPS over the widely used standard 2SLS approach. We applied MREPS to assess and highlight the causal effect of total fetal hemoglobin on anemia risk in patients with sickle cell anemia using two independent cohort studies. A user-friendly Shiny app web interface was implemented for professionals to easily explore the MREPS
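
For orientation, the standard 2SLS baseline that the abstract compares against can be sketched as follows. This is a minimal illustration under random sampling with a single instrument, not the MREPS method itself. The simulated data, the effect sizes, and all function names (`ols_slope_intercept`, `two_stage_least_squares`, `simulate`) are assumptions made for the example.

```python
import random


def ols_slope_intercept(x, y):
    """Simple linear regression of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(g, x, y):
    """Single-instrument 2SLS: instrument g, exposure x, outcome y."""
    # Stage 1: predict the exposure from the genetic instrument.
    a1, a0 = ols_slope_intercept(g, x)
    x_hat = [a0 + a1 * gi for gi in g]
    # Stage 2: regress the outcome on the predicted exposure.
    beta, _ = ols_slope_intercept(x_hat, y)
    return beta


def simulate(n, beta, seed=0):
    """Hypothetical data: an unmeasured confounder u affects both x and y."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Genotype coded 0/1/2 with an assumed allele frequency of 0.3.
        gi = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        u = rng.gauss(0.0, 1.0)
        xi = 1.0 * gi + u + rng.gauss(0.0, 1.0)
        yi = beta * xi + u + rng.gauss(0.0, 1.0)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


true_beta = 0.3
g, x, y = simulate(20000, true_beta)
beta_2sls = two_stage_least_squares(g, x, y)
beta_ols, _ = ols_slope_intercept(x, y)
print(f"2SLS estimate: {beta_2sls:.3f}, naive OLS estimate: {beta_ols:.3f}")
```

In this simulated setting the 2SLS estimate should land near the assumed value of 0.3, while the naive OLS estimate is pulled upward by the confounder. The sketch says nothing about EPS designs, which are the case the abstract targets.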

    Threshold Selection for High Dimensional Covariance Estimation

    Thresholding is a regularization method commonly used for covariance estimation (Bickel and Levina, 2008; Cai and Liu, 2011), which provides consistent estimators in high-dimensional settings if the population covariance satisfies certain sparsity conditions. However, the performance of those estimators depends heavily on the threshold level. By minimizing the Frobenius risk of the adaptive thresholding covariance estimator, we conduct a theoretical study of the optimal threshold level and obtain its analytical expression under a general setting of n and p. A consistent estimator based on this expression is proposed for the optimal threshold level, which is easy to implement in practice and efficient in computation. Numerical simulations and a case study on gene expression data are conducted to illustrate the proposed method. Based on the concepts developed in the theoretical study, two additional efficient numerical methods are proposed for estimating the threshold level. These methods are more flexible: they provide more precise and stable threshold levels by correctly adjusting to the true covariance structure, which enhances their applicability in practice. Additional numerical simulations and a case study on different gene expression data are conducted to compare all proposed methods.
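
As a rough illustration of what thresholding a covariance estimate means, the sketch below applies universal hard thresholding (in the spirit of Bickel and Levina, 2008) to a sample covariance matrix. It is not the adaptive estimator or the optimal threshold derived in the paper. The function names and the tuning constant `c` in the rate-based threshold `c * sqrt(log(p) / n)` are assumptions for the example.

```python
import math


def sample_covariance(rows):
    """Unbiased sample covariance of n observations (rows) of p variables."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            s = sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows)
            cov[i][j] = cov[j][i] = s / (n - 1)
    return cov


def universal_threshold(n, p, c=1.0):
    """Rate-based threshold level; c is a tuning constant chosen by the user."""
    return c * math.sqrt(math.log(p) / n)


def hard_threshold(cov, t):
    """Zero out off-diagonal entries whose magnitude is below t."""
    p = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) >= t else 0.0 for j in range(p)]
        for i in range(p)
    ]


# Hypothetical 3x3 sample covariance with two small, likely spurious entries.
s_hat = [
    [1.00, 0.05, 0.60],
    [0.05, 1.00, -0.02],
    [0.60, -0.02, 1.00],
]
sparse = hard_threshold(s_hat, 0.1)
for row in sparse:
    print(row)
```

With a threshold of 0.1, only the 0.60 off-diagonal entries survive, and the diagonal is left untouched. How to choose the threshold level well is exactly the question the abstract addresses, so the fixed value here is only a placeholder.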

    Real-time non-intrusive appliance load monitoring under supply voltage fluctuations

    This paper presents a complete real-time implementation of a Non-Intrusive Appliance Load Monitoring (NIALM) system that is robust under residential voltage level fluctuations. Existing NIALM techniques rely on multiple measurements taken at high sampling rates but have only been proven in simulated environments, without considering the effect of residential voltage level fluctuations, which is a severe problem in the power systems of most developing countries such as Sri Lanka. In contrast, with the NIALM method proposed in this paper, accurate load monitoring results were obtained in real time using only smart meter measurements taken at a low sampling rate from a real appliance setup under residential voltage level fluctuations. In the learning phase of the proposed method, a MATLAB™ Graphical User Interface (GUI) was used to acquire each appliance's active power consumption and voltage level signals. The active power measurements were then separated into subspace components (SCs) via the Karhunen-Loève Expansion (KLE) while also taking the voltage variations into account. Using those SCs, a unique, information-rich appliance-level signature database was constructed and then used to obtain the signatures for all possible device combinations. Next, a separate GUI was designed to identify the combination of turned-on appliances in the current time window using the pre-constructed signature databases, after reading the total residential active power consumption and the supply voltage. To validate the proposed real-time NIALM implementation, data from a laboratory arrangement consisting of ten household appliances were used. The results show that the proposed method is capable of accurately identifying the turned-on appliances even under severe residential supply voltage level fluctuations.
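
The subspace separation step can be pictured with a small Karhunen-Loève style sketch: estimate the covariance of windowed active power samples, then extract its leading eigenvector by power iteration. This is only an assumed, simplified reading of the idea. The abstract does not give the paper's actual procedure or its voltage compensation, so neither is reproduced here, and the window data and function names are hypothetical.

```python
import math


def covariance_matrix(windows):
    """Mean vector and sample covariance of equal-length signal windows."""
    n, d = len(windows), len(windows[0])
    mean = [sum(w[j] for w in windows) / n for j in range(d)]
    centered = [[w[j] - mean[j] for j in range(d)] for w in windows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def leading_component(cov, iterations=500):
    """Dominant eigenpair of a symmetric matrix via power iteration.

    Assumes cov is non-zero and the start vector is not orthogonal to
    the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


# Hypothetical active power windows (watts) of one appliance at three load levels.
windows = [
    [100.0, 102.0, 98.0],
    [200.0, 204.0, 196.0],
    [300.0, 306.0, 294.0],
]
mean, cov = covariance_matrix(windows)
eigenvalue, component = leading_component(cov)

# KLE coefficient of each window along the leading component.
coefficients = [
    sum((w[j] - mean[j]) * component[j] for j in range(len(mean)))
    for w in windows
]
print(eigenvalue, component, coefficients)
```

Because the three example windows are scaled copies of one shape, a single component captures all of their variance. Real appliance data would need several components, and a full eigendecomposition routine would replace the power iteration used here.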
