104 research outputs found

    A Regression-based Approach to Robust Estimation and Inference for Genetic Covariance

    Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits, and some variants are shown to be associated with multiple complex traits. Genetic covariance between two traits is defined as the underlying covariance of genetic effects and can be used to measure the shared genetic architecture. The data used to estimate such a genetic covariance can be from the same group or different groups of individuals, and the traits can be of different types or collected based on different study designs. This paper proposes a unified regression-based approach to robust estimation and inference for genetic covariance of general traits that may be associated with genetic variants nonlinearly. The asymptotic properties of the proposed estimator are provided and are shown to be robust under certain model mis-specification. Our method under linear working models provides a robust inference for the narrow-sense genetic covariance, even when both linear models are mis-specified. Numerical experiments are performed to support the theoretical results. Our method is applied to an outbred mice GWAS data set to study the overlapping genetic effects between the behavioral and physiological phenotypes. The real data results reveal interesting genetic covariance among different mice developmental traits.

    Transfer Learning in Large-Scale Gaussian Graphical Models with False Discovery Rate Control

    Transfer learning for high-dimensional Gaussian graphical models (GGMs) is studied. The target GGM is estimated by incorporating the data from similar and related auxiliary studies, where the similarity between the target graph and each auxiliary graph is characterized by the sparsity of a divergence matrix. An estimation algorithm, Trans-CLIME, is proposed and shown to attain a faster convergence rate than the minimax rate in the single-task setting. Furthermore, we introduce a universal debiasing method that can be coupled with a range of initial graph estimators and can be analytically computed in one step. A debiased Trans-CLIME estimator is then constructed and is shown to be element-wise asymptotically normal. This fact is used to construct a multiple testing procedure for edge detection with false discovery rate control. The proposed estimation and multiple testing procedures demonstrate superior numerical performance in simulations and are applied to infer the gene networks in a target brain tissue by leveraging the gene expressions from multiple other brain tissues. A significant decrease in prediction errors and a significant increase in power for link detection are observed. Supplementary materials for this article are available online.

    Large Covariance Estimation for Compositional Data Via Composition-Adjusted Thresholding

    High-dimensional compositional data arise naturally in many applications such as metagenomic data analysis. The observed data lie in a high-dimensional simplex, and conventional statistical methods often fail to produce sensible results due to the unit-sum constraint. In this article, we address the problem of covariance estimation for high-dimensional compositional data and introduce a composition-adjusted thresholding (COAT) method under the assumption that the basis covariance matrix is sparse. Our method is based on a decomposition relating the compositional covariance to the basis covariance, which is approximately identifiable as the dimensionality tends to infinity. The resulting procedure can be viewed as thresholding the sample centered log-ratio covariance matrix and hence is scalable for large covariance matrices. We rigorously characterize the identifiability of the covariance parameters, derive rates of convergence under the spectral norm, and provide theoretical guarantees on support recovery. Simulation studies demonstrate that the COAT estimator outperforms some existing optimization-based estimators. We apply the proposed method to the analysis of a microbiome dataset to understand the dependence structure among bacterial taxa in the human gut.

    Optimal Permutation Recovery in Permuted Monotone Matrix Model

    Motivated by recent research on quantifying bacterial growth dynamics based on genome assemblies, we consider a permuted monotone matrix model Y = ΘΠ + Z, where the rows represent different samples, the columns represent contigs in genome assemblies and the elements represent log-read counts after preprocessing steps and Guanine-Cytosine (GC) adjustment. In this model, Θ is an unknown mean matrix with monotone entries for each row, Π is a permutation matrix that permutes the columns of Θ, and Z is a noise matrix. This article studies the problem of estimation/recovery of Π given the observed noisy matrix Y. We propose an estimator based on the best linear projection, which is shown to be minimax rate-optimal for both exact recovery, as measured by the 0-1 loss, and partial recovery, as quantified by the normalized Kendall’s tau distance. Simulation studies demonstrate the superior empirical performance of the proposed estimator over alternative methods. We demonstrate the methods using a synthetic metagenomics dataset of 45 closely related bacterial species and a real metagenomic dataset to compare the bacterial growth dynamics between the responders and the nonresponders of the IBD patients after 8 weeks of treatment. Supplementary materials for this article are available online.

    Optimal Estimation of Wasserstein Distance on a Tree With an Application to Microbiome Studies

    The weighted UniFrac distance, a plug-in estimator of the Wasserstein distance of read counts on a tree, has been widely used to measure the microbial community difference in microbiome studies. Our investigation, however, shows that such a plug-in estimator, although intuitive and commonly used in practice, suffers from potential bias. Motivated by this finding, we study the problem of optimal estimation of the Wasserstein distance between two distributions on a tree from the sampled data in the high-dimensional setting. The minimax rate of convergence is established. To overcome the bias problem, we introduce a new estimator, referred to as the moment-screening estimator on a tree (MET), by using implicit best polynomial approximation that incorporates the tree structure. The new estimator is computationally efficient and is shown to be minimax rate-optimal. Numerical studies using both simulated and real biological datasets demonstrate the practical merits of MET, including reduced biases and statistically more significant differences in microbiome between the inactive Crohn’s disease patients and the normal controls. Supplementary materials for this article are available online.

    Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models

    High-dimensional logistic regression is widely used in analyzing data with binary outcomes. In this article, global testing and large-scale multiple testing for the regression coefficients are considered in both single- and two-regression settings. A test statistic for testing the global null hypothesis is constructed using a generalized low-dimensional projection for bias correction and its asymptotic null distribution is derived. A lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range. For testing the individual coefficients simultaneously, multiple testing procedures are proposed and shown to control the false discovery rate and falsely discovered variables asymptotically. Simulation studies are carried out to examine the numerical performance of the proposed tests and their superiority over existing methods. The testing procedures are also illustrated by analyzing a dataset of a metabolomics study that investigates the association between fecal metabolites and pediatric Crohn’s disease and the effects of treatment on such associations. Supplementary materials for this article are available online.

    Estimation and Inference for High-Dimensional Generalized Linear Models with Knowledge Transfer

    Transfer learning provides a powerful tool for incorporating data from related studies into a target study of interest. In epidemiology and medical studies, the classification of a target disease could borrow information across other related diseases and populations. In this work, we consider transfer learning for high-dimensional generalized linear models (GLMs). A novel algorithm, TransHDGLM, that integrates data from the target study and the source studies is proposed. Minimax rate of convergence for estimation is established and the proposed estimator is shown to be rate-optimal. Statistical inference for the target regression coefficients is also studied. Asymptotic normality for a debiased estimator is established, which can be used for constructing coordinate-wise confidence intervals of the regression coefficients. Numerical studies show significant improvement in estimation and inference accuracy over GLMs that only use the target data. The proposed methods are applied to a real data study concerning the classification of colorectal cancer using gut microbiomes, and are shown to enhance the classification accuracy in comparison to methods that only use the target data.

    Synthesis and Characterization of Polyrotaxanes Consisting of Cationic α-Cyclodextrins Threaded on Poly[(ethylene oxide)-ran-(propylene oxide)] as Gene Carriers

    Cationic polymers have been receiving growing attention as gene delivery carriers. Herein, a series of novel cationic supramolecular polyrotaxanes with multiple cationic α-cyclodextrin (α-CD) rings threaded and blocked on a poly[(ethylene oxide)-ran-(propylene oxide)] (P(EO-r-PO)) random copolymer chain were synthesized and investigated for gene delivery. In the cationic polyrotaxanes, approximately 12 cationic α-CD rings were threaded on the P(EO-r-PO) copolymer with a molecular weight of 2370 Da and an EO/PO molar ratio of 4:1, while the cationic α-CD rings were grafted with linear or branched oligoethylenimine (OEI) of various chain lengths and molecular weights up to 600 Da. The OEI-grafted α-CD rings were only located selectively on EO segments of the P(EO-r-PO) chain, while PO segments were free of complexation. This increased the mobility of the cationic α-CD rings and the flexibility of the polyrotaxanes, which enhanced the interaction of the cationic α-CD rings with DNA and/or the cellular membrane. All cationic polyrotaxanes synthesized in this work could efficiently condense plasmid DNA to form nanoparticles that were suitable for delivery of the gene. Cytotoxicity studies showed that the cationic polyrotaxanes with all linear OEI chains of molecular weights up to 423 Da exhibited much less cytotoxicity than high-molecular-weight branched polyethylenimine (PEI) (25 kDa) in both HEK293 and COS7 cell lines. The cationic polyrotaxanes displayed high gene transfection efficiencies in a variety of cell lines including HEK293, COS7, BHK-21, SKOV-3, and MES-SA. Particularly, the gene delivery capability of the cationic polyrotaxanes in HEK293 cells was much higher than that of high-molecular-weight branched PEI (25 k)

    Distribution of GTTR differs in hair cells and supporting cells of the LSC.

    An xy focal series of images through the z axis of the LSC, from the apical surface into the base of the crista, is shown. The sensory epithelia of LSC crista fixed 2 hours after systemic GTTR injection (A1–5 red channel, GTTR; B1–5 merged red and green channels; red, GTTR; green, Alexa-488 conjugated phalloidin to visualize filamentous actin). Hair cells (HC) displayed only diffuse GTTR fluorescence, while supporting cells (SC) exhibited both intensely fluorescent GTTR puncta (arrows) and diffuse GTTR fluorescence. In color images (B1–5), the actin-rich hair bundle (B) is distinctly localized in B1–B3, and the cuticular plate is shown in B1 and B2 (arrowheads). Characteristic phalloidin labeling is also shown at the hair cell circumference as a distinctive green “dotted” ring in B3 and B4. Scale bar in B5 = 5 μm.

    The effectiveness of experiment-based learning on learning outcomes, conceptual understanding, and student engagement among grade X students at SMA Negeri 2 Yogyakarta on the topic of simple harmonic motion, 2016-2017 academic year

    Diffuse, cytoplasmic GTTR fluorescence was detected in saccular and utricular hair cells at 0.5 hours and significantly increased in intensity over time to peak at 3 hours after systemic injection of GTTR. At 4 hours, diffuse cytoplasmic fluorescence was significantly attenuated compared to the 3-hour time point (* p < 0.05, ** p < 0.01, *** p < 0.001; mean ± s.d.; n = 5).