18 research outputs found

    Statistical tools for general association testing and control of false discoveries in group testing

    In modern applications of high-throughput technologies, it is important to identify pairwise associations between variables, and desirable to use methods that are powerful and sensitive to a variety of association relationships. In the first part of the dissertation, we describe RankCover, a new non-parametric test for association between two variables that measures the concentration of paired ranked points. Here 'concentration' is quantified using a disk-covering statistic similar to those employed in spatial data analysis. Analysis of simulated datasets demonstrates that the method is robust and often powerful in comparison to competing general association tests. We also illustrate RankCover in the analysis of several real datasets. Using RankCover, we also propose a method of testing the association of two variables while controlling for the effect of a third variable. In the second part of the dissertation, we describe statistical methodologies for testing hypotheses that can be collected into groups, with each group showing potentially different characteristics. Methods to control the family-wise error rate or false discovery rate for group testing have been proposed earlier, but may not easily apply to expression quantitative trait loci (eQTL) data, for which certain structured alternatives may be defensible and enable the researcher to avoid overly conservative approaches. In an empirical Bayesian setting, we propose a new method to control the false discovery rate (FDR) for grouped hypothesis data. Here, each gene forms a group, with SNPs annotated to the gene corresponding to individual hypotheses. Heterogeneity of effect sizes in different groups is accommodated by introducing a random effects component. Our method, entitled Random Effects model and testing procedure for Group-level FDR control (REG-FDR), assumes a model for alternative hypotheses for the eQTL data and controls the FDR by adaptive thresholding. Finally, we propose Z-REG-FDR, an approximate version of REG-FDR that uses only Z-statistics of association between genotype and expression at each SNP. Simulations demonstrate that Z-REG-FDR performs similarly to REG-FDR, but with much improved computational speed. We further propose an extension of Z-REG-FDR to a multi-tissue setting, providing the basis for gene-based multi-tissue analysis.
    Doctor of Philosophy
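
The abstract above mentions controlling the FDR by adaptive thresholding in an empirical Bayesian setting but does not give the procedure. As a rough illustration only, the sketch below shows a generic thresholding rule on posterior null probabilities (local FDR values): reject the largest set of hypotheses whose average posterior null probability stays at or below the target level. This is not REG-FDR itself; the function name `adaptive_threshold`, its inputs, and the example values are assumptions made for illustration.

```python
def adaptive_threshold(local_fdr, alpha=0.05):
    """Generic empirical Bayes thresholding (illustrative, not REG-FDR).

    local_fdr: posterior probabilities that each hypothesis (or group) is null.
    alpha: target false discovery rate.
    Returns the indices of rejected hypotheses, most significant first.
    """
    # Visit hypotheses from the smallest posterior null probability upward.
    order = sorted(range(len(local_fdr)), key=lambda i: local_fdr[i])
    rejected = []
    running_sum = 0.0
    for k, i in enumerate(order, start=1):
        running_sum += local_fdr[i]
        # The running mean estimates the FDR of rejecting the first k items.
        # It is non-decreasing in k, so we can stop at the first violation.
        if running_sum / k > alpha:
            break
        rejected.append(i)
    return rejected


if __name__ == "__main__":
    # Hypothetical posterior null probabilities for five genes.
    print(adaptive_threshold([0.01, 0.5, 0.02, 0.9, 0.03], alpha=0.05))
```

The cutoff adapts to the data: how many hypotheses are rejected depends on the whole sorted sequence of posterior probabilities rather than on a fixed per-test threshold.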

    Testing cross‐phenotype effects of rare variants in longitudinal studies of complex traits

    Many gene mapping studies of complex traits have identified genes or variants that influence multiple phenotypes. With the advent of next‐generation sequencing technology, there has been substantial interest in identifying rare variants in genes that possess cross‐phenotype effects. In the presence of such effects, modeling both the phenotypes and rare variants collectively using multivariate models can achieve higher statistical power compared to univariate methods that either model each phenotype separately or perform separate tests for each variant. Several studies collect phenotypic data over time, and using such longitudinal data can further increase the power to detect genetic associations. Although rare‐variant approaches exist for testing cross‐phenotype effects at a single time point, there is no analogous method for performing such analyses using longitudinal outcomes. In order to fill this important gap, we propose an extension of the Gene Association with Multiple Traits (GAMuT) test, a method for cross‐phenotype analysis of rare variants using a framework based on the distance covariance. The approach allows for both binary and continuous phenotypes and can also adjust for covariates. Our simple adjustment to the GAMuT test allows it to handle longitudinal data and to gain power by exploiting temporal correlation. The approach is computationally efficient and applicable on a genome‐wide scale due to the use of a closed‐form test whose significance can be evaluated analytically. We use simulated data to demonstrate that our method has favorable power over competing approaches and also apply our approach to exome chip data from the Genetic Epidemiology Network of Arteriopathy.
    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/144294/1/gepi22121_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/144294/2/gepi22121.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/144294/3/gepi22121-sup-0001-SuppMat.pd
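
The abstract above describes a framework based on the distance covariance without defining it. As a minimal sketch, the code below computes the standard squared sample distance covariance for two univariate samples by double-centering the pairwise distance matrices. It is not the GAMuT test, which works with multivariate phenotypes and genotypes and an analytic p-value; the function names and example data are assumptions made for illustration.

```python
def _centered_distances(values):
    """Pairwise absolute distances, double-centered by row, column and grand means."""
    n = len(values)
    dist = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    # The distance matrix is symmetric, so row means equal column means.
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance of two equal-length univariate samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    # Average of the elementwise product of the two centered matrices.
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


if __name__ == "__main__":
    # Hypothetical phenotype and genotype-score vectors.
    print(distance_covariance_sq([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.2, 2.9]))
```

A constant variable has zero distance covariance with anything, and the statistic is non-negative, which is what makes it usable as a general measure of dependence.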

    Predictive modeling of miRNA-mediated predisposition to alcohol-related phenotypes in mouse

    Background: MicroRNAs (miRNAs) are small non-coding RNAs that bind messenger RNAs and promote their degradation or repress their translation. There is increasing evidence that miRNAs play an important role in alcohol-related disorders. However, the role of miRNAs as mediators of the genetic effect on alcohol phenotypes is not fully understood. We conducted a high-throughput sequencing study to measure miRNA expression levels in alcohol-naïve animals in the LXS panel of recombinant inbred (RI) mouse strains. We then combined the sequencing data with genotype data, microarray gene expression data, and data on alcohol-related behavioral phenotypes such as 'Drinking in the dark', 'Sleep time', and 'Low dose activation' from the same RI panel. SNP-miRNA-gene triplets with strong association within the triplet that were also associated with one of the 4 alcohol phenotypes were selected, and a Bayesian network analysis was used to aggregate results into a directed network model.
    Results: We found several triplets with strong association within the triplet that were also associated with one of the alcohol phenotypes. The Bayesian network analysis found two networks in which a miRNA mediates the genetic effect on the alcohol phenotype. The miRNAs were found to influence the expression of protein-coding genes, which in turn influences the quantitative phenotypes. The pathways in which these genes are enriched have been previously associated with alcohol-related traits.
    Conclusion: This work enhances association studies by identifying miRNAs that may be mediating the association between genetic markers (SNPs) and the alcohol phenotypes. It suggests a mechanism by which genetic variants affect traits of interest through the modification of miRNA expression.

    Systems genetics analysis of the LXS recombinant inbred mouse strains: Genetic and molecular insights into acute ethanol tolerance.

    We have been using the Inbred Long- and Short-Sleep mouse strains (ILS, ISS) and a recombinant inbred panel derived from them, the LXS, to investigate the genetic underpinnings of acute ethanol tolerance, which is considered to be a risk factor for alcohol use disorders (AUDs). Here, we have used RNA-seq to examine the transcriptome of whole brain in 40 of the LXS strains 8 hours after a saline or ethanol "pretreatment" as in previous behavioral studies. Approximately one third of the 14,184 expressed genes were significantly heritable, and many were unique to the pretreatment. Several thousand cis- and trans-eQTLs were mapped; a portion of these were also unique to pretreatment. Ethanol pretreatment caused differential expression (DE) of 1,230 genes. Gene Ontology (GO) enrichment analysis suggested involvement in numerous biological processes including astrocyte differentiation, histone acetylation, mRNA splicing, and neuron projection development. Genetic correlation analysis identified hundreds of genes that were correlated to the behaviors. GO analysis indicated that these genes are involved in gene expression, chromosome organization, and protein transport, among others. The expression profiles of the DE genes and genes correlated to AFT in the ethanol pretreatment group (AFT-Et) were found to be similar to profiles of HDAC inhibitors. Hdac1, a cis-regulated gene that is located at the peak of a previously mapped QTL for AFT-Et, was correlated to 437 genes, most of which were also correlated to AFT-Et. GO analysis of these genes identified several enriched biological process terms including neuron-neuron synaptic transmission and potassium transport. In summary, the results suggest widespread genetic effects on gene expression, including effects that are pretreatment-specific. A number of candidate genes and biological functions were identified that could be mediating the behavioral responses. The most prominent of these was Hdac1, which may be regulating genes associated with glutamatergic signaling and potassium conductance.