
    Psychometric precision in phenotype definition is a useful step in molecular genetic investigation of psychiatric disorders

    Affective disorders are highly heritable, but few genetic risk variants have been consistently replicated in molecular genetic association studies. Psychiatric phenotypes in molecular genetic research are commonly defined either as a summation of symptom scores or as a binary threshold score representing the risk of diagnosis. Psychometric latent variable methods can improve the precision of psychiatric phenotypes, especially when the data structure is not straightforward. Using data from the British 1946 birth cohort, we compared summary scores with psychometric modeling based on the General Health Questionnaire (GHQ-28) scale for affective symptoms in an association analysis of 27 candidate genes (249 single-nucleotide polymorphisms (SNPs)). The psychometric method used a bi-factor model that partitioned the phenotype variance into five orthogonal latent variable factors, in accordance with the multidimensional data structure of the GHQ-28, which involves somatic, social, anxiety and depression domains. Results showed that, compared with the summation approach, the affective symptoms defined by the bi-factor psychometric model were associated with more SNPs, and with larger effect sizes. These results suggest that psychometrically defined mental health phenotypes can reflect the dimensions of complex phenotypes better than summation scores, and therefore offer a useful approach in genetic association investigations.

    Bayesian Methods for the Analysis of Microbiome Data

    Bacteria, archaea, viruses, and fungi are present in large numbers both on and inside our bodies. On average, only one in ten of “our” cells contains human DNA. The other 90% belong to a tremendous diversity of microbes, some of which are fundamentally related to health and disease mechanisms, as documented in numerous recent biomedical studies (Turnbaugh et al., 2009; The Human Microbiome Project, 2012b; Knights et al., 2013; Arpaia et al., 2013; Pickard et al., 2014). Some of these microbes are beneficial while others are detrimental, and, since their abundances are poorly understood, identifying microbes associated with interesting phenotypes is of great importance. However, due to the complexity of these systems and certain characteristics of the data, only a limited number of appropriate statistical tools are available for such a task. In this research work I describe the basic features of microbiome abundance data and present two new modeling approaches that can be used to address some of the challenges presented by this data type. The first approach accomplishes a data integration and model selection goal by associating covariates with microbiome data. The second provides a method of correcting for multiple hypotheses, as is common when testing for differential species abundance between experimental or observational conditions. We illustrate the performance of both methods in simulation studies and in applications to freely available datasets. Finally, we further discuss their potential in microbiome research and possible future extensions.

    An integrative Bayesian Dirichlet-multinomial regression model for the analysis of taxonomic abundances in microbiome data

    Background: The human microbiome has been variously associated with the immune-regulatory mechanisms involved in the prevention or development of many non-infectious human diseases such as autoimmunity, allergy and cancer. Integrative approaches that aim to associate the composition of the human microbiome with other available information, such as clinical covariates and environmental predictors, are paramount to developing a more complete understanding of the role of the microbiome in disease development. Results: In this manuscript, we propose a Bayesian Dirichlet-multinomial regression model which uses spike-and-slab priors for the selection of significant associations between a set of available covariates and taxa from a microbiome abundance table. The approach allows straightforward incorporation of the covariates through a log-linear regression parametrization of the parameters of the Dirichlet-multinomial likelihood. Inference is conducted through a Markov chain Monte Carlo algorithm, and selection of the significant covariates is based upon the assessment of posterior probabilities of inclusion and the thresholding of the Bayesian false discovery rate. We design a simulation study to evaluate the performance of the proposed method, and then apply our model to a publicly available dataset obtained from the Human Microbiome Project, which associates taxa abundances with KEGG orthology pathways. The method is implemented in specifically developed R code, which has been made publicly available. Conclusions: Our method compares favorably in simulations to several recently proposed approaches for similarly structured data, in terms of increased accuracy and reduced false positive as well as false negative rates. In the application to the data from the Human Microbiome Project, a close evaluation of the biological significance of our findings confirms existing associations in the literature.
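    Two ingredients named in the abstract above can be sketched in a few lines: a Dirichlet-multinomial likelihood whose parameters depend on covariates through a log link, and selection of covariates by thresholding a Bayesian false discovery rate computed from posterior probabilities of inclusion. This is a rough illustration only, not the authors' R implementation. The function names, the argument layout (`alpha`, `beta`), and the specific selection rule (keep the largest top-ranked set whose average `1 - PPI` stays below a target) are assumptions; the spike-and-slab priors and the MCMC sampler are not shown.

```python
import math


def dm_log_likelihood(counts, covariates, alpha, beta):
    """Log-likelihood of one sample under a Dirichlet-multinomial model.

    Assumed log-linear link (hypothetical notation):
        gamma_j = exp(alpha[j] + sum_p beta[j][p] * covariates[p])
    counts:      taxa counts for the sample (length J)
    covariates:  covariate values for the sample (length P)
    alpha:       taxon-specific intercepts (length J)
    beta:        J x P regression coefficients
    """
    gammas = [
        math.exp(alpha[j] + sum(b * x for b, x in zip(beta[j], covariates)))
        for j in range(len(counts))
    ]
    n = sum(counts)
    g = sum(gammas)
    # Multinomial coefficient numerator plus the ratio Gamma(g) / Gamma(n + g).
    ll = math.lgamma(n + 1) + math.lgamma(g) - math.lgamma(n + g)
    for y, gam in zip(counts, gammas):
        ll += math.lgamma(y + gam) - math.lgamma(gam) - math.lgamma(y + 1)
    return ll


def bayesian_fdr_select(ppi, target=0.05):
    """Indices of the largest top-ranked set whose Bayesian FDR <= target.

    ppi: posterior probabilities of inclusion, e.g. estimated from MCMC draws.
    The Bayesian FDR of a selected set is taken here to be the average of
    (1 - ppi) over the set, which is one common definition (an assumption).
    """
    order = sorted(range(len(ppi)), key=lambda i: ppi[i], reverse=True)
    selected, running_error = [], 0.0
    for k, i in enumerate(order, start=1):
        running_error += 1.0 - ppi[i]
        if running_error / k > target:
            break
        selected = order[:k]
    return sorted(selected)
```

    With all coefficients set to zero and two taxa, the first function reduces to a beta-binomial with uniform mass over the possible splits of the total count, which gives a quick sanity check.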