
    A PheWAS Model Of Autism Spectrum Disorder

    Children with Autism Spectrum Disorder (ASD) exhibit a wide diversity in type, number, and severity of social deficits as well as communicative and cognitive difficulties. It is a challenge to relate the phenotypes of a particular ASD patient to their unique genetic variants. There is a need for a better understanding of the connections between genotype information and the phenotypes to sort out the heterogeneity of ASD. In this study, single nucleotide polymorphism (SNP) and phenotype data obtained from a simplex ASD sample are combined using a PheWAS-inspired approach to construct a phenotype-phenotype network. The network is clustered, yielding groups of etiologically related phenotypes. These clusters are analyzed to identify relevant genes associated with each set of phenotypes. The results identified multiple discriminant SNPs associated with varied phenotype clusters such as ASD aberrant behavior (self-injury, compulsiveness and hyperactivity), as well as IQ and language skills. Overall, these SNPs were linked to 22 significant genes. An extensive literature search revealed that eight of these are known to have strong evidence of association with ASD. The others have been linked to related disorders such as mental conditions, cognition, and social functioning. Clinical relevance: This study sheds further light on connections between certain groups of ASD phenotypes and their unique genetic variants. Such insight regarding the heterogeneity of ASD would help clinicians develop more tailored interventions and improve outcomes for ASD patients.

    Connecting Phenotype To Genotype: PheWAS-inspired Analysis Of Autism Spectrum Disorder

    Autism Spectrum Disorder (ASD) is extremely heterogeneous clinically and genetically. There is a pressing need for a better understanding of the heterogeneity of ASD based on scientifically rigorous approaches centered on systematic evaluation of the clinical and research utility of both phenotype and genotype markers. This paper presents a holistic PheWAS-inspired method to identify meaningful associations between ASD phenotypes and genotypes. We generate two types of phenotype-phenotype (p-p) graphs: a direct graph that utilizes only phenotype data, and an indirect graph that incorporates genotype as well as phenotype data. We introduce a novel methodology for fusing the direct and indirect p-p networks in which the genotype data is incorporated into the phenotype data in varying degrees. The hypothesis is that the heterogeneity of ASD can be distinguished by clustering the p-p graph. The obtained graphs are clustered using network-oriented clustering techniques, and the results are evaluated. The most promising clusterings are subsequently analyzed for biological and domain-based relevance. The clusters obtained delineated different aspects of ASD, including differentiating ASD-specific symptoms, cognitive, adaptive, language and communication functions, and behavioral problems. Some of the important genes associated with the clusters have previously known associations with ASD. We found that clusters based on integrated genetic and phenotype data were more effective at identifying relevant genes than clusters constructed from phenotype information alone. These genes included five with suggestive evidence of ASD association and one known to be a strong candidate.

    Modern Views of Machine Learning for Precision Psychiatry

    In light of the NIMH's Research Domain Criteria (RDoC) and the advent of functional neuroimaging, novel technologies and methods provide new opportunities to develop precise and personalized prognosis and diagnosis of mental disorders. Machine learning (ML) and artificial intelligence (AI) technologies are playing an increasingly critical role in the new era of precision psychiatry. Combining ML/AI with neuromodulation technologies can potentially provide explainable solutions in clinical practice and effective therapeutic treatment. Advanced wearable and mobile technologies also call for a new role of ML/AI for digital phenotyping in mobile mental health. In this review, we comprehensively survey ML methodologies and applications that combine neuroimaging, neuromodulation, and advanced mobile technologies in psychiatry practice. Additionally, we review the role of ML in molecular phenotyping and cross-species biomarker identification in precision psychiatry. We further discuss explainable AI (XAI) and causality testing in a closed-human-in-the-loop manner, and highlight the ML potential in multimedia information extraction and multimodal data fusion. Finally, we discuss conceptual and practical challenges in precision psychiatry and highlight ML opportunities in future research.

    Developmental and sex modulated neurological alterations in autism spectrum disorder

    Autism Spectrum Disorder (ASD) was first described in 1943 by Dr. Leo Kanner in a case study published in The Nervous Child. It is a neurodevelopmental disorder with a range of clinical symptoms. According to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), used by clinicians to diagnose mental disorders, a child needs to have persistent social deficits, language impairments, and repetitive behaviors that cannot be explained by neurological damage or intellectual disability. Children diagnosed with ASD are often developmentally delayed; therefore, alterations in the typical developmental trajectory should be a major consideration when studying ASD. As of 2016, 1 in 68 children in the USA is diagnosed with ASD, and young males are four times more likely to be diagnosed than their female peers. Although genetic and behavioral theories exist to explain these differences, the cause of the disparity is still unknown. This dissertation presents a unique opportunity to understand the intersection of altered neurodevelopment and the alarming sex disparities in patients with ASD from a neuroimaging perspective. The hypothesis is that there exist differences due to development and sex in individuals with ASD. The data come from ABIDE (Autism Brain Imaging Data Exchange), an open-source, large-scale data-sharing consortium of functional and anatomical MR data. The MR data are analyzed for alterations due to ASD, developmental trajectory, and sex, as well as the intersection of these factors. These modulations are examined in three Project Aims that employ various analytical approaches: (1) Structural Morphology, (2) Resting-state Functional Connectivity, and (3) Graph Theory. The major findings lie at the interaction of these three factors: developmental stage-by-diagnosis-by-sex.
Structural Morphological Analyses of anatomical data show differences in cortical thickness in the left rostral middle frontal gyrus and in surface area along the sensorimotor strip, in the left paracentral gyrus and right precentral gyrus. Resting-state Functional Connectivity is analyzed with multiple data-driven approaches, and altered resting-state connectivity patterns between the left frontoparietal network and the left parahippocampal gyrus are reported. The regions found in the Morphological Analyses are used as seeds for a priori connectivity analysis; connectivity between the left rostral middle frontal cortex and bilateral superior temporal gyrus, as well as between the right precentral gyrus and the right middle frontal gyrus and left inferior frontal gyrus, is described. Finally, Graph Theory analysis quantifies a whole-brain connectivity matrix to calculate metrics such as path length, clustering coefficient, local efficiency, and betweenness centrality, all of which are altered by the interaction of all three factors. The last investigation is an attempt to correlate the behavioral assessments conducted by clinicians with these neuroimaging findings to determine whether a relationship exists between them. Significant interaction effects of sex and development on ASD diagnosis are observed. The goal of the study is to provide more information on a disorder that is by nature highly heterogeneous in symptomatology. Studying these interactions may be key to better understanding a disorder that was introduced into the medical literature 75 years ago.

    Role of network topology based methods in discovering novel gene-phenotype associations

    The cell is governed by the complex interactions among various types of biomolecules. Coupled with environmental factors, variations in DNA can cause alterations in normal gene function and lead to a disease condition. Often, such disease phenotypes involve coordinated dysregulation of multiple genes that implicate inter-connected pathways. Towards a better understanding and characterization of mechanisms underlying human diseases, here, I present GUILD, a network-based disease-gene prioritization framework. GUILD associates genes with diseases using the global topology of the protein-protein interaction network and an initial set of genes known to be implicated in the disease. Furthermore, I investigate the mechanistic relationships between disease-genes and explain the robustness emerging from these relationships. I also introduce GUILDify, an online and user-friendly tool which prioritizes genes for their association with any user-provided phenotype. Finally, I describe current state-of-the-art systems-biology approaches in which network modeling has helped extend our view of complex diseases such as cancer.

    ApoE4 effects on the structural covariance brain networks topology in Mild Cognitive Impairment

    The Apolipoprotein E isoform E4 (ApoE4) is consistently associated with an elevated risk of developing late-onset Alzheimer's Disease (AD). However, little is known about its potential genetic modulation of the structural covariance brain networks during prodromal stages like Mild Cognitive Impairment (MCI). The covariance phenomenon is based on the observation that regions correlating in morphometric descriptors are often part of the same brain system. In a first study, I assessed the ApoE4-related changes in brain network topology in 256 MCI patients, using the regional cortical thickness to define the covariance network. The cross-sectional sample selected from the ADNI database was subdivided into ApoE4-positive (Carriers) and ApoE4-negative (non-Carriers) groups. At the group level, the results showed significant decreases in characteristic path length, clustering index, local efficiency, global connectivity, and modularity, and an increase in global efficiency, for Carriers compared to non-Carriers. Overall, I found that ApoE4 in MCI shaped the topological organization of cortical thickness covariance networks. In the second project, I investigated the impact of ApoE4 on the single-subject gray matter networks in a sample of 200 MCI patients from the ADNI database. The patients were classified based on clinical outcome (stable MCI versus converters to AD) and ApoE4 status (Carriers versus non-Carriers). The effects of ApoE4 and disease progression on the network measures at baseline and on their rate of change were explored. The topological network attributes were correlated with AD biomarkers. The main findings showed that gray matter network topology is affected independently by ApoE4 and by disease progression (to AD) in late MCI. The alterations in network measures showed a more random organization in Carriers compared to non-Carriers.
Finally, as additional research, I investigated whether a network-based approach combined with graph theory is able to detect cerebrovascular reactivity (CVR) changes in MCI. The findings suggest that this experimental approach is more sensitive in identifying subtle cerebrovascular alterations than the classical experimental designs. This study paves the way for a future investigation of the ApoE4-cerebrovascular interaction effects on the brain networks during AD progression. In summary, my thesis results provide evidence of the value of the structural covariance brain network measures in capturing subtle neurodegenerative changes associated with ApoE4 in MCI. Together with other biomarkers, these variables may help predict disease progression, providing additional reliable intermediate phenotypes.

    Integration of multi-scale protein interactions for biomedical data analysis

    With the advancement of modern technologies, we observe an increasing accumulation of biomedical data about diseases. There is a need for computational methods to sift through and extract knowledge from the diverse data available in order to improve our mechanistic understanding of diseases and improve patient care. Biomedical data come in various forms, as exemplified by the various omics data. Existing studies have shown that each form of omics data gives only partial information on the state of a cell, and have motivated the joint mining of multi-omics, multi-modal data to extract integrated system knowledge. The interactome is of particular importance as it enables the modelling of dependencies arising from molecular interactions. This thesis takes a special interest in the multi-scale protein interactome and its integration with computational models to extract relevant information from biomedical data. We define multi-scale interactions at different omics scales that involve proteins: pairwise protein-protein interactions, multi-protein complexes, and biological pathways. Using hypergraph representations, we motivate considering higher-order protein interactions, highlighting the complementary biological information contained in the multi-scale interactome. Based on those results, we further investigate how those multi-scale protein interactions can be used as either prior knowledge or auxiliary data to develop machine learning algorithms. First, we design a neural network using the multi-scale organization of proteins in a cell into biological pathways as prior knowledge and train it to predict a patient's diagnosis based on transcriptomics data. From the trained models, we develop a strategy to extract biomedical knowledge pertaining to the diseases investigated. Second, we propose a general framework based on Non-negative Matrix Factorization to integrate the multi-scale protein interactome with multi-omics data.
We show that our approach outperforms the existing methods and provides biomedical insights and relevant hypotheses for specific cancer types.

    Systematising and scaling literature curation for genetically determined developmental disorders

    The widespread availability of genomic sequencing has transformed the diagnosis of genetically-determined developmental disorders (GDD). However, this type of test often generates a number of genetic variants, which have to be reviewed and related back to the clinical features (phenotype) of the individual being tested. This frequently entails a time-consuming review of the peer-reviewed literature to look for case reports describing variants in the gene(s) of interest. This is particularly true for newly described and/or very rare disorders not covered in phenotype databases. Therefore, there is a need for scalable, automated literature curation to increase the efficiency of this process. This should lead to improvements in the speed with which a diagnosis is made, and an increase in the number of individuals who are diagnosed through genomic testing. Phenotypic data in case reports/case series is not usually recorded in a standardised, computationally-tractable format. Plain text descriptions of similar clinical features may be recorded in several different ways. For example, a technical term such as ‘hypertelorism’ may be recorded as its synonym ‘widely spaced eyes’. In addition, case reports are found across a wide range of journals, with different structures and file formats for each publication. The Human Phenotype Ontology (HPO) was developed to store phenotypic data in a computationally-accessible format. Several initiatives have been developed to link diseases to phenotype data, in the form of HPO terms. However, these rely on manual expert curation and therefore are not inherently scalable, and cannot be updated automatically. Methods of extracting phenotype data from text at scale developed to date have relied on abstracts or open access papers. At the time of writing, Europe PubMed Central (EPMC, https://europepmc.org/) contained approximately 39.5 million articles, of which only 3.8 million were open access.
Therefore, there is likely a significant volume of phenotypic data which has not been used previously at scale, due to difficulties accessing non-open access manuscripts. In this thesis, I present a method for literature curation which can utilise all relevant published full text through a newly developed package which can download almost all manuscripts licenced by a university or other institution. This is scalable to the full spectrum of GDD. Using manuscripts identified through manual literature review, I use a full text download pipeline and NLP (natural language processing) based methods to generate disease models. These comprise HPO terms weighted according to their frequency in the literature. I demonstrate iterative refinement of these models, and use a custom annotated corpus of 50 papers to show the text mining process has high precision and recall. I demonstrate that these models clinically reflect true disease expressivity, as defined by manual comparison with expert literature reviews, for three well-characterised GDD. I compare these disease models to those in the most commonly used genetic disease phenotype databases. I show that the automated disease models have increased depth of phenotyping, i.e. they contain more terms than those which are manually generated. I show that, in comparison to ‘real life’ prospectively gathered phenotypic data, automated disease models outperform existing phenotype databases in predicting diagnosis, as defined by increased area under the curve (by 0.05 and 0.08 using different similarity measures) on ROC curve plots. I present a method for automated PubMed search at scale, to use as input for disease model generation. I annotated a corpus of 6500 abstracts. Using this corpus I show a high precision (up to 0.80) and recall (up to 1.00) for machine learning classifiers used to identify manuscripts relevant to GDD. These use hand-picked domain-specific features, for example utilising specific MeSH terms.
This method can be used to scale automated literature curation to the full spectrum of GDD. I also present an analysis of the phenotypic terms used in one year of GDD-relevant papers in a prominent journal. This shows that use of supplemental data and parsing clinical report sections from manuscripts is likely to result in more patient-specific phenotype extraction in future. In summary, I present a method for automated curation of full text from the peer-reviewed literature in the context of GDD. I demonstrate that this method is robust, reflects clinical disease expressivity, outperforms existing manual literature curation, and is scalable. Applying this process to clinical testing in future should improve the efficiency and accuracy of diagnosis.

    Genetic and environmental influences on the brain functional networks in older adults

    As humans age, the functional organisation of their brain networks undergoes complex changes that are associated with observed changes in cognition. Both genetics and the environment play a crucial role in influencing changes in the network topology of the ageing brain. In addition, the network topology is influenced by age-related brain diseases. To date, there is a paucity of population-based studies investigating the contributions of age, genetic and environmental factors, and brain disease to the architecture of the functional brain networks. The broad aim of this thesis, therefore, was to examine the influence of genetics, environmental factors, and disease-states on functional brain networks in older individuals using the United Kingdom (UK) Biobank data (N~18,455; ages 44-80 years). To study functional brain networks, I modelled large-scale brain networks from resting-state functional magnetic resonance imaging (fMRI) scans using graph theory, defined by a collection of nodes (brain regions) and edges (magnitude of temporal correlation in activity on fMRI between two brain regions). Four studies are reported in the thesis. In the first study, I investigated the genetic determinants of functional brain networks. I first estimated single nucleotide polymorphism (SNP) heritability (h2). Subsequently, genome-wide association studies (GWAS) were performed to identify genetic variants associated with each graph theory measure. Gene-based association analysis was carried out to uncover gene-level associations, and the functional consequences of the significant genetic variants were explored. As brain reorganisation of the functional networks has been differentially observed with ageing in the two sexes, I examined in the second study how age and sex are associated with the topology of functional brain networks in association with cognitive performance. 
In the third study, I examined the association of sleep and other lifestyle factors, such as exercise, alcohol, and smoking, with functional network properties. In the final study, I studied how disease phenotypes, in particular depressive symptoms, influence functional network properties. This thesis provides several novel contributions to the literature by identifying important genetic, environmental, and disease-related factors that are associated with measures of functional networks in the ageing brain. The findings highlight biological pathways relevant to the integrity of functional networks in the ageing human brain and to the diseases that affect it.