142 research outputs found

    Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology.

    Genome-wide association studies (GWASs) require accurate cohort phenotyping, but expert labeling can be costly, time-intensive, and variable. Here, we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; p ≤ 5 × 10⁻⁸) hits in 156 loci. The ML-based GWAS replicated 62 of 65 GWS loci from a recent VCDR GWAS in the UKB for which two ophthalmologists manually labeled images for 67,040 Europeans. The ML-based GWAS also identified 93 novel loci, significantly expanding our understanding of the genetic etiologies of glaucoma and VCDR. Pathway analyses support the biological significance of the novel hits to VCDR: select loci are near genes involved in neuronal and synaptic biology or harbor variants known to cause severe Mendelian ophthalmic disease. Finally, the ML-based GWAS results significantly improve polygenic prediction of VCDR and primary open-angle glaucoma in the independent EPIC-Norfolk cohort.

    ART: A machine learning Automated Recommendation Tool for synthetic biology

    Biology has changed radically in the last two decades, transitioning from a descriptive science into a design science. Synthetic biology allows us to bioengineer cells to synthesize novel valuable molecules such as renewable biofuels or anticancer drugs. However, traditional synthetic biology approaches involve ad-hoc engineering practices, which lead to long development times. Here, we present the Automated Recommendation Tool (ART), a tool that leverages machine learning and probabilistic modeling techniques to guide synthetic biology in a systematic fashion, without the need for a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains to be built in the next engineering cycle, alongside probabilistic predictions of their production levels. We demonstrate the capabilities of ART on simulated data sets, as well as experimental data from real metabolic engineering projects producing renewable biofuels, hoppy-flavored beer without hops, and fatty acids. Finally, we discuss the limitations of this approach and the practical consequences of the underlying assumptions failing.

    A peridynamic based machine learning model for one-dimensional and two-dimensional structures

    With the rapid growth of available data and computing resources, data-driven models have become a potential approach in many scientific and engineering disciplines. However, for complex physical phenomena with limited data, data-driven models lack robustness and fail to provide good predictions. Theory-guided data science is a recent approach that can take advantage of both physics-driven and data-driven models. This study presents a novel peridynamics-based machine learning model for one- and two-dimensional structures. The linear relationships between the displacement of a material point and the displacements of its family members and applied forces are obtained for the machine learning model by using linear regression. The numerical procedure for coupling the peridynamic model and the machine learning model is also provided. The accuracy of the coupled model is verified by considering various examples of a one-dimensional bar and a two-dimensional plate. To further demonstrate the capabilities of the coupled model, damage prediction for a plate with a pre-existing crack, a two-dimensional representation of a three-point bending test, and a plate subjected to dynamic load are simulated.

    Identification of novel risk loci for restless legs syndrome in genome-wide association studies in individuals of European ancestry: a meta-analysis

    Background: Restless legs syndrome is a prevalent chronic neurological disorder with potentially severe mental and physical health consequences. Clearer understanding of the underlying pathophysiology is needed to improve treatment options. We did a meta-analysis of genome-wide association studies (GWASs) to identify potential molecular targets. Methods: In the discovery stage, we combined three GWAS datasets (EU-RLS GENE, INTERVAL, and 23andMe) with diagnosis data collected from 2003 to 2017, in face-to-face interviews or via questionnaires, involving 15,126 cases and 95,725 controls of European ancestry. We identified common variants by fixed-effect inverse-variance meta-analysis. Significant genome-wide signals (p … Findings: We identified and replicated 13 new risk loci for restless legs syndrome and confirmed the previously identified six risk loci. MEIS1 was confirmed as the strongest genetic risk factor for restless legs syndrome (odds ratio 1.92, 95% CI 1.85-1.99). Gene prioritisation, enrichment, and genetic correlation analyses showed that identified pathways were related to neurodevelopment and highlighted genes linked to axon guidance (associated with SEMA6D), synapse formation (NTNG1), and neuronal specification (HOXB cluster family and MYT1). Interpretation: Identification of new candidate genes and associated pathways will inform future functional research. Advances in understanding of the molecular mechanisms that underlie restless legs syndrome could lead to new treatment options. We focused on common variants; thus, additional studies are needed to dissect the roles of rare and structural variations.
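    The fixed-effect inverse-variance meta-analysis named in the Methods can be sketched as below. This is a minimal illustration of the general technique, not the authors' pipeline: the function name `fixed_effect_meta` and the per-study effect sizes and standard errors are hypothetical, and the p-value assumes a two-sided normal test.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates using inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios) for one variant.
    ses:   the matching standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from three cohorts.
    beta, se, z, p = fixed_effect_meta([0.60, 0.70, 0.65], [0.10, 0.10, 0.10])
    print(f"pooled beta={beta:.3f}, se={se:.3f}, OR={math.exp(beta):.2f}, p={p:.2e}")
```

    A fixed-effect model assumes every cohort estimates the same underlying effect, so larger or more precise studies dominate the pooled estimate; a random-effects model would be the usual alternative when between-study heterogeneity is expected.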

    Gastroesophageal reflux GWAS identifies risk loci that also associate with subsequent severe esophageal diseases

    Funder: The Swedish Esophageal Cancer Study was funded by grants (R01 CA57947-03) from the National Cancer Institute; the California Tobacco Related Research Program (3RT-0122 and 10RT-0251); Marit Peterson Fund for Melanoma Research. CIDR is supported by contract HHSN268200782096C.
    Abstract: Gastroesophageal reflux disease (GERD) is caused by gastric acid entering the esophagus. GERD has high prevalence and is the major risk factor for Barrett's esophagus (BE) and esophageal adenocarcinoma (EA). We conduct a large GERD GWAS meta-analysis (80,265 cases, 305,011 controls), identifying 25 independent genome-wide significant loci for GERD. Several of the implicated genes are existing or putative drug targets. Loci discovery is greatest with a broad GERD definition (including cases defined by self-report or medication data). Further, 91% of the GERD risk-increasing alleles also increase BE and/or EA risk, greatly expanding gene discovery for these traits. Our results map genes for GERD and related traits and uncover potential new drug targets for these conditions.

    Identification of common genetic risk variants for autism spectrum disorder

    Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD.

    Scaling up genetic circuit design for cellular computing: advances and prospects

    Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals

    We conduct a genome-wide association study (GWAS) of educational attainment (EA) in a sample of ~3 million individuals and identify 3,952 approximately uncorrelated genome-wide-significant single-nucleotide polymorphisms (SNPs). A genome-wide polygenic predictor, or polygenic index (PGI), explains 12-16% of EA variance and contributes to risk prediction for ten diseases. Direct effects (i.e., controlling for parental PGIs) explain roughly half the PGI's magnitude of association with EA and other phenotypes. The correlation between mate-pair PGIs is far too large to be consistent with phenotypic assortment alone, implying additional assortment on PGI-associated factors. In an additional GWAS of dominance deviations from the additive model, we identify no genome-wide-significant SNPs, and a separate X-chromosome additive GWAS identifies 57
