78 research outputs found

    Fast and accurate semantic annotation of bioassays exploiting a hybrid of machine learning and user confirmation

    Bioinformatics and computer aided drug design rely on the curation of a large number of protocols for biological assays that measure the ability of potential drugs to achieve a therapeutic effect. These assay protocols are generally published by scientists in the form of plain text, which needs to be more precisely annotated in order to be useful to software methods. We have developed a pragmatic approach to describing assays according to the semantic definitions of the BioAssay Ontology (BAO) project, using a hybrid of machine learning based on natural language processing, and a simplified user interface designed to help scientists curate their data with minimum effort. We have carried out this work based on the premise that pure machine learning is insufficiently accurate, and that expecting scientists to find the time to annotate their protocols manually is unrealistic. By combining these approaches, we have created an effective prototype for which annotation of bioassay text within the domain of the training set can be accomplished very quickly. Well-trained annotations require single-click user approval, while annotations from outside the training set domain can be identified using the search feature of a well-designed user interface, and subsequently used to improve the underlying models. By drastically reducing the time required for scientists to annotate their assays, we can realistically advocate for semantic annotation to become a standard part of the publication process. Once even a small proportion of the public body of bioassay data is marked up, bioinformatics researchers can begin to construct sophisticated and useful searching and analysis algorithms that will provide a diverse and powerful set of tools for drug discovery researchers

    A Novel Hap1-Tsc1 interaction regulates neuronal mTORC1 signaling and morphogenesis in the brain

    Tuberous sclerosis complex (TSC) is a leading genetic cause of autism. The TSC proteins Tsc1 and Tsc2 control the mTORC1 signaling pathway in diverse cells, but how the mTORC1 pathway is specifically regulated in neurons remains to be elucidated. Here, using an interaction proteomics approach in neural cells including neurons, we uncover the brain-enriched protein huntingtin-associated protein 1 (Hap1) as a novel functional partner of Tsc1. Knockdown of Hap1 promotes specification of supernumerary axons in primary hippocampal neurons and profoundly impairs the positioning of pyramidal neurons in the mouse hippocampus in vivo. The Hap1 knockdown-induced phenotypes in primary neurons and in vivo recapitulate the phenotypes induced by Tsc1 knockdown. We also find that Hap1 knockdown in hippocampal neurons induces the downregulation of Tsc1 and stimulates the activity of mTORC1, as reflected by phosphorylation of the ribosomal protein S6. Inhibition of mTORC1 activity suppresses the Hap1 knockdown-induced polarity phenotype in hippocampal neurons. Collectively, these findings define a novel link between Hap1 and Tsc1 that regulates neuronal mTORC1 signaling and neuronal morphogenesis, with implications for our understanding of developmental disorders of cognition

    Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals

    Publisher Copyright: © 2022, The Author(s). We conduct a genome-wide association study (GWAS) of educational attainment (EA) in a sample of ~3 million individuals and identify 3,952 approximately uncorrelated genome-wide-significant single-nucleotide polymorphisms (SNPs). A genome-wide polygenic predictor, or polygenic index (PGI), explains 12–16% of EA variance and contributes to risk prediction for ten diseases. Direct effects (i.e., controlling for parental PGIs) explain roughly half the PGI’s magnitude of association with EA and other phenotypes. The correlation between mate-pair PGIs is far too large to be consistent with phenotypic assortment alone, implying additional assortment on PGI-associated factors. In an additional GWAS of dominance deviations from the additive model, we identify no genome-wide-significant SNPs, and a separate X-chromosome additive GWAS identifies 57.

    An OBSL1-Cul7Fbxw8 Ubiquitin Ligase Signaling Mechanism Regulates Golgi Morphology and Dendrite Patterning

    The elaboration of dendrites in neurons requires secretory trafficking through the Golgi apparatus, but the mechanisms that govern Golgi function in neuronal morphogenesis in the brain have remained largely unexplored. Here, we report that the E3 ubiquitin ligase Cul7Fbxw8 localizes to the Golgi complex in mammalian brain neurons. Inhibition of Cul7Fbxw8 by independent approaches including Fbxw8 knockdown reveals that Cul7Fbxw8 is selectively required for the growth and elaboration of dendrites but not axons in primary neurons and in the developing rat cerebellum in vivo. Inhibition of Cul7Fbxw8 also dramatically impairs the morphology of the Golgi complex, leading to deficient secretory trafficking in neurons. Using an immunoprecipitation/mass spectrometry screening approach, we also uncover the cytoskeletal adaptor protein OBSL1 as a critical regulator of Cul7Fbxw8 in Golgi morphogenesis and dendrite elaboration. OBSL1 forms a physical complex with the scaffold protein Cul7 and thereby localizes Cul7 at the Golgi apparatus. Accordingly, OBSL1 is required for the morphogenesis of the Golgi apparatus and the elaboration of dendrites. Finally, we identify the Golgi protein Grasp65 as a novel and physiologically relevant substrate of Cul7Fbxw8 in the control of Golgi and dendrite morphogenesis in neurons. Collectively, these findings define a novel OBSL1-regulated Cul7Fbxw8 ubiquitin signaling mechanism that orchestrates the morphogenesis of the Golgi apparatus and patterning of dendrites, with fundamental implications for our understanding of brain development

    Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use

    Tobacco and alcohol use are leading causes of mortality that influence risk for many complex diseases and disorders [1]. They are heritable [2,3] and etiologically related [4,5] behaviors that have been resistant to gene discovery efforts [6–11]. In sample sizes up to 1.2 million individuals, we discovered 566 genetic variants in 406 loci associated with multiple stages of tobacco use (initiation, cessation, and heaviness) as well as alcohol use, with 150 loci evidencing pleiotropic association. Smoking phenotypes were positively genetically correlated with many health conditions, whereas alcohol use was negatively correlated with these conditions, such that increased genetic risk for alcohol use is associated with lower disease risk. We report evidence for the involvement of many systems in tobacco and alcohol use, including genes involved in nicotinic, dopaminergic, and glutamatergic neurotransmission. The results provide a solid starting point to evaluate the effects of these loci in model organisms and more precise substance use measures.

    Identification of novel risk loci, causal insights, and heritable risk for Parkinson's disease: a meta-analysis of genome-wide association studies

    Background: Genome-wide association studies (GWAS) in Parkinson's disease have increased the scope of biological knowledge about the disease over the past decade. We aimed to use the largest aggregate of GWAS data to identify novel risk loci and gain further insight into the causes of Parkinson's disease. Methods: We did a meta-analysis of 17 datasets from Parkinson's disease GWAS available from European ancestry samples to nominate novel loci for disease risk. These datasets incorporated all available data. We then used these data to estimate heritable risk and develop predictive models of this heritability. We also used large gene expression and methylation resources to examine possible functional consequences as well as tissue, cell type, and biological pathway enrichments for the identified risk factors. Additionally, we examined shared genetic risk between Parkinson's disease and other phenotypes of interest via genetic correlations followed by Mendelian randomisation. Findings: Between Oct 1, 2017, and Aug 9, 2018, we analysed 7·8 million single nucleotide polymorphisms in 37 688 cases, 18 618 UK Biobank proxy-cases (ie, individuals who do not have Parkinson's disease but have a first degree relative that does), and 1·4 million controls. We identified 90 independent genome-wide significant risk signals across 78 genomic regions, including 38 novel independent risk signals in 37 loci. These 90 variants explained 16–36% of the heritable risk of Parkinson's disease depending on prevalence. Integrating methylation and expression data within a Mendelian randomisation framework identified putatively associated genes at 70 risk signals underlying GWAS loci for follow-up functional studies. Tissue-specific expression enrichment analyses suggested Parkinson's disease loci were heavily brain-enriched, with specific neuronal cell types being implicated from single cell data. We found significant genetic correlations with brain volumes (false discovery rate-adjusted p=0·0035 for intracranial volume, p=0·024 for putamen volume), smoking status (p=0·024), and educational attainment (p=0·038). Mendelian randomisation between cognitive performance and Parkinson's disease risk showed a robust association (p=8·00 × 10^−7). Interpretation: These data provide the most comprehensive survey of genetic risk within Parkinson's disease to date, to the best of our knowledge, by revealing many additional Parkinson's disease risk loci, providing a biological context for these risk factors, and showing that a considerable genetic component of this disease remains unidentified. These associations derived from European ancestry datasets will need to be followed up with more diverse data. Funding: The National Institute on Aging at the National Institutes of Health (USA), The Michael J Fox Foundation, and The Parkinson's Foundation (see appendix for full list of funding sources).

    Identification of common genetic risk variants for autism spectrum disorder

    Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD