109 research outputs found

    Functional Analysis of Human Long Non-coding RNAs and Their Associations with Diseases

    Within this study, we sought to leverage knowledge from well-characterized protein-coding genes to characterize the lesser-known long non-coding RNA (lncRNA) genes, using computational methods to find functional annotations and disease associations. Functional genome annotation is an essential step toward a systems-level view of the human genome. With this knowledge, we can gain a deeper understanding of how humans develop and function, and a better understanding of human disease. LncRNAs are transcripts longer than 200 nucleotides that do not code for proteins. LncRNAs have been found to regulate development, tissue and cell differentiation, and organ formation. Their dysregulation has been linked to several diseases, including autism spectrum disorder (ASD) and cancer. While a great deal of research has been dedicated to protein-coding genes, the relatively recently discovered lncRNA genes have yet to be characterized. LncRNA function is tied closely to when and where they are expressed. Co-expression network analysis offers a means of functional annotation of uncharacterized genes through a guilt-by-association approach. We have constructed two co-expression networks using known disease-associated protein-coding genes and lncRNA genes. Through clustering of the networks, gene set enrichment analysis, and centrality measures, we found enrichment for disease association and functions and identified high-confidence lncRNA disease gene targets. We present a novel approach to the identification of disease state associations by demonstrating that genes associated with the same disease states share patterns that can be discerned from transcriptomes of healthy tissues. Using a machine learning algorithm, we built a model to classify ASD versus non-ASD genes using their expression profiles from healthy developing human brain tissues.
Feature selection during the model-building process also identified critical temporospatial points for the determination of ASD genes. We constructed a webserver tool for the prioritization of genes for ASD association. The webserver tool has a database containing prioritization and co-expression information for nearly every gene in the human genome.
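The guilt-by-association idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the Pearson correlation, the 0.8 threshold, the function names, and the toy gene names and expression values are all assumptions chosen for illustration. An uncharacterized gene is linked to genes whose expression profiles correlate strongly with its own, and the annotations of those neighbors are tallied as candidate annotations.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed_neighbors(expression, gene, threshold=0.8):
    """Genes whose profile correlates with `gene` at or above `threshold`."""
    target = expression[gene]
    return sorted(
        other
        for other, profile in expression.items()
        if other != gene and pearson(target, profile) >= threshold
    )


def tally_annotations(neighbors, annotations):
    """Count how often each annotation occurs among the neighbors."""
    counts = {}
    for neighbor in neighbors:
        for label in annotations.get(neighbor, ()):
            counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical toy data: one lncRNA and three annotated protein-coding genes.
expression = {
    "lnc1": [1, 2, 3, 4],
    "geneA": [2, 4, 6, 8],
    "geneB": [4, 3, 2, 1.5],
    "geneC": [1, 5, 1, 5],
}
annotations = {"geneA": ["ASD"], "geneB": ["cancer"], "geneC": ["cancer"]}

neighbors = coexpressed_neighbors(expression, "lnc1")
candidates = tally_annotations(neighbors, annotations)
```

In this toy data only `geneA` tracks `lnc1` closely, so `lnc1` inherits the `ASD` label as a candidate annotation. A real analysis would add the clustering, enrichment testing, and centrality measures the abstract describes.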

    DCC gene network in the prefrontal cortex is associated with total brain volume in childhood

    BACKGROUND: Genetic variation in the guidance cue DCC gene is linked to psychopathologies involving dysfunction in the prefrontal cortex. We created an expression-based polygenic risk score (ePRS) based on the DCC coexpression gene network in the prefrontal cortex, hypothesizing that it would be associated with individual differences in total brain volume. METHODS: We filtered single nucleotide polymorphisms (SNPs) from genes coexpressed with DCC in the prefrontal cortex obtained from an adult postmortem donors database (BrainEAC) for genes enriched in children 1.5 to 11 years old (BrainSpan). The SNPs were weighted by their effect size in predicting gene expression in the prefrontal cortex, multiplied by their allele number based on an individual's genotype data, and then summarized into an ePRS. We evaluated associations between the DCC ePRS and total brain volume in children in 2 community-based cohorts: the Maternal Adversity, Vulnerability and Neurodevelopment (MAVAN) and University of California, Irvine (UCI) projects. For comparison, we calculated a conventional PRS based on a genome-wide association study of total brain volume. RESULTS: Higher ePRS was associated with higher total brain volume in children 8 to 10 years old (β = 0.212, p = 0.043; n = 88). The conventional PRS at several different thresholds did not predict total brain volume in this cohort. A replication analysis in an independent cohort of newborns from the UCI study showed an association between the ePRS and newborn total brain volume (β = 0.101, p = 0.048; n = 80). The genes included in the ePRS demonstrated high levels of coexpression throughout the lifespan and are primarily involved in regulating cellular function. LIMITATIONS: The relatively small sample size and age differences between the main and replication cohorts were limitations. 
CONCLUSION: Our findings suggest that the DCC coexpression network in the prefrontal cortex is critically involved in whole brain development during the first decade of life. Genes comprising the ePRS are involved in gene translation control and cell adhesion, and their expression in the prefrontal cortex at different stages of life provides a snapshot of their dynamic recruitment.
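The ePRS construction described in the METHODS above (each SNP's effect size on gene expression, multiplied by the individual's allele count, then summed) can be sketched as a weighted sum. This is only an illustration of that arithmetic: the function name, the SNP identifiers, and the weights below are hypothetical, and the sketch omits the SNP filtering and any normalization the study may have applied.

```python
def expression_prs(weights, genotype):
    """Sum of (expression effect size x allele count) over the scored SNPs.

    weights:  SNP id -> effect size of the SNP on gene expression
    genotype: SNP id -> effect-allele count (0, 1, or 2) for one individual
    SNPs missing from the genotype are skipped rather than imputed.
    """
    score = 0.0
    for snp, weight in weights.items():
        allele_count = genotype.get(snp)
        if allele_count is None:
            continue
        score += weight * allele_count
    return score


# Hypothetical weights and genotype, for illustration only.
weights = {"snp_1": 0.5, "snp_2": -0.25, "snp_3": 0.125}
child_genotype = {"snp_1": 2, "snp_2": 1, "snp_3": 0}

score = expression_prs(weights, child_genotype)  # 0.5*2 - 0.25*1 + 0.125*0
```

The score would then serve as the predictor in a regression against total brain volume, as the abstract reports for the MAVAN and UCI cohorts.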

    The genetic architecture of language functional connectivity

    Available online 18 December 2021. Language is a unique trait of the human species, of which the genetic architecture remains largely unknown. Through studies of language disorders, many candidate genes have been identified. However, such a complex and multifactorial trait is unlikely to be driven by only a few genes, and case-control studies, suffering from a lack of power, struggle to uncover significant variants. In parallel, neuroimaging has significantly contributed to the understanding of structural and functional aspects of language in the human brain, and the recent availability of large-scale cohorts like UK Biobank has made it possible to study language via image-derived endophenotypes in the general population. Because of its strong relationship with task-based fMRI (tbfMRI) activations and its ease of acquisition, resting-state functional MRI (rsfMRI) has become more popular, making it a good surrogate of functional neuronal processes. Taking advantage of such a synergistic system by aggregating effects across spatially distributed traits, we performed a multivariate genome-wide association study (mvGWAS) between genetic variations and resting-state functional connectivity (FC) of classical brain language areas in the inferior frontal (pars opercularis, triangularis and orbitalis), temporal and inferior parietal lobes (angular and supramarginal gyri), in 32,186 participants from UK Biobank. Twenty genomic loci were found associated with language FCs, out of which three were replicated in an independent replication sample. A locus in 3p11.1, regulating EPHA3 gene expression, is found associated with FCs of the semantic component of the language network, while a locus in 15q14, regulating THBS1 gene expression, is found associated with FCs of the perceptual-motor language processing, bringing novel insights into the neurobiology of language. This research was conducted using the UK Biobank resource under application #64984.
This project was supported by the Marie Sklodowska-Curie program awarded to Stephanie J. Forkel (Grant agreement No. 101028551). Amaia Carrion-Castillo was supported by a Juan de la Cierva fellowship from the Spanish Ministry of Science and Innovation, and a Gipuzkoa Fellows fellowship from the Basque Government.

    Full-length isoform transcriptome of the developing human brain provides further insights into autism.

    Alternative splicing plays an important role in brain development, but its global contribution to human neurodevelopmental diseases (NDDs) requires further investigation. Here we examine the relationships between splicing isoform expression in the brain and de novo loss-of-function mutations from individuals with NDDs. We analyze the full-length isoform transcriptome of the developing human brain and observe differentially expressed isoforms and isoform co-expression modules undetectable by gene-level analyses. These isoforms are enriched in loss-of-function mutations and microexons, are co-expressed with a unique set of partners, and have higher prenatal expression. We experimentally test the effect of splice-site mutations and demonstrate exon skipping in five NDD risk genes, including SCN2A, DYRK1A, and BTRC. Our results suggest that the splice-site mutation in BTRC reduces translational efficiency, likely affecting Wnt signaling through impaired degradation of β-catenin. We propose that functional effects of mutations should be investigated at the isoform- rather than gene-level resolution.

    Cell cycle networks link gene expression dysregulation, mutation, and brain maldevelopment in autistic toddlers

    Genetic mechanisms underlying abnormal early neural development in toddlers with Autism Spectrum Disorder (ASD) remain uncertain due to the impossibility of direct brain gene expression measurement during critical periods of early development. Recent findings from a multi‐tissue study demonstrated high expression of many of the same gene networks between blood and brain tissues, in particular with cell cycle functions. We explored relationships between blood gene expression and total brain volume (TBV) in 142 ASD and control male toddlers. In control toddlers, TBV variation significantly correlated with cell cycle and protein folding gene networks, potentially impacting neuron number and synapse development. In ASD toddlers, these correlations with brain size were lost as a result of considerable changes in network organization, while cell adhesion gene networks significantly correlated with TBV variation. Cell cycle networks detected in blood are highly preserved in the human brain and are upregulated during prenatal states of development. Overall, alterations were more pronounced in bigger brains. We identified 23 candidate genes for brain maldevelopment linked to 32 genes frequently mutated in ASD. The integrated network includes genes that are dysregulated in leukocyte and/or postmortem brain tissue of ASD subjects and belong to signaling pathways regulating cell cycle G1/S and G2/M phase transition. Finally, analyses of the CHD8 subnetwork and altered transcript levels from an independent study of CHD8 suppression further confirmed the central role of genes regulating neurogenesis and cell adhesion processes in ASD brain maldevelopment.

    Exploiting single-cell expression to characterize co-expression replicability

    BACKGROUND: Co-expression networks have been a useful tool for functional genomics, providing important clues about the cellular and biochemical mechanisms that are active in normal and disease processes. However, co-expression analysis is often treated as a black box, with results being hard to trace to their basis in the data. Here, we use both published and novel single-cell RNA sequencing (RNA-seq) data to understand fundamental drivers of gene-gene connectivity and replicability in co-expression networks. RESULTS: We perform the first major analysis of single-cell co-expression, sampling from 31 individual studies. Using neighbor voting in cross-validation, we find that single-cell network connectivity is less likely to overlap with known functions than co-expression derived from bulk data, with functional variation within cell types strongly resembling that also occurring across cell types. To identify features and analysis practices that contribute to this connectivity, we perform our own single-cell RNA-seq experiment of 126 cortical interneurons in an experimental design targeted to co-expression. By assessing network replicability, semantic similarity and overall functional connectivity, we identify technical factors influencing co-expression and suggest how they can be controlled for. Many of the technical effects we identify are expression-level dependent, making expression level itself highly predictive of network topology. We show this occurs generally through re-analysis of the BrainSpan RNA-seq data. CONCLUSIONS: Technical properties of single-cell RNA-seq data create confounds in co-expression networks which can be identified and explicitly controlled for in any supervised analysis. This is useful both in improving co-expression performance and in characterizing single-cell data in generally applicable terms, permitting cross-laboratory comparison within a common framework.