35 research outputs found

    RNAcentral 2021: secondary structure integration, improved sequence search and new member databases.

    RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org

    Genetic effects on gene expression across human tissues

    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

    Cancer evolution is associated with pervasive positive selection on globally expressed genes.

    Cancer is an evolutionary process in which cells acquire new transformative, proliferative and metastatic capabilities. A full understanding of cancer requires learning the dynamics of the cancer evolutionary process. We present here a large-scale analysis of the dynamics of this evolutionary process within tumors, with a focus on breast cancer. We show that the cancer evolutionary process differs greatly from organismal (germline) evolution. Organismal evolution is dominated by purifying selection (that removes mutations that are harmful to fitness). In contrast, in the cancer evolutionary process the dominance of purifying selection is much reduced, allowing for a much easier detection of the signals of positive selection (adaptation). We further show that, as a group, genes that are globally expressed across human tissues show a very strong signal of positive selection within tumors. Indeed, known cancer genes are enriched for global expression patterns. Yet, positive selection is prevalent even on globally expressed genes that have not yet been associated with cancer, suggesting that globally expressed genes are enriched for yet undiscovered cancer-related functions. We find that the increased positive selection on globally expressed genes within tumors is not due to their expression in the tissue relevant to the cancer. Rather, such increased adaptation is likely due to globally expressed genes being enriched in important housekeeping and essential functions. Thus, our results suggest that tumor adaptation is most often mediated through somatic changes to those genes that are important for the most basic cellular functions. Together, our analysis reveals the uniqueness of the cancer evolutionary process and the particular importance of globally expressed genes in driving cancer initiation and progression.

    Comparative analysis of human tissue interactomes reveals factors leading to tissue-specific manifestation of hereditary diseases.

    An open question in human genetics is what underlies the tissue-specific manifestation of hereditary diseases, which are caused by genomic aberrations that are present in cells across the human body. Here we analyzed this phenomenon for over 300 hereditary diseases by using comparative network analysis. We created an extensive resource of protein expression and interactions in 16 main human tissues, by integrating recent data of gene and protein expression across tissues with data of protein-protein interactions (PPIs). The resulting tissue interaction networks (interactomes) shared a large fraction of their proteins and PPIs, and only a small fraction of them were tissue-specific. Applying this resource to hereditary diseases, we first show that most of the disease-causing genes are widely expressed across tissues, yet, enigmatically, cause disease phenotypes in few tissues only. Upon testing for factors that could lead to tissue-specific vulnerability, we find that disease-causing genes tend to have elevated transcript levels and increased number of tissue-specific PPIs in their disease tissues compared to unaffected tissues. We demonstrate through several examples that these tissue-specific PPIs can highlight disease mechanisms, and thus, owing to their small number, provide a powerful filter for interrogating disease etiologies. As two thirds of the hereditary diseases are associated with these factors, comparative tissue analysis offers a meaningful and efficient framework for enhancing the understanding of the molecular basis of hereditary diseases.

    Globally expressed genes are enriched for functional BrCa somatic substitutions compared to genes that are not globally expressed.

    a. According to the Catalogue of Somatic Mutations in Cancer (COSMIC) [33].

    TS-PPIs illuminate disease-related tissue-specific effects of causal genes.

    Orange, blue and grey nodes denote tissue-specific, globally-expressed, and other proteins, respectively; diamond nodes mark hereditary disease genes; edges denote PPIs. A. BRCA1 is a globally-expressed tumor-suppressor hub, and ESR1 is an estrogen receptor protein that activates cellular proliferation. The breast-specific PPI linking BRCA1 and ESR1 provides a potential basis for the breast-specific effects of BRCA1 germline mutations [44]. B. A lung-specific PPI connects the widely-expressed epidermal growth factor receptor EGFR and its ligand protein epiregulin (EREG). Germline mutations in EGFR lead to lung cancer [30], and EREG was shown to confer invasive properties in an EGFR-dependent manner [31]. C. Muscle-specific PPIs connect the widely expressed trans-membrane cell adhesion receptor dystroglycan 1 (DAG1) to its muscle-specific ligand dystrophin (DMD), and to caveolin 3 (CAV3) which regulates DMD by preventing the DAG1-DMD PPI. Mutations in all three genes give rise to various forms of muscular dystrophies. D. The brain-specific PPIs that link members of the globally-expressed protein complex EIF2B to the netrin-1-receptor DCC may underlie the brain-specific effects of germline mutations in EIF2B complex members [35].

    Cancer-associated genes tend to more frequently be globally expressed, and less frequently be expressed in a tissue specific manner than other genes.

    Genes that are known to be associated with cancer (black) and all remaining genes (gray) were grouped based on the number of tissues in which their expression has been detected (out of 16 examined tissues). The frequency of genes within each bin is depicted. Cancer genes display a significant (P<0.0001, according to a χ² test) enrichment for global expression patterns (defined as expression across all 16 examined tissues). At the same time, cancer associated genes are ∼2.5 times less likely than other genes to not be expressed in any tissue, or be expressed in a tissue specific manner (1–3 tissues, a significant depletion, P<0.0001).