24 research outputs found

    Selenoprotein gene nomenclature

    The human genome contains 25 genes coding for selenocysteine-containing proteins (selenoproteins). These proteins are involved in a variety of functions, most notably redox homeostasis. Selenoprotein enzymes with known functions are designated according to these functions: TXNRD1, TXNRD2, and TXNRD3 (thioredoxin reductases), GPX1, GPX2, GPX3, GPX4 and GPX6 (glutathione peroxidases), DIO1, DIO2, and DIO3 (iodothyronine deiodinases), MSRB1 (methionine-R-sulfoxide reductase 1) and SEPHS2 (selenophosphate synthetase 2). Selenoproteins without known functions have traditionally been denoted by SEL or SEP symbols. However, these symbols are sometimes ambiguous and conflict with the approved nomenclature for several other genes. Therefore, there is a need to implement a rational and coherent nomenclature system for selenoprotein-encoding genes. Our solution is to use the root symbol SELENO followed by a letter. This nomenclature applies to SELENOF (selenoprotein F, the 15 kDa selenoprotein, SEP15), SELENOH (selenoprotein H, SELH, C11orf31), SELENOI (selenoprotein I, SELI, EPT1), SELENOK (selenoprotein K, SELK), SELENOM (selenoprotein M, SELM), SELENON (selenoprotein N, SEPN1, SELN), SELENOO (selenoprotein O, SELO), SELENOP (selenoprotein P, SeP, SEPP1, SELP), SELENOS (selenoprotein S, SELS, SEPS1, VIMP), SELENOT (selenoprotein T, SELT), SELENOV (selenoprotein V, SELV) and SELENOW (selenoprotein W, SELW, SEPW1). This system, approved by the HUGO Gene Nomenclature Committee, also resolves conflicting, missing and ambiguous designations for selenoprotein genes and is applicable to selenoproteins across vertebrates.

    RNAcentral: A vision for an international database of RNA sequences

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor.

    Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation

    The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Nucleic Acids Res 2018 Jan 4; 46(D1):D221-D228

    Pharmacogenetic allele nomenclature: International workgroup recommendations for test result reporting

    This manuscript provides nomenclature recommendations developed by an international workgroup to increase transparency and standardization of pharmacogenetic (PGx) result reporting. Presently, sequence variants identified by PGx tests are described using different nomenclature systems. In addition, PGx analysis may detect different sets of variants for each gene, which can affect interpretation of results. This practice has caused confusion and may thereby impede the adoption of clinical PGx testing. Standardization is critical to move PGx forward.

    RNAcentral 2021: secondary structure integration, improved sequence search and new member databases

    RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org

    RNAcentral: a hub of information for non-coding RNA sequences

    RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences, collating information on ncRNA sequences of all types from a broad range of organisms. We have recently added a new genome mapping pipeline that identifies genomic locations for ncRNA sequences in 296 species. We have also added several new types of functional annotations, such as tRNA secondary structures, Gene Ontology annotations, and miRNA-target interactions. A new quality control mechanism based on Rfam family assignments identifies potential contamination, incomplete sequences, and more. The RNAcentral database has become a vital component of many workflows in the RNA community, serving as both the primary source of sequence data for academic and commercial groups, as well as a source of stable accessions for the annotation of genomic and functional features. These examples are facilitated by an improved RNAcentral web interface, which features an updated genome browser, a new sequence feature viewer, and improved text search functionality. RNAcentral is freely available at https://rnacentral.org

    Classification and nomenclature of all human homeobox genes

    Background: The homeobox genes are a large and diverse group of genes, many of which play important roles in the embryonic development of animals. Increasingly, homeobox genes are being compared between genomes in an attempt to understand the evolution of animal development. Despite their importance, the full diversity of human homeobox genes has not previously been described. Results: We have identified all homeobox genes and pseudogenes in the euchromatic regions of the human genome, finding many unannotated, incorrectly annotated, unnamed, misnamed or misclassified genes and pseudogenes. We describe 300 human homeobox loci, which we divide into 235 probable functional genes and 65 probable pseudogenes. These totals include 3 genes with partial homeoboxes and 13 pseudogenes that lack homeoboxes but are clearly derived from homeobox genes. These figures exclude the repetitive DUX1 to DUX5 homeobox sequences of which we identified 35 probable pseudogenes, with many more expected in heterochromatic regions. Nomenclature is established for approximately 40 formerly unnamed loci, reflecting their evolutionary relationships to other loci in human and other species, and nomenclature revisions are proposed for around 30 other loci. We use a classification that recognizes 11 homeobox gene 'classes' subdivided into 102 homeobox gene 'families'. Conclusion: We have conducted a comprehensive survey of homeobox genes and pseudogenes in the human genome, described many new loci, and revised the classification and nomenclature of homeobox genes. The classification scheme may be widely applicable to homeobox genes in other animal genomes and will facilitate comparative genomics of this important gene superclass.

    The ABCs of membrane transporters in health and disease (SLC series): introduction

    The field of transport biology has steadily grown over the past decade and is now recognized as playing an important role in manifestation and treatment of disease. The SLC (solute carrier) gene series has grown to now include 52 families and 395 transporter genes in the human genome. A list of these genes can be found at the HUGO Gene Nomenclature Committee (HGNC) website (see www.genenames.org/genefamilies/SLC). This special issue features mini-reviews for each of these SLC families written by the experts in each field. The existing online resource for solute carriers, the Bioparadigms SLC Tables (www.bioparadigms.org), has been updated and significantly extended with additional information and cross-links to other relevant databases, and the nomenclature used in this database has been validated and approved by the HGNC. In addition, the Bioparadigms SLC Tables functionality has been improved to allow easier access by the scientific community. This introduction includes: an overview of all known SLC and "non-SLC" transporter genes; a list of transporters of water soluble vitamins; a summary of recent progress in the structure determination of transporters (including GLUT1/SLC2A1); roles of transporters in human diseases and roles in drug approval and pharmaceutical perspectives.