32 research outputs found

    Analysis of 3.5 Million SARS-CoV-2 Sequences Reveals Unique Mutational Trends with Consistent Nucleotide and Codon Frequencies

    Get PDF
    Background: Since the onset of the SARS-CoV-2 pandemic, bioinformatic analyses have been performed to understand the nucleotide and synonymous codon usage features and mutational patterns of the virus. However, comparatively few have attempted to perform such analyses on a considerably large cohort of viral genomes while organizing the plethora of available sequence data for a month-by-month analysis to observe changes over time. Here, we aimed to perform sequence composition and mutation analysis of SARS-CoV-2, separating sequences by gene, clade, and timepoints, and contrast the mutational profile of SARS-CoV-2 to other comparable RNA viruses.Methods: Using a cleaned, filtered, and pre-aligned dataset of over 3.5 million sequences downloaded from the GISAID database, we computed nucleotide and codon usage statistics, including calculation of relative synonymous codon usage values. We then calculated codon adaptation index (CAI) changes and a nonsynonymous/synonymous mutation ratio (dN/dS) over time for our dataset. Finally, we compiled information on the types of mutations occurring for SARS-CoV-2 and other comparable RNA viruses, and generated heatmaps showing codon and nucleotide composition at high entropy positions along the Spike sequence.Results: We show that nucleotide and codon usage metrics remain relatively consistent over the 32-month span, though there are significant differences between clades within each gene at various timepoints. CAI and dN/dS values vary substantially between different timepoints and different genes, with Spike gene on average showing both the highest CAI and dN/dS values. Mutational analysis showed that SARS-CoV-2 Spike has a higher proportion of nonsynonymous mutations than analogous genes in other RNA viruses, with nonsynonymous mutations outnumbering synonymous ones by up to 20:1. However, at several specific positions, synonymous mutations were overwhelmingly predominant.Conclusions: Our multifaceted analysis covering both the composition and mutation signature of SARS-CoV-2 gives valuable insight into the nucleotide frequency and codon usage heterogeneity of SARS-CoV-2 over time, and its unique mutational profile compared to other RNA viruses

    The Genome Assembly Archive: A New Public Resource

    Get PDF
    With the genome assembly archive, it is possible to examine the raw data that underlies the DNA sequence in any sequenced genom

    The Consensus Coding Sequence (Ccds) Project: Identifying a Common Protein-Coding Gene Set for the Human and Mouse Genomes

    Get PDF
    Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.National Human Genome Research Institute (U.S.) (Grant number 1U54HG004555-01)Wellcome Trust (London, England) (Grant number WT062023)Wellcome Trust (London, England) (Grant number WT077198

    Database resources of the National Center for Biotechnology Information

    Get PDF
    In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data retrieval systems and computational resources for the analysis of data in GenBank and other biological data made available through NCBI's website. NCBI resources include Entrez, Entrez Programming Utilities, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov

    Database resources of the National Center for Biotechnology Information

    Get PDF
    In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov

    Lineage-Specific Biology Revealed by a Finished Genome Assembly of the Mouse

    Get PDF
    A finished clone-based assembly of the mouse genome reveals extensive recent sequence duplication during recent evolution and rodent-specific expansion of certain gene families. Newly assembled duplications contain protein-coding genes that are mostly involved in reproductive function

    Iatrogenic intraosseous ganglion following ligament reconstruction and tendon interposition: A case report

    No full text
    Introduction: We present a large intraosseous ganglion related to hardware migration after ligament reconstruction and tendon interposition (LRTI). Case details: A 51-year-old female underwent LRTI after four years of conservative management for basal joint arthritis. The Kirschner wire (“K-wire”) migrated and was freely mobile for approximately 18 days. The K-wire likely traumatized the first metacarpal, interossei, and radial nerve. Diagnostic imaging revealed a ganglion that occupied approximately 80% of the volume of the first metacarpal. Wary of additional complications, the patient deferred excisional curettage and bone grafting. At 30 months post-operatively, there was persistent pressure and pain within the first metacarpal bone and slight improvement of carpometacarpal arthritis symptoms. Conclusion: To the best of our knowledge, this is the first report of a large intraosseous ganglion associated with hardware migration after LRTI. In a patient who is a reliable historian and has no secondary gain issues, the development of new or worsening symptoms warrants a thorough evaluation. Careful review of clinical and radiographic evidence may identify a specific cause and help to guide management. Even if patients decline to have further surgery, they may appreciate having a diagnosis—intraosseous ganglion—and not being dismissed as “failed hand surgery syndrome.
    corecore