
4 research outputs found

Location of the Catalytic Site for Phosphoenolpyruvate Formation within the Primary Structure of Clostridium symbiosum Pyruvate Phosphate Dikinase. 2. Site-Directed Mutagenesis of an Essential Arginine Contained within an Apparent P-Loop

Author: Debra Dunaway-Mariano
Linda Yankie
Yuan Xu
Publication venue: 'American Chemical Society (ACS)'

Crossref

Location of the Catalytic Site for Phosphoenolpyruvate Formation within the Primary Structure of Clostridium symbiosum Pyruvate Phosphate Dikinase. 1. Identification of an Essential Cysteine by Chemical Modification with [1-14C]Bromopyruvate and Site-Directed Mutagenesis

Author: Brian M. Martin
Debra Dunaway-Mariano
Li Shen
Linda Yankie
Patrick S. Mariano
Young-Shik Jung
Yuan Xu
Publication venue: 'American Chemical Society (ACS)'

Crossref

VADR: validation and annotation of virus sequence submissions to GenBank

Author: Brister J. Rodney
Hatcher Eneida L.
Karsch-Mizrachi Ilene
Nawrocki Eric P.
Schäffer Alejandro A.
Shonkwiler Lara
Yankie Linda
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2019

Background: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions.

Results: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of “alerts” that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank’s submission processing pipeline, allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally.

Conclusion: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.

DSpace@MIT

The completion of the Mammalian Gene Collection

Author: Astashyn Alex
Bonner Tom I.
Brown Garth
Buetow Kenneth H.
Collins Francis S.
Derge Jeffrey G.
Farrell Catherine
Feingold Elise A.
Gerhard Daniela S.
Good Peter J.
Hart Jennifer
Jang Wonhee
Landrum Melissa
Lewis Jeanne
Maidak Bonnie L.
Mandich Allison
Misquitta Leonie
Murphy Michael
Murphy Terence
Phan Lon
Rajput Bhanu
Rasooly Rebekah
Riddick Lillian
Robinson Cristen
Schaefer Carl F.
Shenmen Carolyn M.
Shoaf Debonny
Temple Gary
Wagner Lukas
Ward Ming
Yankie Linda
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/12/2009

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.

University of Queensland eSpace