11 research outputs found

16S alignment file

Author: Christopher S. Henry (126951)
Fangfang Xia (126959)
Janaka N. Edirisinghe (506693)
Neal Conrad (3201579)
Pamela Weisenhorn (2933172)
Rick L. Stevens (126957)
Ross Overbeek (11341)
Publication venue
Publication date: 10/08/2016
Field of study

Sequence alignment file used to generate the initial master tree. From this master tree, the microbial life tree used in this study (16S OTU98.5) was generated with a distance-based clustering algorithm.
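The record does not say which distance-based clustering algorithm was used. As a rough illustration only, the sketch below shows one common approach, greedy centroid clustering on pairwise p-distance. It assumes that "OTU98.5" means a 98.5% identity threshold (a maximum distance of 0.015); the function names `p_distance` and `greedy_otus` and the toy sequences are hypothetical and are not taken from the dataset.

```python
def p_distance(a, b):
    """Fraction of differing positions, ignoring columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def greedy_otus(aligned, max_dist=0.015):
    """Group aligned sequences into clusters around greedily chosen centroids.

    aligned: dict mapping sequence name -> aligned sequence (equal lengths).
    Each sequence joins the first centroid within max_dist, otherwise it
    becomes a new centroid. Returns a list of clusters (lists of names).
    """
    centroids = []
    clusters = {}
    for name, seq in aligned.items():
        for centroid in centroids:
            if p_distance(seq, aligned[centroid]) <= max_dist:
                clusters[centroid].append(name)
                break
        else:
            centroids.append(name)
            clusters[name] = [name]
    return [clusters[c] for c in centroids]


# Toy input (invented): s2 differs from s1 at 2 of 200 sites, s3 at 10 of 200.
toy = {
    "s1": "A" * 200,
    "s2": "A" * 198 + "CC",
    "s3": "A" * 190 + "C" * 10,
}
otus = greedy_otus(toy)
```

Greedy clustering depends on input order, so a real pipeline would typically sort sequences (for example by abundance or length) before clustering.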

Dryad Digital Repository (Duke University)

FigShare

Microbial life tree (16S OTU98.5)

Author: Christopher S. Henry (126951)
Fangfang Xia (126959)
Janaka N. Edirisinghe (506693)
Neal Conrad (3201579)
Pamela Weisenhorn (2933172)
Rick L. Stevens (126957)
Ross Overbeek (11341)
Publication venue
Publication date: 10/08/2016
Field of study

Phylogenetic tree (nwk format) used to show pathway conservation analysis on central metabolism.
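For readers unfamiliar with the nwk (Newick) format, a tree is written as nested parentheses with optional labels and branch lengths. The minimal sketch below extracts leaf labels from such a string. It assumes unquoted labels and no bracketed comments; the function name `leaf_names` and the example tree are hypothetical and unrelated to the deposited file.

```python
import re


def leaf_names(newick):
    """Return the leaf labels of a simple Newick string, left to right.

    A leaf label directly follows "(" or "," and ends at ":", ",", ")" or ";".
    Internal-node labels follow ")" and are therefore skipped.
    """
    return re.findall(r"(?<=[(,])\s*([^\s(),:;]+)", newick)


# Invented example: four leaves, one labeled internal node (n1).
example = "((A:0.1,B:0.2)n1:0.3,(C:0.4,D:0.5):0.6);"
leaves = leaf_names(example)
```

For real analyses a dedicated phylogenetics library is a better choice than a regular expression, since full Newick allows quoted labels and comments.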

Dryad Digital Repository (Duke University)

FigShare

SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models

Author: Bruce Parrello (42723)
Christopher S. Henry (126951)
Fangfang Xia (126959)
Gary J. Olsen (126954)
Gordon D. Pusch (11370)
Ramy K. Aziz (79570)
Rick L. Stevens (126957)
Robert A. Edwards (41625)
Robert Olson (11367)
Ross Overbeek (11341)
Scott Devoid (126948)
Terrence Disz (42714)
Veronika Vonstein (11378)
Publication venue
Publication date: 24/10/2012
Field of study

The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts.
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.

Directory of Open Access Journals

PubMed Central

FigShare

Architecture of the SEED servers.

Author: Bruce Parrello (42723)
Christopher S. Henry (126951)
Fangfang Xia (126959)
Gary J. Olsen (126954)
Gordon D. Pusch (11370)
Ramy K. Aziz (79570)
Rick L. Stevens (126957)
Robert A. Edwards (41625)
Robert Olson (11367)
Ross Overbeek (11341)
Scott Devoid (126948)
Terrence Disz (42714)
Veronika Vonstein (11378)
Publication venue
Publication date
Field of study

The client packages (currently available for Perl or Java) handle the HTTP requests and responses, and parse the data from the appropriate lightweight data exchange formats to data structures. The four servers access the SEED data.

FigShare

Processing ids_to_sequences.

Author: Bruce Parrello (42723)
Christopher S. Henry (126951)
Fangfang Xia (126959)
Gary J. Olsen (126954)
Gordon D. Pusch (11370)
Ramy K. Aziz (79570)
Rick L. Stevens (126957)
Robert A. Edwards (41625)
Robert Olson (11367)
Ross Overbeek (11341)
Scott Devoid (126948)
Terrence Disz (42714)
Veronika Vonstein (11378)
Publication venue
Publication date
Field of study

(a) The ids_to_sequences function call accepts multiple IDs as an argument and uses the Sapling server to process the calls. The results are returned as a single table. (b) A detailed description of each call (in this example, ids_to_sequences) is provided online and is automatically generated from the entity-relationship models shown in Figure 2 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0048053#pone-0048053-g002).
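The batching pattern described in (a), many IDs in one call and one table back, can be sketched as follows. This is not the SEED client (which the listing says is available for Perl or Java) and makes no network request: the function name `ids_to_sequences_sketch`, the in-memory store, and all IDs and sequences are hypothetical stand-ins.

```python
# Hypothetical in-memory stand-in for the server-side data; the IDs and
# sequences below are invented for illustration only.
FAKE_STORE = {
    "id_001": "MKTAYIAK",
    "id_002": "MSLNQ",
    "id_003": "MVDRE",
}


def ids_to_sequences_sketch(ids, store=FAKE_STORE):
    """Resolve many IDs in one call and return a single table (ID -> sequence).

    IDs that are not found are simply left out of the table.
    """
    return {seq_id: store[seq_id] for seq_id in ids if seq_id in store}


# One call for three IDs; "id_999" is unknown and is omitted from the result.
table = ids_to_sequences_sketch(["id_001", "id_003", "id_999"])
for seq_id, sequence in table.items():
    print(seq_id, sequence, sep="\t")
```

Passing all IDs in a single call, rather than one call per ID, is what lets a remote service answer with one round trip.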

FigShare

Standardized Metadata for Human Pathogen/Vector Genomic Sequences

Author: Alison Yao (22136)
Brett E. Pickett (579375)
Bruce Birren (18146)
Bruno Sobral (39558)
Cheryl I. Murphy (579388)
Christian J. Stoeckert Jr (579377)
Christina A. Cuomo (122212)
Claire Fraser (2626)
Dan E. Sullivan (579378)
Daniel E. Neafsey (224044)
David Rasko (579390)
David S. Roos (9256)
David Wentworth (179765)
Doyle V. Ward (124675)
Elizabet Caler (579380)
Emmanuel F. Mongodin (117741)
Erin Hine (89046)
Eun Mi Lee (304021)
Frank H. Collins (34131)
Garry Myers (98421)
Gloria I. Giraldo-Calderón (186604)
Hervé Tettelin (2850)
Ilene Karsch-Mizrachi (579386)
Indresh Singh (579379)
Jennifer Wortman (59749)
Jessica C. Kissinger (14009)
Jie Zheng (31208)
Joana C. Silva (81113)
Julia Puzak (579389)
Julie Dunning Hotopp (579385)
Karen E. Nelson (123396)
Lauren Brinkac (466142)
Lisa Sadzewicz (427790)
Luke Tallon (579392)
Lynn M. Schriml (579376)
Maria Giovanni (579384)
Mark Eppinger (78034)
Matthew R. Henn (103220)
Michael Feldgarden (579383)
Omar S. Harb (222220)
Owen White (189)
Punam Mathur (579387)
R. Burke Squires (579391)
Rebecca Will (259239)
Richard H. Scheuermann (67327)
Rick L. Stevens (126957)
Ruchi M. Newman (177165)
Scott Durkin (222392)
Scott J. Emrich (579374)
Sinéad Chapman (579381)
Tanya Barrett (57477)
Timothy B. Stockwell (249318)
Valentina Di Francesco (579382)
Vincent M. Bruno (261355)
Vivien G. Dugan (271820)
W. Florian Fricke (145793)
William C. Nierman (81133)
Yun Zhang (131894)
Publication venue
Publication date: 17/06/2014
Field of study

High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards.
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant.

Directory of Open Access Journals

PubMed Central

FigShare

Core Project Attributes.

Author: Alison Yao (22136)
Brett E. Pickett (579375)
Bruce Birren (18146)
Bruno Sobral (39558)
Cheryl I. Murphy (579388)
Christian J. Stoeckert Jr (579377)
Christina A. Cuomo (122212)
Claire Fraser (2626)
Dan E. Sullivan (579378)
Daniel E. Neafsey (224044)
David Rasko (579390)
David S. Roos (9256)
David Wentworth (179765)
Doyle V. Ward (124675)
Elizabet Caler (579380)
Emmanuel F. Mongodin (117741)
Erin Hine (89046)
Eun Mi Lee (304021)
Frank H. Collins (34131)
Garry Myers (98421)
Gloria I. Giraldo-Calderón (186604)
Hervé Tettelin (2850)
Ilene Karsch-Mizrachi (579386)
Indresh Singh (579379)
Jennifer Wortman (59749)
Jessica C. Kissinger (14009)
Jie Zheng (31208)
Joana C. Silva (81113)
Julia Puzak (579389)
Julie Dunning Hotopp (579385)
Karen E. Nelson (123396)
Lauren Brinkac (466142)
Lisa Sadzewicz (427790)
Luke Tallon (579392)
Lynn M. Schriml (579376)
Maria Giovanni (579384)
Mark Eppinger (78034)
Matthew R. Henn (103220)
Michael Feldgarden (579383)
Omar S. Harb (222220)
Owen White (189)
Punam Mathur (579387)
R. Burke Squires (579391)
Rebecca Will (259239)
Richard H. Scheuermann (67327)
Rick L. Stevens (126957)
Ruchi M. Newman (177165)
Scott Durkin (222392)
Scott J. Emrich (579374)
Sinéad Chapman (579381)
Tanya Barrett (57477)
Timothy B. Stockwell (249318)
Valentina Di Francesco (579382)
Vincent M. Bruno (261355)
Vivien G. Dugan (271820)
W. Florian Fricke (145793)
William C. Nierman (81133)
Yun Zhang (131894)
Publication venue
Publication date
Field of study

*Mandatory NCBI BioProject attributes.

FigShare

Semantic Network of the Core Sample Data Fields.

Author: Alison Yao (22136)
Brett E. Pickett (579375)
Bruce Birren (18146)
Bruno Sobral (39558)
Cheryl I. Murphy (579388)
Christian J. Stoeckert Jr (579377)
Christina A. Cuomo (122212)
Claire Fraser (2626)
Dan E. Sullivan (579378)
Daniel E. Neafsey (224044)
David Rasko (579390)
David S. Roos (9256)
David Wentworth (179765)
Doyle V. Ward (124675)
Elizabet Caler (579380)
Emmanuel F. Mongodin (117741)
Erin Hine (89046)
Eun Mi Lee (304021)
Frank H. Collins (34131)
Garry Myers (98421)
Gloria I. Giraldo-Calderón (186604)
Hervé Tettelin (2850)
Ilene Karsch-Mizrachi (579386)
Indresh Singh (579379)
Jennifer Wortman (59749)
Jessica C. Kissinger (14009)
Jie Zheng (31208)
Joana C. Silva (81113)
Julia Puzak (579389)
Julie Dunning Hotopp (579385)
Karen E. Nelson (123396)
Lauren Brinkac (466142)
Lisa Sadzewicz (427790)
Luke Tallon (579392)
Lynn M. Schriml (579376)
Maria Giovanni (579384)
Mark Eppinger (78034)
Matthew R. Henn (103220)
Michael Feldgarden (579383)
Omar S. Harb (222220)
Owen White (189)
Punam Mathur (579387)
R. Burke Squires (579391)
Rebecca Will (259239)
Richard H. Scheuermann (67327)
Rick L. Stevens (126957)
Ruchi M. Newman (177165)
Scott Durkin (222392)
Scott J. Emrich (579374)
Sinéad Chapman (579381)
Tanya Barrett (57477)
Timothy B. Stockwell (249318)
Valentina Di Francesco (579382)
Vincent M. Bruno (261355)
Vivien G. Dugan (271820)
W. Florian Fricke (145793)
William C. Nierman (81133)
Yun Zhang (131894)
Publication venue
Publication date
Field of study

A semantic representation of the entities relevant to describe infectious disease samples based on the OBI and other OBO Foundry ontologies is shown. Distinctions are made between material entities (blue outlines), information entities and qualities (black outlines), and processes (red outlines). Entities are connected by standard semantic relations, shown in italics. The subset of entities selected as Core Sample fields is marked with ovals containing the respective Field ID. For example, the OBI:organism has_quality “Specimen Source Gender” (CS5), which is equivalent to PATO:biological sex, and has_quality PATO:age, and has_quality “Specimen Source Health Status” (CS8), which is equivalent to PATO:organismal status. PATO:age is_quality_measured_as OBI:age since birth measurement datum, which has_measurement_value “Specimen Source Age – Value” (CS6) and has_measurement_unit_label “Specimen Source Age – Unit” (CS7).
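The example relations in this caption can be written down as subject, relation, object triples. The sketch below is one hypothetical encoding (a plain Python list and a helper named `objects`), not a format defined by the standard; the entity and relation names are copied from the caption.

```python
# Each tuple is (subject, relation, object), taken from the caption's example.
TRIPLES = [
    ("OBI:organism", "has_quality", "Specimen Source Gender (CS5)"),
    ("OBI:organism", "has_quality", "PATO:age"),
    ("OBI:organism", "has_quality", "Specimen Source Health Status (CS8)"),
    ("PATO:age", "is_quality_measured_as",
     "OBI:age since birth measurement datum"),
    ("OBI:age since birth measurement datum", "has_measurement_value",
     "Specimen Source Age – Value (CS6)"),
    ("OBI:age since birth measurement datum", "has_measurement_unit_label",
     "Specimen Source Age – Unit (CS7)"),
]


def objects(subject, relation, triples=TRIPLES):
    """Return every object linked to `subject` by `relation`, in listed order."""
    return [o for s, r, o in triples if s == subject and r == relation]


# Follow the chain organism -> age -> measurement datum -> value field.
datum = objects("PATO:age", "is_quality_measured_as")[0]
value_field = objects(datum, "has_measurement_value")[0]
print(value_field)
```

Walking the relations this way shows how a flat field such as CS6 is anchored to the organism through intermediate ontology terms.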

FigShare

Semantic Network of the Core Project Data Fields.

Author: Alison Yao (22136)
Brett E. Pickett (579375)
Bruce Birren (18146)
Bruno Sobral (39558)
Cheryl I. Murphy (579388)
Christian J. Stoeckert Jr (579377)
Christina A. Cuomo (122212)
Claire Fraser (2626)
Dan E. Sullivan (579378)
Daniel E. Neafsey (224044)
David Rasko (579390)
David S. Roos (9256)
David Wentworth (179765)
Doyle V. Ward (124675)
Elizabet Caler (579380)
Emmanuel F. Mongodin (117741)
Erin Hine (89046)
Eun Mi Lee (304021)
Frank H. Collins (34131)
Garry Myers (98421)
Gloria I. Giraldo-Calderón (186604)
Hervé Tettelin (2850)
Ilene Karsch-Mizrachi (579386)
Indresh Singh (579379)
Jennifer Wortman (59749)
Jessica C. Kissinger (14009)
Jie Zheng (31208)
Joana C. Silva (81113)
Julia Puzak (579389)
Julie Dunning Hotopp (579385)
Karen E. Nelson (123396)
Lauren Brinkac (466142)
Lisa Sadzewicz (427790)
Luke Tallon (579392)
Lynn M. Schriml (579376)
Maria Giovanni (579384)
Mark Eppinger (78034)
Matthew R. Henn (103220)
Michael Feldgarden (579383)
Omar S. Harb (222220)
Owen White (189)
Punam Mathur (579387)
R. Burke Squires (579391)
Rebecca Will (259239)
Richard H. Scheuermann (67327)
Rick L. Stevens (126957)
Ruchi M. Newman (177165)
Scott Durkin (222392)
Scott J. Emrich (579374)
Sinéad Chapman (579381)
Tanya Barrett (57477)
Timothy B. Stockwell (249318)
Valentina Di Francesco (579382)
Vincent M. Bruno (261355)
Vivien G. Dugan (271820)
W. Florian Fricke (145793)
William C. Nierman (81133)
Yun Zhang (131894)
Publication venue
Publication date
Field of study

A semantic representation of the entities relevant to describe infectious disease projects based on the OBI and other OBO Foundry ontologies is shown. Distinctions are made between material entities (blue outlines), information entities and qualities (black outlines), and processes (red outlines). Entities are connected by standard semantic relations, shown in italics. The subset of entities selected as Core Project fields is marked with ovals containing the respective Field ID. For example, both the “Project Title” (CP1) and “Project ID” (CP2) denote an OBI:Investigation; the “Project Description” (CP3) is_about the same OBI:Investigation.

FigShare

Core Sample Attributes.

Author: Alison Yao (22136)
Brett E. Pickett (579375)
Bruce Birren (18146)
Bruno Sobral (39558)
Cheryl I. Murphy (579388)
Christian J. Stoeckert Jr (579377)
Christina A. Cuomo (122212)
Claire Fraser (2626)
Dan E. Sullivan (579378)
Daniel E. Neafsey (224044)
David Rasko (579390)
David S. Roos (9256)
David Wentworth (179765)
Doyle V. Ward (124675)
Elizabet Caler (579380)
Emmanuel F. Mongodin (117741)
Erin Hine (89046)
Eun Mi Lee (304021)
Frank H. Collins (34131)
Garry Myers (98421)
Gloria I. Giraldo-Calderón (186604)
Hervé Tettelin (2850)
Ilene Karsch-Mizrachi (579386)
Indresh Singh (579379)
Jennifer Wortman (59749)
Jessica C. Kissinger (14009)
Jie Zheng (31208)
Joana C. Silva (81113)
Julia Puzak (579389)
Julie Dunning Hotopp (579385)
Karen E. Nelson (123396)
Lauren Brinkac (466142)
Lisa Sadzewicz (427790)
Luke Tallon (579392)
Lynn M. Schriml (579376)
Maria Giovanni (579384)
Mark Eppinger (78034)
Matthew R. Henn (103220)
Michael Feldgarden (579383)
Omar S. Harb (222220)
Owen White (189)
Punam Mathur (579387)
R. Burke Squires (579391)
Rebecca Will (259239)
Richard H. Scheuermann (67327)
Rick L. Stevens (126957)
Ruchi M. Newman (177165)
Scott Durkin (222392)
Scott J. Emrich (579374)
Sinéad Chapman (579381)
Tanya Barrett (57477)
Timothy B. Stockwell (249318)
Valentina Di Francesco (579382)
Vincent M. Bruno (261355)
Vivien G. Dugan (271820)
W. Florian Fricke (145793)
William C. Nierman (81133)
Yun Zhang (131894)
Publication venue
Publication date
Field of study

*Mandatory NCBI BioSample attributes in the “Pathogen: clinical or host-associated” version 1.0 package.

FigShare