Search CORE

65 research outputs found

Studies in biochemistry--steroid synthesis, P-450 enzymology and renal lithogenesis

Author: Selengut Jeremy D. (Jeremy David)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1994
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Chemistry, 1994.Includes bibliographical references.by Jeremy D. Selengut.Ph.D

DSpace@MIT

Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

Author: Haft Daniel H
Rusch Douglas B
Selengut Jeremy D
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Springer - Publisher Connector

PubMed Central

InterPro: the integrative protein signature database

The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or ‘signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total ∼58 000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web interface. Other developments include the provision of non-signature data, such as structural data, in new XML files on our FTP site, as well as the inclusion of matchless UniProtKB proteins in the existing match XML files. The web interface has been extended and now links out to the ADAN predicted protein-protein interaction database and the SPICE and Dasty viewers. The latest public release (v18.0) covers 79.8% of UniProtKB (v14.1) and consists of 16 549 entries. InterPro data may be accessed either via the web address above, via web services, by downloading files by anonymous FTP or by using the InterProScan search software (http://www.ebi.ac.uk/Tools/InterProScan/

RERO DOC Digital Library

TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes

Author: Davidsen Tanja
Ganapathy Anurhada
Gwinn-Giglio Michelle
Haft Daniel H.
Nelson William C.
Richter Alexander R.
Selengut Jeremy D.
White Owen
Publication venue: Oxford University Press
Publication date: 06/12/2006
Field of study

TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe ‘equivalog’ families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered. The TIGRFAMs and Genome Properties systems can be accessed at and

CiteSeerX

Crossref

PubMed Central

InterPro in 2011: new developments in the family and domain prediction database

InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interface

RERO DOC Digital Library

ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process

Author: CJ Stubben
D Barker
D Barker
D Haft
D Szklarczyk
DA Rodionov
Daniel H Haft
DH Haft
DH Haft
DH Haft
EM Marcotte
F Eckstein
F Enault
GV Glazko
H-Y Ou
J Sun
J Wu
J-P Vert
JAG Ranea
JD Selengut
JD Selengut
JD Selengut
Jeremy D Selengut
L Ferrer
M Csurös
M Huynen
M Pellegrini
MA Huynen
Malay K Basu
MS Gelfand
P Pagel
PM Bowers
PR Kensche
PS Dehal
R Jothi
RL Tatusov
S Briesemeister
S Freilich
SR Eddy
SV Date
SV Date
T Blum
T Gaasterland
T Xu
T Yamada
X Brazzolotto
Y Hong
Y Liu
Y Zhou
Z Jiang
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Crossref

Springer - Publisher Connector

PubMed Central

Comparative Genomics of Emerging Human Ehrlichiosis Agents

Anaplasma (formerly Ehrlichia) phagocytophilum, Ehrlichia chaffeensis, and Neorickettsia (formerly Ehrlichia) sennetsu are intracellular vector-borne pathogens that cause human ehrlichiosis, an emerging infectious disease. We present the complete genome sequences of these organisms along with comparisons to other organisms in the Rickettsiales order. Ehrlichia spp. and Anaplasma spp. display a unique large expansion of immunodominant outer membrane proteins facilitating antigenic variation. All Rickettsiales have a diminished ability to synthesize amino acids compared to their closest free-living relatives. Unlike members of the Rickettsiaceae family, these pathogenic Anaplasmataceae are capable of making all major vitamins, cofactors, and nucleotides, which could confer a beneficial role in the invertebrate vector or the vertebrate host. Further analysis identified proteins potentially involved in vacuole confinement of the Anaplasmataceae, a life cycle involving a hematophagous vector, vertebrate pathogenesis, human pathogenesis, and lack of transovarial transmission. These discoveries provide significant insights into the biology of these obligate intracellular pathogens

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Correction: Comparative Genomics of Emerging Human Ehrlichiosis Agents

Crossref

Directory of Open Access Journals

PubMed Central

Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic

Author: A Bateman
A Krogh
A Ruzin
AP Pugsley
AR Panchenko
AS Juncker
B Thony
BC Berks
CL Santini
D Comfort
DA Rodionov
Daniel H Haft
DH Haft
DH Haft
DH Haft
DH Haft
DJ Studholme
DJ Studholme
DT Jones
EL Sonnhammer
EM Marcotte
G von Heijne
GE Crooks
H Andersson
H Ogata
H Tettelin
H Ton-That
H Wu
I Uchiyama
Ian T Paulsen
J Drummelsmith
J Pei
JC Venter
JD Bendtsen
JD Peterson
JD Thompson
Jeremy D Selengut
JF Heidelberg
K Tharakaraman
LA Marraffini
LJ McGuffin
M Buck
M Pellegrini
MC Frith
Naomi Ward
RC Edgar
RD Finn
RL Tatusov
SF Altschul
SG Lee
SG Tringe
SP Iyer
T Bae
T Yoshida
TS Mikkelsen
WW Navarre
Y Zong
Publication venue: BioMed Central
Publication date: 01/01/2006
Field of study

BACKGROUND: Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence. Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families. RESULTS: We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT). CONCLUSION: These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods

Crossref

Directory of Open Access Journals

PubMed Central

Research from Macquarie University

Genomic insights to SAR86, an abundant and uncultivated marine bacterial lineage

Author: A Dufresne
A Hasona
A Raghunathan
A Stamatakis
Aaron L Halpern
AB Roy
AH Treusch
BAS Van Mooy
BAS Van Mooy
Chris L Dupont
CL Dupont
Dan H Haft
DB Rusch
DB Rusch
DH Haft
DJ Scanlan
Douglas B Rusch
EC Howard
EF DeLong
EP Nawrocki
F Azam
FB Dean
FB Dean
G Sabehi
G Sabehi
H Teeling
HJ Tripp
I Mary
J Craig Venter
J McCarren
JC Venter
Jeremy D Selengut
JI Hedges
JM Gonzalez
Joyclyn Yee-Greenbaum
JR Cole
K Schauer
Kenneth Nealson
KP Williams
L Steinder
M Bauer
M Hess
M Schattenhofer
M Wu
Mark Novotny
Mary-Jane Lombardo
ME Webb
MS Schwalbach
O Beja
O Emanuelsson
P Larsson
R Alexander Richter
R Garcia-Contreras
R Seshadri
RC Edgar
RD Finn
RM Morris
RM Morris
Robert Friedman
Roger S Lasken
RR Malmstrom
RS Lasken
Ruben Valas
S Blanvillian
S Yooseph
SF Nishino
Shibu Yooseph
SJ Giovannoni
T Davidsen
T Thomas
T Woyke
TB Britschgi
TD Mullins
TM Schmidt
Y Moriya
Y Shi
Publication venue: Nature Publishing Group
Publication date
Field of study

Bacteria in the 16S rRNA clade SAR86 are among the most abundant uncultivated constituents of microbial assemblages in the surface ocean for which little genomic information is currently available. Bioinformatic techniques were used to assemble two nearly complete genomes from marine metagenomes and single-cell sequencing provided two more partial genomes. Recruitment of metagenomic data shows that these SAR86 genomes substantially increase our knowledge of non-photosynthetic bacteria in the surface ocean. Phylogenomic analyses establish SAR86 as a basal and divergent lineage of γ-proteobacteria, and the individual genomes display a temperature-dependent distribution. Modestly sized at 1.25–1.7 Mbp, the SAR86 genomes lack several pathways for amino-acid and vitamin synthesis as well as sulfate reduction, trends commonly observed in other abundant marine microbes. SAR86 appears to be an aerobic chemoheterotroph with the potential for proteorhodopsin-based ATP generation, though the apparent lack of a retinal biosynthesis pathway may require it to scavenge exogenously-derived pigments to utilize proteorhodopsin. The genomes contain an expanded capacity for the degradation of lipids and carbohydrates acquired using a wealth of tonB-dependent outer membrane receptors. Like the abundant planktonic marine bacterial clade SAR11, SAR86 exhibits metabolic streamlining, but also a distinct carbon compound specialization, possibly avoiding competition

Crossref

PubMed Central