147 research outputs found

A strain-variable bacteriocin in Bacillus anthracis and Bacillus cereus with repeated Cys-Xaa-Xaa motifs

Author: Haft Daniel H
Publication venue: BioMed Central
Publication date: 01/01/2009

Bacteriocins are peptide antibiotics from ribosomally translated precursors, produced by bacteria often through extensive post-translational modification. Minimal sequence conservation, short gene lengths, and low complexity sequence can hinder bacteriocin identification, even during gene calling, so they are often discovered by proximity to accessory genes encoding maturation, immunity, and export functions. This work reports a new subfamily of putative thiazole-containing heterocyclic bacteriocins. It appears universal in all strains of Bacillus anthracis and B. cereus, but has gone unrecognized because it is always encoded far from its maturation protein operon. Patterns of insertions and deletions among twenty-four variants suggest a repeating functional unit of Cys-Xaa-Xaa

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Orphan SelD proteins and selenium-dependent molybdenum hydroxylases

Author: Haft Daniel H
Self William T
Publication venue: BioMed Central
Publication date: 01/01/2008

Bacterial and Archaeal cells use selenium structurally in selenouridine-modified tRNAs, in proteins translated with selenocysteine, and in the selenium-dependent molybdenum hydroxylases (SDMH). The first two uses both require the selenophosphate synthetase gene, selD. Examining over 500 complete prokaryotic genomes finds selD in exactly two species lacking both the selenocysteine and selenouridine systems, Enterococcus faecalis and Haloarcula marismortui. Surrounding these orphan selD genes, forming bidirectional best hits between species, and detectable by Partial Phylogenetic Profiling vs. selD, are several candidate molybdenum hydroxylase subunits and accessory proteins. We propose that certain accessory proteins, and orphan selD itself, are markers through which new selenium-dependent molybdenum hydroxylases can be found

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

Author: Haft Daniel H
Rusch Douglas B
Selengut Jeremy D
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central

Cell Contact–Dependent Outer Membrane Exchange in Myxobacteria: Genetic Determinants and Mechanism

Author: Bucuvalas Alex
Gerloff Dietlind L.
Haft Daniel H.
Pathak Darshankumar T.
Wall Daniel
Wei Xueming
Publication venue: Public Library of Science
Publication date: 01/01/2012

Biofilms are dense microbial communities. Although widely distributed and medically important, how biofilm cells interact with one another is poorly understood. Recently, we described a novel process whereby myxobacterial biofilm cells exchange their outer membrane (OM) lipoproteins. For the first time we report here the identification of two host proteins, TraAB, required for transfer. These proteins are predicted to localize in the cell envelope; and TraA encodes a distant PA14 lectin-like domain, a cysteine-rich tandem repeat region, and a putative C-terminal protein sorting tag named MYXO-CTERM, while TraB encodes an OmpA-like domain. Importantly, TraAB are required in donors and recipients, suggesting bidirectional transfer. By use of a lipophilic fluorescent dye, we also discovered that OM lipids are exchanged. Similar to lipoproteins, dye transfer requires TraAB function, gliding motility and a structured biofilm. Importantly, OM exchange was found to regulate swarming and development behaviors, suggesting a new role in cell–cell communication. A working model proposes TraA is a cell surface receptor that mediates cell–cell adhesion for OM fusion, in which lipoproteins/lipids are transferred by lateral diffusion. We further hypothesize that cell contact–dependent exchange helps myxobacteria to coordinate their social behaviors

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

A Guild of 45 CRISPR-Associated (Cas) Protein Families and Multiple CRISPR/Cas Subtypes Exist in Prokaryotic Genomes

Author: Daniel H Haft
Emmanuel F Mongodin
Jeremy Selengut
Jonathan A Eisen
Karen E Nelson
Publication venue: Public Library of Science
Publication date: 01/11/2005

Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21–37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer “immunity” against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes

Author: Davidsen Tanja
Ganapathy Anurhada
Gwinn-Giglio Michelle
Haft Daniel H.
Nelson William C.
Richter Alexander R.
Selengut Jeremy D.
White Owen
Publication venue: Oxford University Press
Publication date: 06/12/2006

TIGRFAMs is a collection of protein family definitions built to aid in high-throughput annotation of specific protein functions. Each family is based on a hidden Markov model (HMM), where both cutoff scores and membership in the seed alignment are chosen so that the HMMs can classify numerous proteins according to their specific molecular functions. Most TIGRFAMs models describe ‘equivalog’ families, where both orthology and lateral gene transfer may be part of the evolutionary history, but where a single molecular function has been conserved. The Genome Properties system contains a queriable set of metabolic reconstructions, genome metrics and extractions of information from the scientific literature. Its genome-by-genome assertions of whether or not specific structures, pathways or systems are present provide high-level conceptual descriptions of genomic content. These assertions enable comparative genomics, provide a meaningful biological context to aid in manual annotation, support assignments of Gene Ontology (GO) biological process terms and help validate HMM-based predictions of protein function. The Genome Properties system is particularly useful as a generator of phylogenetic profiles, through which new protein family functions may be discovered. The TIGRFAMs and Genome Properties systems can be accessed at and

CiteSeerX

Crossref

PubMed Central

Bioinformatic evidence for a widely distributed, ribosomally produced electron carrier precursor, its maturation proteins, and its nicotinoprotein redox partners

Author: A Benjdia
A Bernal
A Norin
Daniel H Haft
DH Haft
HJ Sofia
J Dischinger
JD Selengut
JJ Meulenberg
JK Yang
JM Kuchenreuther
K Mavromatis
KE Kawulka
M Ibrahim
M Lotierzo
M Perzl
MC Taylor
MJ van der Werf
PR Kensche
PW Van Ophem
R Overbeek
RC Edgar
SF Altschul
SR Piersma
SR Wecksler
SW Lee
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Enzymes in the radical SAM (rSAM) domain family serve in a wide variety of biological processes, including RNA modification, enzyme activation, bacteriocin core peptide maturation, and cofactor biosynthesis. Evolutionary pressures and relationships to other cellular constituents impose recognizable grammars on each class of rSAM-containing system, shaping patterns in results obtained through various comparative genomics analyses. RESULTS: An uncharacterized gene cluster found in many Actinobacteria and sporadically in Firmicutes, Chloroflexi, Deltaproteobacteria, and one Archaeal plasmid contains a PqqE-like rSAM protein family that includes Rv0693 from Mycobacterium tuberculosis. Members occur clustered with a strikingly well-conserved small polypeptide we designate "mycofactocin," similar in size to bacteriocins and PqqA, precursor of pyrroloquinoline quinone (PQQ). Partial Phylogenetic Profiling (PPP) based on the distribution of these markers identifies the mycofactocin cluster, but also a second tier of high-scoring proteins. This tier, strikingly, is filled with up to thirty-one members per genome from three variant subfamilies that occur, one each, in three unrelated classes of nicotinoproteins. The pattern suggests these variant enzymes require not only NAD(P), but also the novel gene cluster. Further study was conducted using SIMBAL, a PPP-like tool, to search these nicotinoproteins for subsequences best correlated across multiple genomes to the presence of mycofactocin. For both the short chain dehydrogenase/reductase (SDR) and iron-containing dehydrogenase families, aligning SIMBAL's top-scoring sequences to homologous solved crystal structures shows signals centered over NAD(P)-binding sites rather than over substrate-binding or active site residues. Previous studies on some of these proteins have revealed a non-exchangeable NAD cofactor, such that enzymatic activity in vitro requires an artificial electron acceptor such as N,N-dimethyl-4-nitrosoaniline (NDMA) for the enzyme to cycle.
CONCLUSIONS: Taken together, these findings suggest that the mycofactocin precursor is modified by the Rv0693 family rSAM protein and other enzymes in its cluster. It becomes an electron carrier molecule that serves in vivo as NDMA and other artificial electron acceptors do in vitro. Subclasses from three different nicotinoprotein families show "only-if" relationships to mycofactocin because they require its presence. This framework suggests a segregated redox pool in which mycofactocin mediates communication among enzymes with non-exchangeable cofactors.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exopolysaccharide-associated protein sorting in environmental organisms: the PEP-CTERM/EpsH system. Application of a novel phylogenetic profiling heuristic

Author: A Bateman
A Krogh
A Ruzin
AP Pugsley
AR Panchenko
AS Juncker
B Thony
BC Berks
CL Santini
D Comfort
DA Rodionov
Daniel H Haft
DH Haft
DJ Studholme
DT Jones
EL Sonnhammer
EM Marcotte
G von Heijne
GE Crooks
H Andersson
H Ogata
H Tettelin
H Ton-That
H Wu
I Uchiyama
Ian T Paulsen
J Drummelsmith
J Pei
JC Venter
JD Bendtsen
JD Peterson
JD Thompson
Jeremy D Selengut
JF Heidelberg
K Tharakaraman
LA Marraffini
LJ McGuffin
M Buck
M Pellegrini
MC Frith
Naomi Ward
RC Edgar
RD Finn
RL Tatusov
SF Altschul
SG Lee
SG Tringe
SP Iyer
T Bae
T Yoshida
TS Mikkelsen
WW Navarre
Y Zong
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Protein translocation to the proper cellular destination may be guided by various classes of sorting signals recognizable in the primary sequence. Detection in some genomes, but not others, may reveal sorting system components by comparison of the phylogenetic profile of the class of sorting signal to that of various protein families. RESULTS: We describe a short C-terminal homology domain, sporadically distributed in bacteria, with several key characteristics of protein sorting signals. The domain includes a near-invariant motif Pro-Glu-Pro (PEP). This possible recognition or processing site is followed by a predicted transmembrane helix and a cluster rich in basic amino acids. We designate this domain PEP-CTERM. It tends to occur multiple times in a genome if it occurs at all, with a median count of eight instances; Verrucomicrobium spinosum has sixty-five. PEP-CTERM-containing proteins generally contain an N-terminal signal peptide and exhibit high diversity and little homology to known proteins. All bacteria with PEP-CTERM have both an outer membrane and exopolysaccharide (EPS) production genes. By a simple heuristic for screening phylogenetic profiles in the absence of pre-formed protein families, we discovered that a homolog of the membrane protein EpsH (exopolysaccharide locus protein H) occurs in a species when PEP-CTERM domains are found. The EpsH family contains invariant residues consistent with a transpeptidase function. Most PEP-CTERM proteins are encoded by single-gene operons preceded by large intergenic regions. In the Proteobacteria, most of these upstream regions share a DNA sequence, a probable cis-regulatory site that contains a sigma-54 binding motif. The phylogenetic profile for this DNA sequence exactly matches that of three proteins: a sigma-54-interacting response regulator (PrsR), a transmembrane histidine kinase (PrsK), and a TPR protein (PrsT). 
CONCLUSION: These findings are consistent with the hypothesis that PEP-CTERM and EpsH form a protein export sorting system, analogous to the LPXTG/sortase system of Gram-positive bacteria, and correlated to EPS expression. It occurs preferentially in bacteria from sediments, soils, and biofilms. The novel method that led to these findings, partial phylogenetic profiling, requires neither global sequence clustering nor arbitrary similarity cutoffs and appears to be a rapid, effective alternative to other profiling methods
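
The profile comparison described in this abstract (a feature's presence or absence across genomes matched against that of protein families) can be illustrated with a minimal sketch. It shows only plain exact-match phylogenetic profiling, not the partial phylogenetic profiling heuristic the paper introduces. The function name, genome labels, and family labels are all hypothetical and the data are invented for illustration.

```python
from typing import Dict, FrozenSet, List


def families_matching_profile(
    query: FrozenSet[str],
    families: Dict[str, FrozenSet[str]],
) -> List[str]:
    """Return names of families found in exactly the same genomes as the query."""
    return sorted(name for name, genomes in families.items() if genomes == query)


# Toy data: genome and family names are invented, not taken from the paper.
signal_genomes = frozenset({"genome_A", "genome_C", "genome_D"})
family_genomes = {
    "family_1": frozenset({"genome_A", "genome_C", "genome_D"}),
    "family_2": frozenset({"genome_A", "genome_B"}),
    "family_3": frozenset({"genome_C", "genome_D"}),
}

# Families whose presence/absence pattern is identical to the query feature.
matches = families_matching_profile(signal_genomes, family_genomes)
print(matches)
```

Exact set equality is the strictest possible criterion; real profiling methods usually tolerate some mismatches, which this sketch deliberately leaves out.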

Crossref

Directory of Open Access Journals

PubMed Central

Macquarie University ResearchOnline

Life in Hot Carbon Monoxide: The Complete Genome Sequence of Carboxydothermus hydrogenoformans Z-2901

Author: A Scott Durkin
Daniel H Haft
Frank T Robb
Igor B Zhulin
James F Kolonay
Jonathan A Eisen
Juan M Gonzalez
Kristine M Jones
Lauren M Brinkac
Luke E Ulrich
Luke J Tallon
Martin Wu
Qinghu Ren
Ramana Madupu
Robert J Dodson
Sean C Daugherty
Steven A Sullivan
William C Nelson
Publication venue: Public Library of Science
Publication date: 01/11/2005

We report here the sequencing and analysis of the genome of the thermophilic bacterium Carboxydothermus hydrogenoformans Z-2901. This species is a model for studies of hydrogenogens, which are diverse bacteria and archaea that grow anaerobically utilizing carbon monoxide (CO) as their sole carbon source and water as an electron acceptor, producing carbon dioxide and hydrogen as waste products. Organisms that make use of CO do so through carbon monoxide dehydrogenase complexes. Remarkably, analysis of the genome of C. hydrogenoformans reveals the presence of at least five highly differentiated anaerobic carbon monoxide dehydrogenase complexes, which may in part explain how this species is able to grow so much more rapidly on CO than many other species. Analysis of the genome also has provided many general insights into the metabolism of this organism which should make it easier to use it as a source of biologically produced hydrogen gas. One surprising finding is the presence of many genes previously found only in sporulating species in the Firmicutes Phylum. Although this species is also a Firmicutes, it was not known to sporulate previously. Here we show that it does sporulate and because it is missing many of the genes involved in sporulation in other species, this organism may serve as a “minimal” model for sporulation studies. In addition, using phylogenetic profile analysis, we have identified many uncharacterized gene families found in all known sporulating Firmicutes, but not in any non-sporulating bacteria, including a sigma factor not known to be involved in sporulation previously

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

GlyGly-CTERM and Rhombosortase: A C-Terminal Protein Processing Signal in a Many-to-One Pairing with a Rhomboid Family Intramembrane Serine Protease

Author: A Krogh
AH Gaspar
C Meissner
Daniel H. Haft
DH Haft
F Brossier
GE Crooks
JD Bendtsen
JD Selengut
JD Thompson
K Hofmann
K Strisovsky
LG Stevenson
M Freeman
M Shoji
M Zettl
Maureen J. Donlin
MJ Pallen
Neha Varghese
O Schneewind
RC Edgar
RD Finn
S Urban
SH Payne
SJ Callister
SR Eddy
Y Sugano
Z Wu
Publication venue: Public Library of Science
Publication date: 14/12/2011

The rhomboid family of serine proteases occurs in all domains of life. Its members contain at least six hydrophobic membrane-spanning helices, with an active site serine located deep within the hydrophobic interior of the plasma membrane. The model member GlpG from Escherichia coli is heavily studied through engineered mutant forms, varied model substrates, and multiple X-ray crystal studies, yet its relationship to endogenous substrates is not well understood. Here we describe an apparent membrane anchoring C-terminal homology domain that appears in numerous genera including Shewanella, Vibrio, Acinetobacter, and Ralstonia, but excluding Escherichia and Haemophilus. Individual genomes encode up to thirteen members, usually homologous to each other only in this C-terminal region. The domain's tripartite architecture consists of motif, transmembrane helix, and cluster of basic residues at the protein C-terminus, as also seen with the LPXTG recognition sequence for sortase A and the PEP-CTERM recognition sequence for exosortase. Partial Phylogenetic Profiling identifies a distinctive rhomboid-like protease subfamily almost perfectly co-distributed with this recognition sequence. This protease subfamily and its putative target domain are hereby renamed rhombosortase and GlyGly-CTERM, respectively. The protease and target are encoded by consecutive genes in most genomes with just a single target, but far apart otherwise. The signature motif of the Rhombo-CTERM domain, often SGGS, only partially resembles known cleavage sites of rhomboid protease family model substrates. Some protein families that have several members with C-terminal GlyGly-CTERM domains also have additional members with LPXTG or PEP-CTERM domains instead, suggesting there may be common themes to the post-translational processing of these proteins by three different membrane protein superfamilies

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central