82 research outputs found

Automated simultaneous analysis phylogenetics (ASAP) : an enabling tool for phlyogenomics

Author: A Rokas
Ernest K Lee
Gloria Coruzzi
Indra Neil Sarkar
J Gatesy
JC Chiu
JE de la Torre
JS Farris
K Nixon
Mary G Egan
PJ Planet
RC Edgar
Rob DeSalle
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

© 2008 Sarkar et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License 2.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The definitive version was published in BMC Bioinformatics 9 (2008): 103, doi:10.1186/1471-2105-9-103.

The availability of sequences from whole genomes to reconstruct the tree of life has the potential to enable the development of phylogenomic hypotheses in ways that have not previously been possible. A significant bottleneck in the analysis of genomic-scale views of the tree of life is the time required for manual curation of genomic data into multi-gene phylogenetic matrices. To keep pace with the exponentially growing volume of molecular data in the genomic era, we have developed an automated technique, ASAP (Automated Simultaneous Analysis Phylogenetics), to assemble these multi-gene/multi-species matrices and to evaluate the significance of individual genes within the context of a given phylogenetic hypothesis. Applications of ASAP may enable scientists to re-evaluate species relationships and to develop new phylogenomic hypotheses based on genome-scale data.

This work is funded in part by NSF DBI-0421604 to GC and RD. INS is supported in part by the Ellison Medical Foundation.
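As a rough illustration of what assembling a multi-gene, multi-species matrix involves, the sketch below concatenates per-gene alignments into one matrix and pads species that lack a gene with a missing-data symbol. It is not the ASAP implementation: the function name `build_supermatrix`, the input layout, and the `?` padding character are assumptions made for this example.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into one multi-gene matrix.

    gene_alignments: dict mapping gene name -> dict of species -> aligned sequence.
    Returns (matrix, partitions), where matrix maps species -> concatenated
    sequence and partitions lists (gene, first_column, last_column), 1-based.
    This is a hypothetical helper, not part of ASAP.
    """
    # Every species seen in any gene gets a row in the final matrix.
    species = sorted({sp for aln in gene_alignments.values() for sp in aln})
    matrix = {sp: "" for sp in species}
    partitions = []
    start = 0
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for sp in species:
            # Species without this gene are padded with the missing symbol.
            matrix[sp] += aln.get(sp, missing * length)
        partitions.append((gene, start + 1, start + length))
        start += length
    return matrix, partitions


# Toy input: two genes, three species, one gene absent in each of two species.
toy_genes = {
    "geneA": {"sp1": "ACGT", "sp2": "ACGA"},
    "geneB": {"sp1": "GG", "sp3": "GC"},
}
toy_matrix, toy_partitions = build_supermatrix(toy_genes)
```

The partition list records which columns came from which gene, which is the kind of bookkeeping needed to assess individual genes against a combined tree.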

Crossref

Woods Hole Open Access Server

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Crystal Structure of Legionella DotD: Insights into the Relationship between Type IVB and Type II/III Secretion Systems

The Dot/Icm type IVB secretion system (T4BSS) is a pivotal determinant of Legionella pneumophila pathogenesis. L. pneumophila translocates more than 100 effector proteins into host cytoplasm using Dot/Icm T4BSS, modulating host cellular functions to establish a replicative niche within host cells. The T4BSS core complex spanning the inner and outer membranes is thought to be made up of at least five proteins: DotC, DotD, DotF, DotG and DotH. DotH is the outer membrane protein; its targeting depends on lipoproteins DotC and DotD. However, the core complex structure and assembly mechanism are still unknown. Here, we report the crystal structure of DotD at 2.0 Å resolution. The structure of DotD is distinct from that of VirB7, the outer membrane lipoprotein of the type IVA secretion system. In contrast, the C-terminal domain of DotD is remarkably similar to the N-terminal subdomain of secretins, the integral outer membrane proteins that form substrate conduits for the type II and the type III secretion systems (T2SS and T3SS). A short β-segment in the otherwise disordered N-terminal region, located on the hydrophobic cleft of the C-terminal domain, is essential for outer membrane targeting of DotH and Dot/Icm T4BSS core complex formation. These findings uncover an intriguing link between T4BSS and T2SS/T3SS.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Single-Nucleotide Polymorphism Genotyping Identifies a Locally Endemic Clone of Methicillin-Resistant Staphylococcus aureus

Author: Andreas Nitsche
Anonymous
B Cookson
B Rubinovitch
B Strommenger
Birgit Strommenger
C Marshall
E Klein
Franziska Layer
G Morelli
H Grundmann
IV Kutyavin
JC Lucet
KE Holt
L Koreen
LK McDougal
LM Schouls
M Dulon
MC Enright
ME de Kraker
ML Metzker
P Neuzil
Paul J. Planet
PJ Dennesen
R Köck
RJ Willems
RL Thompson
RN Gunson
RT Okinaka
S Murchan
SR Harris
U Nübel
Ulrich Nübel
W Witte
Wolfgang Witte
Publication venue: Public Library of Science
Publication date: 01/01/2012

We developed, tested, and applied a TaqMan real-time PCR assay for interrogation of three single-nucleotide polymorphisms that differentiate a clade (termed ‘t003-X’) within the radiation of methicillin-resistant Staphylococcus aureus (MRSA) ST225. The TaqMan assay achieved 98% typeability and results were fully concordant with DNA sequencing. By applying this assay to 305 ST225 isolates from an international collection, we demonstrate that clade t003-X has been endemic in a single acute-care hospital in Germany since at least 2006, where it has caused a substantial proportion of infections. The strain was also detected in another hospital located 16 kilometers away. Strikingly, however, clade t003-X was not found in 62 other hospitals throughout Germany nor among isolates from other countries, and, hence, displayed a very restricted geographical distribution. Consequently, our results show that SNP-typing may be useful to identify and track MRSA clones that are specific to individual healthcare institutions. In contrast, the spatial dissemination pattern observed here had not been resolved by other typing procedures, including multilocus sequence typing (MLST), spa typing, DNA macrorestriction, and multilocus variable-number tandem repeat analysis (MLVA).

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Publikationsserver des Robert Koch-Instituts

The Evolution of the Major Hepatitis C Genotypes Correlates with Clinical Response to Interferon Therapy

Author: A Mangia
A Stamatakis
A Zekry
AB Hill
AL Cox
AR El-Zayadi
AR Lloyd
AS Muerhoff
B Gao
B Kolaczkowski
BT Grenfell
C Combet
C Gaudy
C Kuiken
D Agosti
DB Smith
DJ Zwickl
DL Swofford
DM Hillis
DR Taylor
DS Sikes
E Jaeckel
EC Holmes
EC Verna
EH Sklan
F Dal Pero
F Hasan
F McOmish
FV Chisari
FZ Alfaleh
G Magiorkinis
G Szabo
H Shimodaira
J Felsenstein
J Timm
JD Thompson
Jeffrey S. Glenn
JJ Feld
JQ Han
K Katoh
KC Nixon
L Rubbia-Brandt
M Derbala
M Dimitrova
M Miyamoto
M Salemi
M Sarasin-Filipowicz
M von Wagner
M Gale Jr.
MG Katze
MG Rumi
MJ Gale Jr
ML Shiffman
ML Yu
MP Manns
MW Fried
N Antaki
N Coppola
N Enomoto
N Kato
N Ogata
NA Cannon
O Dalgard
OG Pybus
P Farci
P Munoz de Rueda
P Simmonds
Paul J. Planet
PH Harvey
Phillip S. Pang
PJ Planet
R Aurora
R Bartenschlager
R DeSalle
S Chevaliez
S Zeuzem
SC Ray
Sheila Mary Bowyer
SI Khakoo
SJ Hadziyannis
SL Fishman
SL Pond
SM Kamal
T Berg
T Eriksson
T Kuntzen
T Noguchi
WM Fitch
WP Hofmann
WP Maddison
XS He
Z Chen
Publication venue: Public Library of Science
Publication date: 01/08/2009

Patients chronically infected with hepatitis C virus (HCV) require significantly different durations of therapy and achieve substantially different sustained virologic response rates to interferon-based therapies, depending on the HCV genotype with which they are infected. There currently exists no systematic framework that explains these genotype-specific response rates. Since humans are the only known natural hosts for HCV (a virus that is at least hundreds of years old), one possibility is that over the time frame of this relationship, HCV accumulated adaptive mutations that confer increasing resistance to the human immune system. Given that interferon therapy functions by triggering an immune response, we hypothesized that clinical response rates are a reflection of viral evolutionary adaptations to the immune system. We have performed the first phylogenetic analysis to include all available full-length HCV genomic sequences (n = 345). This resulted in a new cladogram of HCV. This tree establishes for the first time the relative evolutionary ages of the major HCV genotypes. The outcome data from prospective clinical trials that studied interferon and ribavirin therapy were then mapped onto this new tree. This mapping revealed a correlation between genotype-specific responses to therapy and respective genotype age. This correlation allows us to predict that genotypes 5 and 6, for which there currently are no published prospective trials, will likely have intermediate response rates, similar to genotype 3. Ancestral protein sequence reconstruction was also performed, which identified the HCV proteins E2 and NS5A as potential determinants of genotype-specific clinical outcome. Biochemical studies have independently identified these same two proteins as having genotype-specific abilities to inhibit the innate immune factor double-stranded RNA-dependent protein kinase (PKR). An evolutionary analysis of all available HCV genomes supports the hypothesis that immune selection was a significant driving force in the divergence of the major HCV genotypes and that viral factors that acquired the ability to inhibit the immune response may play a role in determining genotype-specific response rates to interferon therapy.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Global Considerations in Hierarchical Clustering Reveal Meaningful Patterns in Data

Author: A Torrente
AK Jain
CF Zorumski
D Boley
D Horn
David Horn
G Getz
G Owsianik
H Chipman
J Handl
J Orlowski
JB Kruskal
Ji Zhu
LK Kaczmarek
M Berridge
M Rune
M Steinbach
MB Eisen
Michal Linial
MS Savaresi
N Kaplan
N Slonim
O Alter
O Sasson
P Cimiano
P D'Haeseleer
P Hansen
PJ Planet
Q Ren
R Apweiler
R Cangelosi
R Sharan
R Varshavsky
RO Duda
Roy Varshavsky
S Altschul
TK Landauer
TR Golub
Y Benjamini
Y Zhao
Publication venue: Public Library of Science
Publication date: 21/05/2008

BACKGROUND: A hierarchy, characterized by tree-like relationships, is a natural method of organizing data in various domains. When considering an unsupervised machine learning routine, such as clustering, a bottom-up hierarchical (BU, agglomerative) algorithm is used as a default and is often the only method applied. METHODOLOGY/PRINCIPAL FINDINGS: We show that hierarchical clustering algorithms that involve global considerations, such as top-down (TD, divisive) or glocal (global-local) algorithms, are better suited to reveal meaningful patterns in the data. This is demonstrated by testing the correspondence between the results of several algorithms (TD, glocal and BU) and the correct annotations provided by experts. The correspondence was tested in multiple domains including gene expression experiments, stock trade records and functional protein families. The performance of each of the algorithms is evaluated by statistical criteria that are assigned to clusters (nodes of the hierarchy tree) based on expert-labeled data. Whereas TD algorithms perform better on global patterns, BU algorithms perform well and are advantageous when finer granularity of the data is sought. In addition, a novel TD algorithm that is based on genuine density of the data points is presented and is shown to outperform other divisive and agglomerative methods. Application of the algorithm to more than 500 protein sequences belonging to ion-channels illustrates the potential of the method for inferring overlooked functional annotations. ClustTree, a graphical Matlab toolbox for applying various hierarchical clustering algorithms and testing their quality, is made available. CONCLUSIONS: Although currently rarely used, global approaches, in particular TD or glocal algorithms, should be considered in the exploratory process of clustering. In general, applying unsupervised clustering methods can leverage the quality of manually created mappings of protein families. As demonstrated, it can also provide insights into erroneous and missed annotations.
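To make the bottom-up versus top-down distinction concrete, here is a minimal sketch on one-dimensional toy data. It is not the density-based TD algorithm from the paper or anything from ClustTree: the single-linkage merge rule, the "split the widest cluster at its largest gap" rule, and both function names are assumptions chosen for brevity.

```python
def agglomerative(points, k):
    """Bottom-up (BU): start from singletons and merge the two closest
    clusters (single linkage) until k clusters remain."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        # In sorted 1-D data the closest pair of clusters is always adjacent.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


def divisive(points, k):
    """Top-down (TD): start from one cluster and repeatedly split the
    widest cluster at its largest internal gap until k clusters exist."""
    clusters = [sorted(points)]
    while len(clusters) < k:
        splittable = [c for c in clusters if len(c) > 1]
        if not splittable:
            break  # only singletons left, nothing more to split
        widest = max(splittable, key=lambda c: c[-1] - c[0])
        gaps = [widest[i + 1] - widest[i] for i in range(len(widest) - 1)]
        cut = gaps.index(max(gaps)) + 1
        idx = clusters.index(widest)
        clusters[idx:idx + 1] = [widest[:cut], widest[cut:]]
    return clusters


toy_points = [1, 2, 3, 10, 11, 12, 30]
bu_clusters = agglomerative(toy_points, 3)
td_clusters = divisive(toy_points, 3)
```

The BU routine only ever looks at the nearest neighbours of a cluster, while the TD routine decides each split from the full extent of a cluster; on well-separated toy data like this both arrive at the same three groups.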

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

An Integrated Approach for Finding Overlooked Genes in Shigella

Author: A Huttenhofer
A Kumar
A Matsuyama
A Sittka
A Toledo-Arana
AG Oglesby
AL Delcher
AR Gruber
AV Lukashin
B Tjaden
BE Suzek
C Mathe
C Pichon
C Wei
C Ye
CL Kingsford
CP Ponting
E Rivas
EC Hobbs
ER Murphy
F Yang
G Padalon-Brauch
G Storz
GM Pupo
H He
H Nie
HC Tsui
HF Oliver
HY Huang
J Livny
J Peng
J Wei
J Yang
Jian Yang
JM Liu
JM Silvaggi
JP Kastenmayer
Junping Peng
K Ohtani
KB Arnvig
KM Wassarman
L Argaman
L David
L Wang
LS Waters
M Giangrossi
ML Bernardini
MR Hemm
N Majdalani
N Perez
NA Grieshaber
P Dam
P Mandin
P Romby
Paul J. Planet
PJ Sansonetti
PJ Wilderman
PP Gardner
Q Jin
Qi Jin
R Kumar
R Sorek
RJ Carter
S Chen
S Saito
S Washietl
T Akama
T Shimizu
T Song
V Pfeiffer
X Wang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: The completion of numerous genome sequences introduced an era of whole-genome study. However, many genes are missed during genome annotation, including small RNAs (sRNAs) and small open reading frames (sORFs). In order to improve genome annotation, we aimed to identify novel sRNAs and sORFs in Shigella, the principal etiologic agents of bacillary dysentery. Methodology/Principal Findings: We identified 64 sRNAs in Shigella, which were experimentally validated in other bacteria based on sequence conservation. We employed computer-based and tiling array-based methods to search for sRNAs, followed by RT-PCR and northern blots, to identify nine sRNAs in Shigella flexneri strain 301 (Sf301) and 256 regions containing possible sRNA genes. We found 29 candidate sORFs using bioinformatic prediction, array hybridization and RT-PCR verification. We experimentally validated 557 (57.9%) DOOR operon predictions in the chromosomes of Sf301 and 46 (76.7%) in the virulence plasmid. We found 40 additional co-expressed gene pairs that were not predicted by DOOR. Conclusions/Significance: We provide an updated and comprehensive annotation of the Shigella genome. Our study increased the expected numbers of sORFs and sRNAs, which will impact future functional genomics and proteomics studies. Our method can be used for large-scale reannotation of sRNAs and sORFs in any microbe with a known genome.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A Differentiation-Based Phylogeny of Cancer Subtypes

Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Identification of the Pangenome and Its Components in 14 Distinct Aggregatibacter actinomycetemcomitans Strains by Comparative Genomic Analysis

Author: AJ van Winkelhoff
AL Delcher
AT Maurelli
B Dogan
C Chen
Casey Chen
D Haubek
DH Fine
G Yue
GS Slater
H Tettelin
J Felsenstein
J Hacker
J Mena
J Slots
JA Eisen
JB Kaplan
JD Rudney
JG Lawrence
JM Brogan
JM DiRienzo
K Poulsen
KP Mintz
M Kilian
M Margulies
MA Larkin
MP Di Bonaventura
N Suzuki
O Fujise
PJ Planet
RA Welch
RL Tatusov
Roger E. Bumgarner
S Asikainen
S Doungudomdacha
S Karlin
Sarah K. Highlander
SC Kachlany
Sirkka Asikainen
TM Lowe
VJ Thomson
W Kittichotirat
WA van der Reijden
Weerayuth Kittichotirat
YT Teng
Publication venue: Public Library of Science
Publication date: 19/07/2011

Aggregatibacter actinomycetemcomitans is genetically heterogeneous and comprises distinct clonal lineages that may have different virulence potentials. However, limited information on the strain-to-strain genomic variations is available. The genome sequences of 11 A. actinomycetemcomitans strains (serotypes a-f) were generated de novo, annotated and combined with three previously sequenced genomes (serotypes a-c) for comparative genomic analysis. Two major groups were identified: serotypes a, d, e, and f, and serotypes b and c. A serotype e strain was found to be distinct from both groups. The size of the pangenome was 3,301 genes, which included 2,034 core genes and 1,267 flexible genes. The number of core genes is estimated to stabilize at 2,060, while the size of the pangenome is estimated to increase by 16 genes with every additional strain sequenced in the future. Within each strain 16.7-29.4% of the genome belonged to the flexible gene pool. Between any two strains 0.4-19.5% of the genomes were different. The genomic differences were occasionally greater for strains of the same serotypes than strains of different serotypes. Furthermore, 171 genomic islands were identified. Cumulatively, 777 strain-specific genes were found on these islands and represented 61% of the flexible gene pool. Substantial genomic differences were detected among A. actinomycetemcomitans strains. Genomic islands account for more than half of the flexible genes. The phenotype and virulence of A. actinomycetemcomitans may not be defined by any single strain. Moreover, the genomic variation within each clonal lineage of A. actinomycetemcomitans (as defined by serotype grouping) may be greater than between clonal lineages. The large genomic data set in this study will be useful to further examine the molecular basis of variable virulence among A. actinomycetemcomitans strains.
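The pangenome, core and flexible gene counts quoted above are related by simple set operations (3,301 = 2,034 + 1,267). The sketch below shows that relationship on toy data, assuming each strain is already reduced to a set of gene-family identifiers; the function name `partition_pangenome` and the toy strains are hypothetical, and the study's own ortholog-clustering step is not reproduced here.

```python
def partition_pangenome(strain_genes):
    """Split the genes of several strains into pangenome, core and flexible sets.

    strain_genes: dict mapping strain name -> set of gene-family identifiers.
    pangenome = genes present in at least one strain (union)
    core      = genes present in every strain (intersection)
    flexible  = pangenome minus core
    """
    gene_sets = list(strain_genes.values())
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pangenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    flexible = pangenome - core
    return pangenome, core, flexible


# Toy data, not taken from the study.
toy_strains = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g2"},
}
pan, core, flexible = partition_pangenome(toy_strains)
# By construction, len(pan) == len(core) + len(flexible).
```

Strain-specific genes, such as the 777 reported on genomic islands, would be the subset of the flexible pool found in exactly one strain.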

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Diversity of 16S-23S rDNA Internal Transcribed Spacer (ITS) Reveals Phylogenetic Relationships in Burkholderia pseudomallei and Its Near-Neighbors

Length polymorphisms within the 16S-23S ribosomal DNA internal transcribed spacer (ITS) have been described as stable genetic markers for studying bacterial phylogenetics. In this study, we used these genetic markers to investigate phylogenetic relationships in Burkholderia pseudomallei and its near-relative species. B. pseudomallei is known as one of the most genetically recombined bacterial species. In silico analysis of multiple B. pseudomallei genomes revealed approximately four homologous rRNA operons and ITS length polymorphisms therein. We characterized ITS distribution using PCR and analyzed it via high-throughput capillary electrophoresis in 1,191 B. pseudomallei strains. Three major ITS types were identified, two of which were commonly found in most B. pseudomallei strains from the endemic areas, whereas the third one was significantly correlated with worldwide sporadic strains. Interestingly, mixtures of the two common ITS types were observed within the same strains, and at a greater incidence in Thailand than Australia, suggesting that genetic recombination causes the ITS variation within species, with greater recombination frequency in Thailand. In addition, the B. mallei ITS type was common to B. pseudomallei, providing further support that B. mallei is a clone of B. pseudomallei. Other B. pseudomallei near-neighbors possessed unique and monomorphic ITS types. Our data shed light on evolutionary patterns of B. pseudomallei and its near-relative species.

Public Library of Science (PLOS)

Crossref

OpenKnowledge@NAU

Directory of Open Access Journals

PubMed Central

Charles Darwin University's Institutional Digital Repository

Culture Enriched Molecular Profiling of the Cystic Fibrosis Airway Microbiome

Author: A D'Onofrio
AD Manganiello
B Kopke
C Quince
CD Sibley
Christina S. Eshaghurshan
Christopher D. Sibley
CJ Ingham
CR Woese
CW Kaplan
D Raoult
DF Gordon
DH Huson
DJ Ecker
DM Ward
E Stackebrandt
EM Bik
F Bittar
FJ Accurso
G Muyzer
GB Rogers
GW Tyson
H Matsui
Harvey R. Rabin
HJ Flint
J Reeder
JC Venter
JG Caporaso
JJ Lipuma
JJ Qin
JK Harris
JL Leake
JR Cole
JR Leadbetter
JS Suchodolski
JT Staley
K Alain
K Tamura
K Zengler
KH Wilson
L Dethlefsen
M Achtman
M Keller
M Kolak
M Konneke
M Margulies
M Sait
M Watve
Margot E. Grinwis
MB Miller
MD Parkins
Michael D. Parkins
Michael G. Surette
MM Tunney
Monica M. Faria
MS Rappe
NR Pace
O Tu
P Hugenholtz
Paul J. Planet
PH Gilligan
PJ Turnbaugh
Q Wang
R Facklam
RD Wolcott
RI Amann
Scot E. Dowd
SE Dowd
SM Finegold
SS Socransky
T Kaeberlein
TR Callaway
Tyler R. Field
TZ DeSantis Jr
UME Schutte
V Klepac-Ceraj
V Kunin
W Li
WE Moore
WEC Moore
WT Liu
Y Huang
Y Kawamura
Publication venue: Public Library of Science
Publication date: 28/07/2011

The microbiome of the respiratory tract, including the nasopharyngeal and oropharyngeal microbiota, is a dynamic community of microorganisms that is highly diverse. The cystic fibrosis (CF) airway microbiome refers to the polymicrobial communities present in the lower airways of CF patients. It is comprised of chronic opportunistic pathogens (such as Pseudomonas aeruginosa) and a variety of organisms derived mostly from the normal microbiota of the upper respiratory tract. The complexity of these communities has been inferred primarily from culture-independent molecular profiling. As with most microbial communities, it is generally assumed that most of the organisms present are not readily cultured. Our culture collection, generated using more extensive cultivation approaches, reveals a more complex microbial community than that obtained by conventional CF culture methods. To directly evaluate the cultivability of the airway microbiome, we examined six samples in depth using culture-enriched molecular profiling, which combines culture-based methods with the molecular profiling methods of terminal restriction fragment length polymorphisms and 16S rRNA gene sequencing. We demonstrate that combining culture-dependent and culture-independent approaches enhances the sensitivity of either approach alone. Our techniques were able to cultivate 43 of the 48 families detected by deep sequencing; the five families recovered solely by culture-independent approaches were all present at very low abundance (<0.002% total reads). 46% of the molecular signatures detected by culture from the six patients were only identified in an anaerobic environment, suggesting that a large proportion of the cultured airway community is composed of obligate anaerobes. Most significantly, using 20 growth conditions per specimen, half of which included anaerobic cultivation and extended incubation times, we demonstrate that the majority of bacteria present can be cultured.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central