52 research outputs found

Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment

Author: C Lemieux
CR Woese
D Sankoff
DH Moreira
E Chatton
E Mayr
E Pennisi
F Tekaia
WM Fitch
GI McFadden
GW Stuart
J Adachi
J De Las Rivas
J Lin
J Qi
J.Q. Deng
JA Eisen
JD Palmer
JR Brown
K.H. Chu
KH Chu
L.Q. Zhou
M Li
M Turmel
MA Ragan
MW Gray
N Saitou
O Weiss
RF Doolittle
RL Charlebois
RS Gupta
S.C. Long
ST Fitz-Gibbon
SV Edwards
V.V. Anh
VL Stirewalt
W Martin
Z.G. Yu
ZG Yu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005
The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis using compositional statistics for all protein sequences from complete genomes. This new method is conceptually simpler than, and computationally as fast as, the one proposed by Qi et al. (2004b) and Chu et al. (2004). The same data sets used in Qi et al. (2004b) and Chu et al. (2004) are analyzed using the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' tree of life based on 16S rRNA comparison in a predominant majority of basic branchings and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution.
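As a rough illustration of what an alignment-free, composition-based distance can look like, the sketch below builds a frequency vector over all length-k amino acid strings for each genome and compares two vectors by a cosine-based distance. This is a generic k-mer composition sketch, not the method of the paper: the abstract does not give its formula, and the function names, the choice k=2, the distance definition, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_vector(proteins, k=2):
    """Relative frequency of every length-k string over all proteins of one genome.

    k-mers containing non-standard letters are counted in the total
    but get no entry in the returned vector (an assumption of this sketch).
    """
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values()) or 1
    keys = ("".join(p) for p in product(AMINO_ACIDS, repeat=k))
    return {key: counts[key] / total for key in keys}


def composition_distance(u, v):
    """(1 - cosine similarity) / 2; identical compositions give 0."""
    dot = sum(u[key] * v[key] for key in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    if norm == 0.0:
        raise ValueError("empty composition vector")
    return (1.0 - dot / norm) / 2.0


# Toy, made-up protein sets standing in for two genomes.
genome_a = ["MKVLAAGIVGLLLA", "MKKLLVAAG"]
genome_b = ["MSTNPKPQRKTKRN", "MDDEEWYHHC"]
d_ab = composition_distance(composition_vector(genome_a), composition_vector(genome_b))
```

A matrix of such pairwise distances could then be passed to any distance-based tree-building routine, for example neighbor joining.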

Crossref

Queensland University of Technology ePrints Archive

Can Clustal-style progressive pairwise alignment of multiple sequences be used in RNA secondary structure prediction?

Author: Amelia B Bellamy-Royds
B Masoumi
D Gautheret
D Mathews
D Sankoff
DH Mathews
G Pavesi
G Storz
IL Hofacker
J Felsenstein
J Gorodkin
JD Thompson
JH Havgaard
JJ Cannone
KJ Doshi
M Anwar
M Sprinzl
M Zuker
Marcel Turcotte
P Rice
PP Gardner
R Gutell
RD Dowell
Publication venue: BioMed Central
Publication date: 01/01/2007
Background: In ribonucleic acid (RNA) molecules whose function depends on their final, folded three-dimensional shape (such as those in ribosomes or spliceosome complexes), the secondary structure, defined by the set of internal basepair interactions, is more consistently conserved than the primary structure, defined by the sequence of nucleotides. Results: The research presented here investigates the possibility of applying a progressive, pairwise approach to the alignment of multiple RNA sequences by simultaneously predicting an energy-optimized consensus secondary structure. We take an existing algorithm for finding the secondary structure common to two RNA sequences, Dynalign, and alter it to align profiles of multiple sequences. We then explore the relative successes of different approaches to designing the tree that will guide progressive alignments of sequence profiles to create a multiple alignment and prediction of conserved structure. Conclusion: We have found that applying a progressive, pairwise approach to the alignment of multiple ribonucleic acid sequences produces highly reliable predictions of conserved basepairs, and we have shown how these predictions can be used as constraints to improve the results of a single-sequence structure prediction algorithm. However, we have also discovered that the amount of detail included in a consensus structure prediction is highly dependent on the order in which sequences are added to the alignment (the guide tree), and that if a consensus structure does not have sufficient detail, it is less likely to provide useful constraints for the single-sequence method.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Directed acyclic graph kernels for structural RNA analysis

Author: B Knudsen
B Schölkopf
CB Do
D Haussler
D Sankoff
DB Searls
DM Tax
E Rivas
EK Freyhult
H Kiryu
H Saigo
I Holmes
IL Hofacker
J Hertel
JD Thompson
JS McCaskill
JS Pedersen
JW Brown
K Sato
Kengo Sato
Kiyoshi Asai
MA Rosenblad
P Pacheco
RD Dowell
RE Fan
RJ Klein
S Washietl
S Will
SR Eddy
T Babak
T Kin
Toutai Mituyama
W Deng
Y Sakakibara
Yasubumi Sakakibara
Publication venue: BioMed Central
Publication date: 01/01/2008
Background: Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results: We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion: Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evolution through segmental duplications and losses : A Super-Reconciliation approach

Author: A Bergeron
A Deepak
A Tofigh
AA Abbasi
AV Aho
B Moret
B Vernot
C Chauve
C Semple
CM Zmasek
CW Stevens
David Sankoff
DEK Ferrier
E Tannier
G Bourque
G Brightwell
G Fertin
G Pruesse
G Sundstrom
GJ Szöllősi
I Holyer
J Garcia-Fernàndez
J Ma
J Paszek
J Sjöstrand
JD Thompson
JP Doyon
LX Zhang
M Constantinescu
M Goodman
M Hafeez
M Lafond
MP Ng
MS Bansal
N El-Mabrouk
O Akerborg
R Chaudhary
R Dondi
S Bérard
S Dreborg
S Kumar
TA Larsson
W Ajmal
Y Anselmetti
YC Wu
Publication venue: Springer Science and Business Media LLC
Publication date: 26/05/2020
The classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes an independent evolution for each family. While this assumption is reasonable for genes that are far apart in the genome, it is not appropriate for genes grouped into syntenic blocks, which are more plausibly the result of a concerted evolution. Here, we introduce the Super-Reconciliation problem, which consists in inferring a history of segmental duplication and loss events (involving a set of neighboring genes) leading to a set of present-day syntenies from a single ancestral one. In other words, we extend the traditional Duplication-Loss reconciliation problem of a single gene tree to a set of trees, accounting for segmental duplications and losses. The existence of a Super-Reconciliation depends on individual gene tree consistency. In addition, ignoring rearrangements implies that existence also depends on gene order consistency. We first show that the problem of reconstructing a most parsimonious Super-Reconciliation, if any, is NP-hard and give an exact exponential-time algorithm to solve it. Alternatively, we show that accounting for rearrangements in the evolutionary model, but still only minimizing segmental duplication and loss events, leads to an exact polynomial-time algorithm. We finally assess the time efficiency of the former exponential-time algorithm for the Duplication-Loss model on simulated datasets, and give a proof of concept on the opioid receptor genes.

Crossref

University of East Anglia digital repository

Diagnostic and prognostic value of neutrophil gelatinase-associated lipocalin, matrix metalloproteinase-9, and tissue inhibitor of matrix metalloproteinases-1 for sepsis in the Emergency Department: an observational study

Author: A DI Carlo
Chunsheng Li
D Fliser
D Katagiri
D Mühl
DH Goetz
Guijuan Dong
H Kim
H Nagase
I Vanlaere
J Martensson
JB Cowland
JD Sankoff
K Mori
KL Becker
KM Schmidt-Ott
L Lorente
M Haase
MC Strehlow
MD Howell
Miaomiao Wang
MM Levy
NI Shapiro
P Devarajan
Q Zhang
Qian Zhang
RP Dellinger
S Kibe
SM Bagshaw
SR Maitra
T Berger
TH Flo
U Hoffman
U Hoffmann
V Barresi
Xin Zhao
Publication venue: Springer Science and Business Media LLC

Crossref

MACSE: Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons

Author: A Löytynoja
B Chevreux
B Morgenstern
C Notredame
CNS Pedersen
D Huchon
D Przybylski
D Sankoff
D Zheng
DG Higgins
E Dermitzakis
Emmanuel J. P. Douzery
F Abascal
F Delsuc
Frédéric Delsuc
H Philippe
H Zhao
J Hein
J Kececioglu
J Raes
JD Thompson
K Katoh
KM Wong
L Arvestad
L Salmela
M Dayhoff
M Gouy
M Kircher
M Margulies
M Suyama
MT Gilbert
N Galtier
OR Bininda-Emonds
P Sneath
PJ Farabaugh
R Wernersson
RC Edgar
RK Bradley
RR Stocsits
RW Meredith
S Henikoff
S Needleman
SF Altschul
SS Steiger
Sébastien Harispe
T Smith
TA Demere
TJ Hubbard
TJ Wheeler
V Ranwez
Vincent Ranwez
William J. Murphy
X Guan
X Huang
Y Van de Peer
Publication venue: Public Library of Science
Publication date: 01/01/2011
Until now, the most efficient solution for aligning nucleotide sequences containing open reading frames was to use indirect procedures that align the amino acid translations before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes the use of such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment.
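The second pitfall can be seen in a few lines of code. The sketch below translates a short coding sequence in a single fixed reading frame, then translates the same sequence with one extra nucleotide inserted; everything downstream of the insertion changes. It uses the standard genetic code; the toy sequence and the function name `translate` are made up for illustration and are not taken from MACSE.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate in frame 0 from start to end, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "ATGGCCAAGTTT"                    # toy sequence
shifted = original[:3] + "C" + original[3:]  # one extra nucleotide after the first codon

print(translate(original))  # MAKF
print(translate(shifted))   # MRQV
```

Only the first residue survives the insertion, so an alignment built on the two translations would place unrelated residues in the same columns.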

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Capillary lactate as a tool for the triage nurse among patients with SIRS at emergency department presentation: a preliminary report

Author: A Meregalli
AE Jones
AF Dugas
B Levy
CB Moomey Jr
E Rivers
HB Nguyen
J Levraut
JD Sankoff
L Seoane
L Wolf
LJ Kaplan
M Gatien
MD Howell
NI Shapiro
RC Bone
RP Dellinger
S Trzeciak
SM Perman
T Berger
TC Jansen
Publication venue: Springer Science and Business Media LLC

Crossref

A Lagrangian relaxation approach for the multiple sequence alignment problem

Author: A Caprara
A Delcher
A Lucena
AR Subramanian
C Notredame
D Eppstein
D Gusfield
D Lipman
D Sankoff
E Althaus
Ernst Althaus
H Carrillo
I Elias
J Beasley
JD Thompson
K Katoh
K Mehlhorn
K Reinert
M Fisher
M Garey
M Held
M Lermen
RC Edgar
S Gupta
Stefan Canzar
Publication venue: Springer Science and Business Media LLC

Crossref

Large-Scale Phylogenetic Analysis of Emerging Infectious Diseases

Author: A Moilanen
A Phillips
A Tehler
AR Lemmon
B Budowle
B Chang
B Grenfell
B Rannala
BD Redelings
BE Martina
C Ceron
C Scholtissek
D Earn
D Franz
D Janies
D Morrison
D Pol
D Sankoff
D Searls
DJ Zwickl
DL Swofford
DM Hillis
E Ghedin
E Holmes
E Ukkonen
EM Rubin
G Laver
H Song
J Antonovics
J Felsenstein
J Huelsenbeck
J Plotkin
J Silvertown
J Thornton
JD Thompson
JK Taubenberger
JL Thorne
JP Carulli
JS Farris
K Li
K Ungchusak
KC Nixon
KP White
L Wang
L Watrous
LA Salter
LH Taylor
LR Foulds
M Gammelin
M Gibbs
M Koopmans
M Metzker
MA Charleston
MA Marra
MD Hendy
MJ Brauer
N Saitou
NM Ferguson
P Palese
PA Goloboff
PA Rota
PO Lewis
Q Wang
R Fleissner
RG Webster
RM Bush
RS Ross
S Lau
S Li
S Morse
S Poe
T Fanning
T Grant
T Ksiazek
The Chinese SARS Molecular Epidemiology Consortium
W Hennig
W Li
W Wheeler
WC Wheeler
WM Fitch
Y Guan
Y Lin
Y Suzuki
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008
Microorganisms that cause infectious diseases present critical issues of national security, public health, and economic welfare. For example, in recent years, highly pathogenic strains of avian influenza have emerged in Asia, spread through Eastern Europe, and threaten to become pandemic. As demonstrated by the coordinated response to Severe Acute Respiratory Syndrome (SARS) and influenza, agents of infectious disease are being addressed via large-scale genomic sequencing. The goal of genomic sequencing projects is to rapidly put large amounts of data in the public domain to accelerate research on disease surveillance, treatment, and prevention. However, our ability to derive information from large comparative genomic datasets lags far behind acquisition. Here we review the computational challenges of comparative genomic analyses, specifically sequence alignment and reconstruction of phylogenetic trees. We present novel analytical results from two important infectious diseases, Severe Acute Respiratory Syndrome (SARS) and influenza.
SARS and influenza have similarities and important differences both as biological and comparative genomic analysis problems. Influenza viruses (Orthomyxoviridae) are RNA based. Current evidence indicates that influenza viruses originate in aquatic birds from wild populations. Influenza has been studied for decades via well-coordinated international efforts. These efforts center on surveillance via antibody characterization of the hemagglutinin (HA) and neuraminidase (N) proteins of the circulating strains to inform vaccine design. However, we still do not have a clear understanding of: 1) various transmission pathways, such as the role of intermediate hosts such as swine and domestic birds, and 2) the key mutation and genomic recombination events that underlie periodic pandemics of influenza. In the past 30 years, sequence data from HA and N loci have become an important data type. In the past year, full genomic data has become prominent. These data present exciting opportunities to address unanswered questions in influenza pandemics.
SARS is caused by a previously unrecognized lineage of coronavirus, SARS-CoV, which like influenza has an RNA-based genome. Although SARS-CoV is widely believed to have originated in animals, there remains disagreement over the candidate animal source that led to the original outbreak of SARS. In contrast to the long history of the study of influenza, SARS was only recognized in late 2002, and the virus that causes SARS has been documented primarily by genomic sequencing.
In the past, most studies of influenza were performed on a limited number of isolates and genes suited to a particular problem. Major goals in science today are to understand emerging diseases in broad geographic, environmental, societal, biological, and genomic contexts. Synthesizing diverse information brought together by various researchers is important to find out what can be done to prevent future outbreaks {JON03}. Thus comprehensive means to organize and analyze large amounts of diverse information are critical. For example, the relationships of isolates and patterns of genomic change observed in large datasets might not be consistent with hypotheses formed on partial data. Moreover, when researchers rely on partial datasets, they restrict the range of possible discoveries.
Phylogenetics is well suited to the complex task of understanding emerging infectious disease. Phylogenetic analyses can test many hypotheses by comparing diverse isolates collected from various hosts, environments, and points in time and organizing these data into various evolutionary scenarios. The products of a phylogenetic analysis are a graphical tree of ancestor-descendant relationships and an inferred summary of mutations, recombination events, host shifts, and the geographic and temporal spread of the viruses. However, this synthesis comes at a price. The cost of computation of phylogenetic analysis expands combinatorially as the number of isolates considered increases. Thus, large datasets like those currently produced are commonly considered intractable. We address this problem with synergistic development of heuristic tree search strategies and parallel computing.
Affiliation: Janies, D., Ohio State University, United States. Affiliation: Pol, Diego, Ohio State University, United States; Consejo Nacional de Investigaciones Científicas y Técnicas, Argentin
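To give a sense of scale for the combinatorial growth mentioned in the abstract above: a standard result is that n labeled taxa admit (2n-5)!! distinct unrooted binary trees, where !! is the double factorial. The short sketch below computes that count; the function name is made up for illustration, and the abstract itself does not state this formula.

```python
def unrooted_binary_tree_count(n_taxa):
    """Number of distinct unrooted binary trees on n labeled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (the leading factor 1 is implicit).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


for n in (4, 5, 10, 20):
    print(n, unrooted_binary_tree_count(n))
```

Ten taxa already give 2,027,025 candidate trees, which is why exhaustive search is replaced by heuristic search on datasets with many isolates.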

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Tracing the HIV-1 subtype B mobility in Europe: a phylogeographic approach

Author: A Abecasis
A Bobkov
A Sonnerborg
A Stamatakis
AA Nabatov
AC Palma
AJ Brown
AJ Drummond
AM Wensing
AMJ Wensing
Andrea De Luca
Andrzej Horban
Angelos Hatzakis
Anne-Mieke Vandamme
Annemarie MJ Wensing
BG Brenner
BH Hahn
Birgitta Åsjö
C Casado
Carmen de Mendoza
Charles A Boucher
Claudia Balotta
Claudia Kücherer
Claus Nielsen
Clive Loveday
D Descamps
D Paraskevis
D Sankoff
D Vittecoq
Daniel Struck
David A van de Vijver
Dimitrios Paraskevis
Dominique Costagliola
E Lospitao
EC Holmes
Eilidh MacRae
EL Op de Coul
Eline LM Op de Coul
Elisabeth Puchhammer-Stöckl
Enzo Boeri
F Gao
F Lewis
Gkikas Magiorkinis
Guiseppe Angarano
H Skar
I Maljkovic-Berry
I Rinsma
IM Hoepelman
Inge Derdelinckx
J Boni
J Snoeck
J Stanczak
Jan Albert
JB Brunet
JD Thompson
Jean-Claude Schmit
JL Heeney
JM Coffin
K Liitsola
KE Robbins
Klaus Korn
Kristel Van Laethem
Laurence Meyer
Lidia Ruiz
Luc Perrin
M Melbye
M Peeters
M Slatkin
M Violin
Maja Stanojevic
Marie-Laure Chaix
Maurizio Zazzi
Mika O Salminen
ML Chaix
MM Thomson
MP Glauser
MT Gilbert
N Clumeck
Oliver Pybus
Osama Hamouda
P Lemey
PM Sharp
R Machuca
RG Wallace
Ricardo Camacho
RJ Gifford
Rob Schuurman
S Deroo
S Hue
Sabine Yerly
Suzie Coughlan
T Leitner
T Nakano
Thomas Leitner
Vidar Ormaasen
Vincent Soriano
VV Lukashov
WM Fitch
WP Maddison
Zehava Grossman
Publication venue: BioMed Central
Publication date: 01/01/2009
Background: The prevalence and the origin of HIV-1 subtype B, the most prevalent circulating clade among the long-term residents in Europe, have been studied extensively. However, the spatial diffusion of the epidemic from the perspective of the virus has not previously been traced. Results: In the current study we inferred the migration history of HIV-1 subtype B by way of a phylogeography of viral sequences sampled from 16 European countries and Israel. Migration events were inferred from viral phylogenies by character reconstruction using parsimony. With regard to the spatial dispersal of the HIV subtype B sequences across viral phylogenies, in most of the countries in Europe the epidemic was introduced by multiple sources and subsequently spread within local networks. Poland provides an exception, where most of the infections were the result of a single point introduction. According to the significant migratory pathways, we show that there are considerable differences across Europe. Specifically, Greece, Portugal, Serbia and Spain provide sources shedding HIV-1; Austria, Belgium and Luxembourg, on the other hand, are migratory targets, while for Denmark, Germany, Italy, Israel, Norway, the Netherlands, Sweden, Switzerland and the UK we inferred significant bidirectional migration. For Poland no significant migratory pathways were inferred. Conclusion: Subtype B phylogeographies provide a new insight about the geographical distribution of viral lineages, as well as the significant pathways of virus dispersal across Europe, suggesting that intervention strategies should also address tourists, travellers and migrants.
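Parsimony-based character reconstruction, as mentioned in the abstract above, can be sketched with the bottom-up pass of Fitch's algorithm: each tip carries a location label, and every internal node where the two child state sets do not intersect counts as one inferred change (here read as a migration event). This is a minimal sketch under stated assumptions; the abstract does not say which parsimony variant the study used, and the tree, the tip names, the location labels "X" and "Y", and the function name are hypothetical.

```python
def fitch(tree, tip_states):
    """Bottom-up Fitch pass on a rooted binary tree.

    tree: a tip name (str) or a pair (left_subtree, right_subtree).
    tip_states: dict mapping tip name -> location label.
    Returns (candidate state set at this node, minimum number of changes below it).
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # Disjoint child sets: one change is required on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical five-tip phylogeny with made-up sampling locations.
toy_tree = ((("s1", "s2"), "s3"), ("s4", "s5"))
toy_locations = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Y", "s5": "Y"}
root_states, n_changes = fitch(toy_tree, toy_locations)
print(root_states, n_changes)  # {'Y'} 1
```

In this toy case a single change from location Y to location X is enough to explain all tip labels.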