248 research outputs found

GUIDANCE: a web server for assessing alignment confidence scores

Author: Castresana
D. Graur
E. Privman
G. Landan
Gatesy
Giribet
H. Ashkenazy
Katoh
Landau
Lassmann
Loytynoja
Neil
Nomaguchi
O. Penn
Poirot
Rambaut
Stoye
T. Pupko
Thompson
Wong
Publication venue: Oxford University Press

Evaluating the accuracy of multiple sequence alignment (MSA) is critical for virtually every comparative sequence analysis that uses an MSA as input. Here we present the GUIDANCE web-server, a user-friendly, open access tool for the identification of unreliable alignment regions. The web-server accepts as input a set of unaligned sequences. The server aligns the sequences and provides a simple graphic visualization of the confidence score of each column, residue and sequence of an alignment, using a color-coding scheme. The method is generic and the user is allowed to choose the alignment algorithm (ClustalW, MAFFT and PRANK are supported) as well as any type of molecular sequences (nucleotide, protein or codon sequences). The server implements two different algorithms for evaluating confidence scores: (i) the heads-or-tails (HoT) method, which measures alignment uncertainty due to co-optimal solutions; (ii) the GUIDANCE method, which measures the robustness of the alignment to guide-tree uncertainty. The server projects the confidence scores onto the MSA and points to columns and sequences that are unreliably aligned. These can be automatically removed in preparation for downstream analyses. GUIDANCE is freely available for use at http://guidance.tau.ac.il
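The idea of scoring columns by their robustness can be sketched as follows: count how often each column of a base alignment reappears, residue for residue, in a set of alternative alignments (for example, ones built from perturbed guide trees). This is a simplified illustration under that assumption, not the server's actual scoring; the function names `column_signatures` and `column_confidence` are hypothetical.

```python
from typing import List, Tuple

# A column is described by the residue index it holds in each sequence,
# with -1 standing for a gap. (Hypothetical representation for this sketch.)
Column = Tuple[int, ...]


def column_signatures(msa: List[str]) -> List[Column]:
    """Describe each alignment column by which residue of each sequence it contains."""
    counters = [0] * len(msa)  # next residue index per sequence
    columns: List[Column] = []
    for col in range(len(msa[0])):
        signature = []
        for row, seq in enumerate(msa):
            if seq[col] == "-":
                signature.append(-1)
            else:
                signature.append(counters[row])
                counters[row] += 1
        columns.append(tuple(signature))
    return columns


def column_confidence(base: List[str], alternatives: List[List[str]]) -> List[float]:
    """Fraction of alternative alignments that reproduce each base column exactly."""
    if not alternatives:
        raise ValueError("at least one alternative alignment is required")
    alt_sets = [set(column_signatures(alt)) for alt in alternatives]
    return [
        sum(sig in alt for alt in alt_sets) / len(alt_sets)
        for sig in column_signatures(base)
    ]


if __name__ == "__main__":
    base = ["AC-G", "A-TG"]
    alternatives = [["AC-G", "A-TG"], ["ACG", "ATG"]]
    print(column_confidence(base, alternatives))
```

Columns that every alternative alignment reproduces score 1.0; columns that depend on a particular gap placement score lower and are the natural candidates for removal.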

Crossref

PubMed Central

TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations

Author: Abascal
Bininda-Emonds
Castresana
Clamp
Edgar
Federico Abascal
Gilbert
Guindon
Katoh
Loytynoja
Maximilian J. Telford
Moretti
Notredame
Panico
Posada
Rafael Zardoya
Rice
Schuerer
Simmons
Simmons
Suyama
Talavera
Thompson
Townsend
Wernersson
Yang
Publication venue: Oxford University Press
Publication date: 01/05/2010

We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk
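The core step of aligning nucleotides according to an amino acid alignment can be sketched as threading codons back onto a gapped protein sequence. The snippet below is a minimal illustration, not the TranslatorX implementation: the function name `back_translate` is hypothetical, and it assumes the nucleotide sequence is in frame, has no stop codon, and contains exactly one codon per aligned residue.

```python
def back_translate(aligned_protein: str, nucleotides: str) -> str:
    """Map an ungapped coding sequence onto a gapped protein alignment row.

    Each residue receives its original codon; each protein gap becomes
    a three-nucleotide gap, so codon boundaries are preserved.
    """
    residues = aligned_protein.replace("-", "")
    if len(nucleotides) != 3 * len(residues):
        raise ValueError("nucleotide length must be 3x the number of residues")
    # Iterate over the coding sequence one codon at a time.
    codons = (nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3))
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_protein)


if __name__ == "__main__":
    print(back_translate("M-K", "ATGAAA"))  # ATG---AAA
```

Because gaps are only ever inserted in multiples of three, the resulting nucleotide alignment never splits a codon.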

Crossref

PubMed Central

UCL Discovery

The genome of the protozoan parasite Cystoisospora suis and a reverse vaccinology approach to identify vaccine candidates

Vaccine development targeting protozoan parasites remains challenging, partly due to the complex interactions between these eukaryotes and the host immune system. Reverse vaccinology is a promising approach for direct screening of genome sequence assemblies for new vaccine candidate proteins. Here, we applied this paradigm to Cystoisospora suis, an apicomplexan parasite that causes enteritis and diarrhea in suckling piglets and economic losses in pig production worldwide. Using Next Generation Sequencing we produced an ∼84 Mb sequence assembly for the C. suis genome, making it the first available reference for the genus Cystoisospora. Then, we derived a manually curated annotation of more than 11,000 protein-coding genes and applied the tool Vacceed to identify 1,168 vaccine candidates by screening the predicted C. suis proteome. To refine the set of candidates, we looked at proteins that are highly expressed in merozoites and specific to apicomplexans. The stringent set of candidates included 220 proteins, among which were 152 proteins with unknown function, 17 surface antigens of the SAG and SRS gene families, 12 proteins of the apicomplexan-specific secretory organelles including AMA1, MIC6, MIC13, ROP6, ROP12, ROP27, ROP32 and three proteins related to cell adhesion. Finally, we demonstrated in vitro the immunogenic potential of a C. suis-specific 42 kDa transmembrane protein, which might constitute an attractive candidate for further testing

Crossref

PubMed Central

Evidence for Centromere Drive in the Holocentric Chromosomes of Caenorhabditis

Author: A Loytynoja
A Loytynoja
AF Dernburg
BA Sullivan
C Goday
Conrad A. Nieduszynski
D Vermaak
DC Shakes
DG Albertson
František Zedek
HS Malik
HS Malik
HS Malik
J Dumont
J Monen
J Niedermaier
J Rutkowska
JM Comeron
JM Jiang
K Kiontke
K Tamura
L Fishman
LD Stein
O Penn
O Penn
PB Talbert
PB Talbert
Petr Bureš
RC Chan
RE Baker
RK Dawe
S Henikoff
SA Surzycki
T Haizel
TW Harris
Z Yang
ZK Cheng
Publication venue: Public Library of Science
Publication date: 01/01/2012

In monocentric organisms with asymmetric meiosis, the kinetochore proteins, such as CENH3 and CENP-C, evolve adaptively to counterbalance the deleterious effects of centromere drive, which is caused by the expansion of centromeric satellite repeats. The selection regimes that act on CENH3 and CENP-C genes have not been analyzed in organisms with holocentric chromosomes, although holocentrism is speculated to have evolved to suppress centromere drive. We tested both CENH3 and CENP-C for positive selection in several species of the holocentric genus Caenorhabditis using the maximum likelihood approach and sliding-window analysis. Although CENP-C did not show any signs of positive selection, positive selection has been detected in the case of CENH3. These results support the hypothesis that centromere drive occurs in Nematoda, at least in the telokinetic meiosis of Caenorhabditis

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

CO-phylum: An Assembly-Free Phylogenomic Approach for Close Related Organisms

Author: Blanchette
Cannon
Chen
Cole
Dalquen
Darling
Domazet-Loso
Edgar
Elias
Foster
Glenn
Hohl
Hu
Huang
Huiguang Yi
Jun
Li
Li Jin
Loytynoja
Ma
Otu
Peterlongo
Qi
Ratan
Saitou
Snel
Stuart
Touchon
Ulitsky
Wagner
Wang
Wiens
Wong
Zhou
Publication venue: Oxford University Press (OUP)
Publication date: 06/04/2011

Phylogenomic approaches developed thus far are either too time-consuming or lack a solid evolutionary basis. Moreover, no phylogenomic approach is capable of constructing a tree directly from unassembled raw sequencing data. A new phylogenomic method, CO-phylum, is developed to alleviate these flaws. CO-phylum can generate a high-resolution and highly accurate tree using complete genomes or unassembled sequencing data of closely related organisms. In addition, the CO-phylum distance is almost linear with p-distance. Comment: 21 pages, 6 figures
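For reference, the p-distance mentioned in this abstract is the proportion of compared sites at which two aligned sequences differ. The sketch below computes it for a pair of aligned strings; skipping sites where either sequence has a gap is an assumption of this example (one common convention), and the function name `p_distance` is hypothetical. It does not reproduce the CO-phylum distance itself.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are excluded from the
    comparison (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(p_distance("ACGT", "ACGA"))  # 0.25
```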

arXiv.org e-Print Archive

CiteSeerX

Crossref

Population gene introgression and high genome plasticity for the zoonotic pathogen Streptococcus agalactiae

Author: Abbott
Abby
Almeida
Baily
Bankevich
Beerli
Beerli
Benjamini
Bertels
Bikard
Bisharat
Bishop
Bohnsack
Borchardt
Brett M Probert
Brochet
Bruen
Brynildsrud
Capella-Gutierrez
Chen
Chen
Cheng
Chiara Crestani
Chopra
Christopher D Town
Conrad
Croucher
Da Cunha
Delannoy
Delannoy
Dogan
Edgar
Enright
Erwin
Fernandez
Ferreira
Flores
Fluegge
Garrett H Springer
Gauthier
Glazko
Glazko
Greig
Guglielmini
Gupta
Hayley B Hassler
Heaps
Holt
Imperi
Inouye
Irina M Velsko
Jafar
Jaskowiak
Jeukens
Johri
Jones
Jones
Jorgensen
Joubrel
Kalimuddin
Kim
Konig
Langdon
Librado
Lin
Lindahl
Liu
Liu
Lopez-Sanchez
Loytynoja
Lyhs
Manning
Manning
Martins
Marttinen
Mather
McArthur
Md Tauqeer Alam
Michael J Stanhope
Morse
Murrell
Page
Pal
Paulina D Pavinski Bitar
Pedersen
Petrovska
Pond
Poyart
Price
Qin
Richards
Richards
Richards
Rosinski-Chupin
Ruth N Zadoks
Sahl
Sahl
Sahl
Scheffer
Schrieber
Seemann
Shannon D Manning
Shapiro
Shepheard
Sheppard
Spoor
Springman
Srivastava
Stamatakis
Stoddard
Sukhnanand
Supek
Tettelin
Tettelin
Tian
van der Mee-Marquet
Verani
Viana
Vincent P Richards
Yu
Zadoks
Zankari
Zerbino
Zhang
Zhu
Publication venue: Oxford University Press (OUP)
Publication date: 01/11/2019

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range, making it an excellent model system to study these processes. Here we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated twelve major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan-genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of eleven populations were highly distinctive (significantly enriched). Positive selection was detected, and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spread from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another. Biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

Crossref

Enlighten

MPG.PuRe

Gene Promoter Evolution Targets the Center of the Human Protein Interaction Network

Author: A Loytynoja
A Loytynoja
A Rada-Iglesias
AG Clark
B Lemos
B Lemos
DA Drummond
DA Drummond
DG Torgerson
DM Krylov
EA Franzosa
EH Davidson
GA Wray
GA Wray
HB Fraser
HB Fraser
I Yanai
J Berglund
J Ronald
JB Plotkin
JB Wolf
JD Bloom
JD Bloom
Jeff Demuth
Jordi Planas
Josep M. Serrat
L Duret
M Ashburner
M Koudritsky
MC King
MD Wilson
MM Hoffman
MY Wolf
P Flicek
P Khaitovich
PD Thomas
PJ Wittkopp
PM Kim
PM Kim
R Haygood
R Haygood
S Draghici
S Durinck
S Kerrien
S Subramanian
SJ Cooper
SJ Cooper
U Brandes
Y Tabach
Publication venue: Public Library of Science
Publication date: 01/01/2010

Assessing the contribution of promoters and coding sequences to gene evolution is an important step toward discovering the major genetic determinants of human evolution. Many specific examples have revealed the evolutionary importance of cis-regulatory regions. However, the relative contribution of regulatory and coding regions to the evolutionary process and whether systemic factors differentially influence their evolution remains unclear. To address these questions, we carried out an analysis at the genome scale to identify signatures of positive selection in human proximal promoters. Next, we examined whether genes with positively selected promoters (Prom+ genes) show systemic differences with respect to a set of genes with positively selected protein-coding regions (Cod+ genes). We found that the number of genes in each set was not significantly different (8.1% and 8.5%, respectively). Furthermore, a functional analysis showed that, in both cases, positive selection affects almost all biological processes and only a few genes of each group are located in enriched categories, indicating that promoters and coding regions are not evolutionarily specialized with respect to gene function. On the other hand, we show that the topology of the human protein network has a different influence on the molecular evolution of proximal promoters and coding regions. Notably, Prom+ genes have an unexpectedly high centrality when compared with a reference distribution (P = 0.008, for Eigenvalue centrality). Moreover, the frequency of Prom+ genes increases from the periphery to the center of the protein network (P = 0.02, for the logistic regression coefficient). This means that gene centrality does not constrain the evolution of proximal promoters, unlike the case with coding regions, and further indicates that the evolution of proximal promoters is more efficient in the center of the protein network than in the periphery. 
These results show that proximal promoters have had a systemic contribution to human evolution by increasing the participation of central genes in the evolutionary process

Public Library of Science (PLOS)

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

PubMed Central

RIUVic

Exploiting CpG Hypermutability to Identify Phenotypically Significant Variation Within Human Protein-Coding Genes

Author: Antonarakis
Antonarakis
Arndt
Bulmer
Cheng
Cooper
Cooper
Coulondre
Duncan
Ellegren
Felsenstein
Gavin Huttley
Goldman
Grantham
Hartl
Holm
Hua Ying
Hubbard
Huttley
Johnson
Kellis
Knight
Krawczak
Kumar
Lanave
Lercher
Lindsay
Loytynoja
Matassi
Miller
Misawa
Muse
Pond
Proffitt
Rabinowicz
Radford
Schmidt
Smith
Sommer
Sved
Tornaletti
Wolfe
Wong
Yap
Publication venue: Oxford University Press
Publication date: 24/02/2016

The CpG dinucleotide is disproportionately represented in human genetic variation due to the hypermutability of 5-methyl-cytosine (5mC). We exploit this hypermutability and a novel codon substitution model to identify candidate functionally important exonic nucleotides. Population genetic theory suggests that codon positions with high cross-species CpG frequency will derive from stronger purifying selection. Using the phylogeny-based maximum likelihood inference framework, we applied codon substitution models with context-dependent parameters to measure the mutagenic and selective processes affecting CpG dinucleotides within exonic sequence. The suitability of these models was validated on >2,000 protein coding genes from a naturally occurring biological control, four yeast species that do not methylate their DNA. As expected, our analyses of yeast revealed no evidence for an elevated CpG transition rate or for substitution suppression affecting CpG-containing codons. Our analyses of >12,000 protein-coding genes from four primate lineages confirm the systemic influence of 5mC hypermutability on the divergence of these genes. After adjusting for confounding influences of mutation and the properties of the encoded amino acids, we confirmed that CpG-containing codons are under greater purifying selection in primates. Genes with significant evidence of enhanced suppression of nonsynonymous CpG changes were also shown to be significantly enriched in Online Mendelian Inheritance in Man. We developed a method for ranking candidate phenotypically influential CpG positions in human genes. Application of this method indicates that of the ∼1 million exonic CpG dinucleotides within humans, ∼20% are strong candidates for both hypermutability and disease association

Crossref

PubMed Central

The Australian National University

Accounting For Alignment Uncertainty in Phylogenomics

Author: A Drummond
A Loytynoja
A Loytynoja
A Stamatakis
AS Schwartz
AS Schwartz
B Morgenstern
BD Redelings
BG Hall
C Dessimoz
C Notredame
CB Do
D Wu
DA Morrison
DJ States
G Landan
G Talavera
I Van Walle
J Castresana
J Felsenstein
J Pei
J Stoye
JA Lake
JD Thompson
JD Thompson
Jonathan A. Eisen
K Bucka-Lassen
K Katoh
K Liu
KM Kjer
KM Wong
M Steel
M Wu
Marco Salemi
Martin Wu
MO Dayhoff
MS Lee
MS Rosenberg
N Bray
O Penn
P Cammarano
P Kuck
R Durbin
RC Edgar
RC Edgar
RK Bradley
S Guindon
S Hartmann
Sourav Chatterji
T Lassmann
T Lassmann
T Pupko
TH Ogden
U Roshan
UW Hwang
WN Grundy
Publication venue: Public Library of Science
Publication date: 01/01/2012

Uncertainty in multiple sequence alignments has a large impact on phylogenetic analyses. Little has been done to evaluate the quality of individual positions in protein sequence alignments, which directly impact the accuracy of phylogenetic trees. Here we describe ZORRO, a probabilistic masking program that accounts for alignment uncertainty by assigning confidence scores to each alignment position. Using the BALIBASE database and in simulation studies, we demonstrate that masking by ZORRO significantly reduces the alignment uncertainty and improves the tree accuracy
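Masking by confidence score can be illustrated with a small filter that drops every alignment column whose score falls below a cutoff. This is a generic sketch, not ZORRO's code: the function name `mask_columns`, the score scale, and the cutoff value are all assumptions chosen for illustration, and the scores would in practice come from the masking program itself.

```python
from typing import List, Sequence


def mask_columns(msa: List[str], scores: Sequence[float], cutoff: float) -> List[str]:
    """Keep only alignment columns whose confidence score is >= cutoff."""
    if any(len(seq) != len(scores) for seq in msa):
        raise ValueError("need exactly one score per alignment column")
    keep = [i for i, score in enumerate(scores) if score >= cutoff]
    return ["".join(seq[i] for i in keep) for seq in msa]


if __name__ == "__main__":
    alignment = ["ACGT", "A-GT"]
    column_scores = [0.9, 0.2, 0.8, 0.4]  # hypothetical per-column scores
    print(mask_columns(alignment, column_scores, cutoff=0.5))  # ['AG', 'AG']
```

The masked alignment keeps the row order of the input, so it can be passed directly to a tree-building step.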

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

Alignment-Free Phylogenetic Reconstruction

Author: A. Loytynoja
B.D. Thatte
C. Daskalakis
C. Daskalakis
C. Daskalakis
C. Semple
D. Graur
D. Metzler
D.G. Higgins
E. Mossel
E. Mossel
I. Elias
I. Gronau
I. Miklos
J. Felsenstein
J.L. Thorne
J.L. Thorne
K. Atteson
K. Katoh
K. Liu
K.B. Athreya
K.M. Wong
L. Wang
M. Csurös
M. Csurös
M. Hohl
M.A. Steel
M.A. Steel
M.A. Suchard
M.R. Lacey
P. Buneman
P.L. Erdös
P.L. Erdös
R.C. Edgar
S. Karlin
V. King
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

14th Annual International Conference, RECOMB 2010, Lisbon, Portugal, April 25-28, 2010. Proceedings. We introduce the first polynomial-time phylogenetic reconstruction algorithm under a model of sequence evolution allowing insertions and deletions (or indels). Given appropriate assumptions, our algorithm requires sequence lengths growing polynomially in the number of leaf taxa. Our techniques are distance-based and largely bypass the problem of multiple alignment.

CiteSeerX

DSpace@MIT

Crossref