7 research outputs found

MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments

Author: A Loytynoja
F Armougom
G Talavera
GP Raghava
I Van Walle
IM Wallace
J Sukumaran
JD Thompson
K Bucka-Lassen
K Katoh
MN Price
O Gotoh
Peter W Collingridge
RC Edgar
RD Finn
S Kawashima
S Kelly
SB Needleman
SF Altschul
Steven Kelly
TF Smith
TH Ogden
Publication venue: Springer Science and Business Media LLC

Accounting For Alignment Uncertainty in Phylogenomics

Author: A Drummond
A Loytynoja
A Stamatakis
AS Schwartz
B Morgenstern
BD Redelings
BG Hall
C Dessimoz
C Notredame
CB Do
D Wu
DA Morrison
DJ States
G Landan
G Talavera
I Van Walle
J Castresana
J Felsenstein
J Pei
J Stoye
JA Lake
JD Thompson
Jonathan A. Eisen
K Bucka-Lassen
K Katoh
K Liu
KM Kjer
KM Wong
M Steel
M Wu
Marco Salemi
Martin Wu
MO Dayhoff
MS Lee
MS Rosenberg
N Bray
O Penn
P Cammarano
P Kuck
R Durbin
RC Edgar
RK Bradley
S Guindon
S Hartmann
Sourav Chatterji
T Lassmann
T Pupko
TH Ogden
U Roshan
UW Hwang
WN Grundy
Publication venue: Public Library of Science
Publication date: 01/01/2012

Uncertainty in multiple sequence alignments has a large impact on phylogenetic analyses. Little has been done to evaluate the quality of individual positions in protein sequence alignments, which directly impact the accuracy of phylogenetic trees. Here we describe ZORRO, a probabilistic masking program that accounts for alignment uncertainty by assigning confidence scores to each alignment position. Using the BALIBASE database and simulation studies, we demonstrate that masking by ZORRO significantly reduces alignment uncertainty and improves tree accuracy.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

Efficient representation of uncertainty in multiple sequence alignments using directed acyclic graphs

Author: A Dress
A Godzik
A Löytynoja
A Novák
A Sali
A Siepel
A Tramontano
Adrienn Szabó
AS Schwartz
B Dwivedi
B Knudsen
B Larget
B Misof
B Schwikowski
BD Redelings
BJM Webb
BP Blackburne
C Dessimoz
C Notredame
CB Do
CJ Challis
D Altschuh
D Chivian
D DeBlasio
D Lupyan
D Metzler
D Robinson
DA Morrison
DF Feng
E Levy Karin
G Jordan
G Landan
G Lunter
G Raghava
G Talavera
GA Churchill
GA Lunter
BG Hall
HT Mevissen
I Holmes
I Miklós
IL Dryden
IM Wallace
István Miklós
J Castresana
J Felsenstein
J Gatesy
J Hein
J Kim
J Zhu
JA Lake
JD Thompson
JL Thorne
Joseph L Herman
Jotun Hein
K Bucka-Lassen
K Liu
KM Wong
L Wang
L Yu
LE Carvalho
LS Wang
M Hamada
M Höhl
M Vingron
M Wu
M Zuker
MA Suchard
MJ Wise
MO Dayhoff
MP Simmons
MS Waterman
MSY Lee
O Gotoh
O Penn
P Ajawatanawong
P Arunapuram
P Collingridge
PJ Green
PP Gardner
R Durbin
R Satija
R Schwarzenbacher
RA Cartwright
RC Edgar
RJ Dickson
RK Bradley
Rune Lyngsø
S Capella-Gutiérrez
S Karlin
S Miyazawa
S Needleman
S Sinha
Silla-Martínez Capella-Gutiérrez S
SME Sahraeian
TA Hopf
TH Ogden
TL Blundell
U Roshan
V Ahola
W Fletcher
WC Wheeler
Y Liu
Y Ruffieux
Ádám Novák
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

BACKGROUND: A standard procedure in many areas of bioinformatics is to use a single multiple sequence alignment (MSA) as the basis for various types of analysis. However, downstream results may be highly sensitive to the alignment used, and neglecting the uncertainty in the alignment can lead to significant bias in the resulting inference. In recent years, a number of approaches have been developed for probabilistic sampling of alignments, rather than simply generating a single optimum. However, this type of probabilistic information is currently not widely used in the context of downstream inference, since most existing algorithms are set up to make use of a single alignment.
RESULTS: In this work we present a framework for representing a set of sampled alignments as a directed acyclic graph (DAG) whose nodes are alignment columns; each path through this DAG then represents a valid alignment. Since the probabilities of individual columns can be estimated from empirical frequencies, this approach enables sample-based estimation of posterior alignment probabilities. Moreover, due to conditional independencies between columns, the graph structure encodes a much larger set of alignments than the original set of sampled MSAs, such that the effective sample size is greatly increased.
CONCLUSIONS: The alignment DAG provides a natural way to represent a distribution in the space of MSAs, and allows for existing algorithms to be efficiently scaled up to operate on large sets of alignments. As an example, we show how this can be used to compute marginal probabilities for tree topologies, averaging over a very large number of MSAs. This framework can also be used to generate a statistically meaningful summary alignment; example applications show that this summary alignment is consistently more accurate than the majority of the alignment samples, leading to improvements in downstream tree inference.
Implementations of the methods described in this article are available at http://statalign.github.io/WeaveAlign.
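
The idea in the abstract above, that a DAG whose nodes are alignment columns can encode more alignments than were sampled, can be illustrated with a minimal Python sketch. This is not the WeaveAlign implementation; the column key (residues consumed so far per sequence, plus a gap flag) and the function names `columns`, `build_dag`, and `count_paths` are assumptions made for illustration only.

```python
from collections import defaultdict
from functools import lru_cache

START, END = "START", "END"  # hypothetical sentinel nodes


def columns(alignment):
    """Yield one hashable key per column of an alignment (list of equal-length rows).

    Assumed encoding: for each sequence, (residues consumed so far, is_residue).
    Two columns from different samples share a node only if these keys match.
    """
    consumed = [0] * len(alignment)
    for col in zip(*alignment):
        key = []
        for i, ch in enumerate(col):
            is_residue = ch != "-"
            consumed[i] += is_residue
            key.append((consumed[i], is_residue))
        yield tuple(key)


def build_dag(samples):
    """Link consecutive columns of every sampled alignment; count column occurrences."""
    edges = defaultdict(set)
    counts = defaultdict(int)
    for aln in samples:
        prev = START
        for col in columns(aln):
            counts[col] += 1
            edges[prev].add(col)
            prev = col
        edges[prev].add(END)
    return edges, counts


def count_paths(edges):
    """Count START-to-END paths, i.e. the alignments encoded by the graph."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == END:
            return 1
        return sum(paths_from(nxt) for nxt in edges[node])
    return paths_from(START)


# Two sampled alignments of the toy sequences "aMc" and "bMd".
samples = [
    ["a-Mc-", "-bM-d"],
    ["-aM-c", "b-Md-"],
]
edges, counts = build_dag(samples)
print(count_paths(edges))  # number of alignments encoded by the DAG
```

In this toy input both samples share only the middle column, so the graph can recombine the left half of one sample with the right half of the other. Dividing `counts[col]` by the number of samples gives the empirical column frequency the abstract refers to.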

Crossref

SZTAKI Publication Repository

Springer - Publisher Connector

PubMed Central

Oxford University Research Archive

Combining many multiple alignments in one improved alignment

Author: Bucka-Lassen K.
Caprani O.
Hein J.
Publication date: 02/08/2017

MOTIVATION: The fact that the multiple sequence alignment problem is of high complexity has led to many different heuristic algorithms attempting to find a solution in what would be considered a reasonable amount of computation time and space. Very few of these heuristics produce results that are guaranteed always to lie within a certain distance of an optimal solution (given a measure of quality, e.g. parsimony). Most practical heuristics cannot guarantee this, but nevertheless perform well for certain cases. An alignment obtained with one of these heuristics and with a bad overall score is not unusable, though; it might contain important information on how substrings should be aligned. This paper presents a method that extracts qualitatively good sub-alignments from a set of multiple alignments and combines these into a new, often improved alignment. The algorithm is implemented as a variant of the traditional dynamic programming technique.
RESULTS: An implementation of ComAlign (the algorithm that combines multiple alignments) has been run on several sets of artificially generated sequences and a set of 5S RNA sequences. To assess the quality of the alignments obtained, the results have been compared with the output of MSA 2.1 (Gupta et al., Proceedings of the Sixth Annual Symposium on Combinatorial Pattern Matching, 1995; Kececioglu et al., http://www.techfak.uni-bielefeld.de/bcd/Lectures/kececioglu.html, 1995). In all cases, ComAlign was able to produce a solution with a score comparable to the solution obtained by MSA. The results also show that ComAlign actually does combine parts from different alignments and not just select the best of them.
AVAILABILITY: The C source code (a Smalltalk version is being worked on) of ComAlign and the other programs that have been implemented in this context are free and available on WWW (http://www.daimi.au.dk/~ocaprani).
CONTACT: [email protected]; [email protected]; [email protected]

RERO DOC Digital Library

Biological Sequence Analysis: Algorithms and Statistical Methods

Author: DF Feng
GD Forney
K Bucka-Lassen
LE Baum
SB Needleman
TF Smith
WJ Bruno
Z Zhang
Publication venue: Springer Science and Business Media LLC

Crossref

Statement of the Austrian Society for Geriatrics and Gerontology on assisted suicide in older people

Author: A Clegg
B Chabot
D Grob
D O’Neill
E Bucka-Lassen
E McCormick
G Wolf-Klein
J Stewart
JW Shega
K Ohnsorge
K Rockwood
TL Beauchamp
WP Achterberg
Publication venue: Springer Science and Business Media LLC

Crossref

Multiple Sequence Alignment Based on Profile Alignment of Intermediate Sequences

Author: A. Heger
A.A. Salamov
B. Morgenstern
C. Lee
C. Notredame
C.B. Do
D. Gusfield
D.T. Jones
E. Bolten
F. Wilcoxon
H. Zhou
I. Van Walle
I.M. Wallace
J. Park
J. Pei
J. Stoye
J.D. Thompson
K. Bucka-Lassen
K. Katoh
K. Mizuguchi
M. Gerstein
M. Margelevičius
M.A. Marti-Renom
O. Gotoh
O. O’Sullivan
R. Durbin
R.C. Edgar
S. Yamada
S.F. Altschul
T. Lassmann
T.F. Smith
U. Roshan
V.A. Simossis
W. Li
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Crossref