PicXAA-Web: a web-based platform for non-progressive maximum expected accuracy alignment of multiple biological sequences

Author: B.-J. Yoon
Bauer
Bradley
Do
Edgar
Feng
Katoh
Needleman
Notredame
S. M. E. Sahraeian
S. Schwartz
Smith
Subramanian
Tabei
Thompson
Wang
Will
Wong
Publication venue: Oxford University Press

In this article, we introduce PicXAA-Web, a web-based platform for accurate probabilistic alignment of multiple biological sequences. The core of PicXAA-Web consists of PicXAA, a multiple protein/DNA sequence alignment algorithm, and PicXAA-R, an extension of PicXAA for structural alignment of RNA sequences. Both PicXAA and PicXAA-R are probabilistic non-progressive alignment algorithms that aim to find the optimal alignment of multiple biological sequences by maximizing the expected accuracy. PicXAA and PicXAA-R greedily build up the alignment from sequence regions with high local similarity, thereby yielding an accurate global alignment that effectively captures local similarities among sequences. PicXAA-Web integrates these two algorithms in a user-friendly web platform for accurate alignment and analysis of multiple protein, DNA and RNA sequences. PicXAA-Web can be freely accessed at http://gsp.tamu.edu/picxaa/
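
The greedy idea described in the abstract can be illustrated with a minimal sketch. It assumes only two sequences and a precomputed table of posterior alignment probabilities; the function name `greedy_pairwise_alignment` and its parameters are hypothetical, and the sketch is a simplification for illustration, not PicXAA's graph-based multiple-sequence construction.

```python
from typing import Dict, List, Tuple


def greedy_pairwise_alignment(
    posteriors: Dict[Tuple[int, int], float],
    min_prob: float = 0.0,
) -> List[Tuple[int, int]]:
    """Greedily pick aligned position pairs (i, j), most probable first.

    A pair is accepted only if it is compatible with every pair accepted
    so far: neither position is reused and the pairs do not cross.
    """
    accepted: List[Tuple[int, int]] = []
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    for (i, j), prob in ranked:
        if prob <= min_prob:
            break  # remaining pairs are no more probable than the cutoff
        # (i - a) * (j - b) > 0 holds exactly when both positions differ
        # from the accepted pair and the two pairs keep the same order.
        if all((i - a) * (j - b) > 0 for a, b in accepted):
            accepted.append((i, j))
    return sorted(accepted)


if __name__ == "__main__":
    table = {(0, 0): 0.9, (1, 1): 0.8, (0, 1): 0.7, (2, 1): 0.6, (2, 2): 0.5}
    print(greedy_pairwise_alignment(table))
```

Processing pairs in descending probability order means that confidently aligned regions are fixed first and lower-confidence pairs are added only where they remain compatible.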

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

Segment-based multiple sequence alignment

Author: Emde A.-K.
Notredame C.
Rausch T.
Reinert K.
Weese D.
Publication date: 01/01/2008

Motivation: Many multiple sequence alignment tools have been developed in the past, progressing either in speed or alignment accuracy. Given the importance and wide-spread use of alignment tools, progress in both categories is a contribution to the community and has driven research in the field so far. Results: We introduce a graph-based extension to the consistency-based, progressive alignment strategy. We apply the consistency notion to segments instead of single characters. The main problem we solve in this context is to define segments of the sequences in such a way that a graph-based alignment is possible. We implemented the algorithm using the SeqAn library and report results on amino acid and DNA sequences. The benefit of our approach is threefold: (1) sequences with conserved blocks can be rapidly aligned, (2) the implementation is conceptually easy, generic and fast and (3) the consistency idea can be extended to align multiple genomic sequences. Availability: The segment-based multiple sequence alignment tool can be downloaded from http://www.seqan.de/projects/msa.html. A novel version of T-Coffee interfaced with the tool is available from http://www.tcoffee.org. The usage of the tool is described in both documentations. Contact: [email protected]

PnpProbs: A better multiple sequence alignment tool by better handling of guide trees

Author: Lam TW
Ting HF
YE Y
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Springer - Publisher Connector

HKU Scholars Hub

MinION Analysis and Reference Consortium: Phase 1 data release and analysis

Author: Benedict Paten
Bonnie L. Brown
Camilla L.C. Ip
David A. Eccles
David Buck
Elizabeth M. Batty
Ewan Birney
Hans J. Jansen
Hugh E. Olsen
Jared T. Simpson
John M. Urban
John R. Tyson
Justin O'Grady
Mariateresa de Cesare
Matthew Loose
MinION Analysis and Reference Consortium
Miten Jain
Paolo Piazza
Richard M. Leggett
Rory J. Bowden
Sara Goodwin
Solomon Mwaigwisya
Terrance P. Snutch
Vadim Zalunin
Publication venue: F1000 Research Ltd
Publication date: 01/10/2015

The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional experiments in Phase 2 of E. coli from MARC are already underway to identify ways to improve and enhance MinION performance

Cold Spring Harbor Laboratory Institutional Repository

University of East Anglia digital repository

Accounting For Alignment Uncertainty in Phylogenomics

Author: A Drummond
A Loytynoja
A Loytynoja
A Stamatakis
AS Schwartz
AS Schwartz
B Morgenstern
BD Redelings
BG Hall
C Dessimoz
C Notredame
CB Do
D Wu
DA Morrison
DJ States
G Landan
G Talavera
I Van Walle
J Castresana
J Felsenstein
J Pei
J Stoye
JA Lake
JD Thompson
JD Thompson
Jonathan A. Eisen
K Bucka-Lassen
K Katoh
K Liu
KM Kjer
KM Wong
M Steel
M Wu
Marco Salemi
Martin Wu
MO Dayhoff
MS Lee
MS Rosenberg
N Bray
O Penn
P Cammarano
P Kuck
R Durbin
RC Edgar
RC Edgar
RK Bradley
S Guindon
S Hartmann
Sourav Chatterji
T Lassmann
T Lassmann
T Pupko
TH Ogden
U Roshan
UW Hwang
WN Grundy
Publication venue: Public Library of Science
Publication date: 01/01/2012

Uncertainty in multiple sequence alignments has a large impact on phylogenetic analyses. Little has been done to evaluate the quality of individual positions in protein sequence alignments, which directly impact the accuracy of phylogenetic trees. Here we describe ZORRO, a probabilistic masking program that accounts for alignment uncertainty by assigning confidence scores to each alignment position. Using the BALIBASE database and in simulation studies, we demonstrate that masking by ZORRO significantly reduces the alignment uncertainty and improves the tree accuracy
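
Masking by per-column confidence can be sketched as follows. This is an illustrative example only: the function name `mask_alignment`, the score values and the threshold are assumptions, and the sketch does not reproduce how ZORRO computes its scores.

```python
from typing import List, Sequence


def mask_alignment(
    rows: Sequence[str],
    scores: Sequence[float],
    threshold: float,
) -> List[str]:
    """Keep only the alignment columns whose confidence score is at least
    `threshold`; `scores` holds one value per alignment column."""
    if any(len(row) != len(scores) for row in rows):
        raise ValueError("every row needs exactly one score per column")
    keep = [col for col, score in enumerate(scores) if score >= threshold]
    return ["".join(row[col] for col in keep) for row in rows]


if __name__ == "__main__":
    alignment = ["AC-GT", "ACTGT"]
    column_scores = [0.9, 0.8, 0.2, 0.7, 0.95]  # hypothetical confidences
    print(mask_alignment(alignment, column_scores, threshold=0.5))
```

The masked alignment can then be passed to a tree-building program in place of the full alignment.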

CiteSeerX

Public Library of Science (PLOS)

eScholarship - University of California

FigShare

Phylogenetic assessment of alignments reveals neglected tree signal in gaps

Author: Dessimoz Christophe
Gil Manuel
Publication venue: BioMed Central
Publication date: 01/01/2010

Tree-based tests of alignment methods enable the evaluation of the effect of gap placement on the inference of phylogenetic relationships

Repository for Publications and Research Data

Springer - Publisher Connector

PicXAA-R: Efficient structural alignment of multiple RNA sequences using a greedy approach

Author: A Wilm
A Wilm
AO Harmanci
AS Schwartz
B Paten
Byung-Jun Yoon
C Do
C Notredame
CB Do
CB Do
CB Do
D Dalli
D Sankoff
DH Mathews
DH Mathews
FF Costa
G Storz
H Kiryu
H Kiryu
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
JH Havgaard
JH Havgaard
JS McCaskill
K Katoh
M Anwar
M Bauer
M Hamada
M Hamada
R Durbin
RD Dowell
RK Bradley
RK Bradley
S Griffiths-Jones
S Lindgreen
S Moretti
S Siebert
S Wang
S Washietl
S Will
Sayed Mohammad Ebrahim Sahraeian
SM Sahraeian
SR Eddy
U Roshan
X Xu
Y Tabei
ZJ Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Accurate and efficient structural alignment of non-coding RNAs (ncRNAs) has attracted increasing attention as recent studies have unveiled the significance of ncRNAs in living organisms. Because Sankoff-style structural alignment algorithms do not scale efficiently to multiple sequences, progressive schemes are mostly used to reduce the complexity. However, this approach tends to propagate early-stage errors throughout the entire process, thereby degrading the quality of the final alignment. For multiple protein sequence alignment, we have recently proposed PicXAA, which constructs an accurate alignment in a non-progressive fashion. Results: Here, we propose PicXAA-R as an extension to PicXAA for greedy structural alignment of ncRNAs. PicXAA-R efficiently captures both folding information within each sequence and local similarities between sequences. It uses a set of probabilistic consistency transformations to improve the posterior base-pairing and base alignment probabilities using the information of all sequences in the alignment. Using a graph-based scheme, we greedily build up the structural alignment from sequence regions with high base-pairing and base alignment probabilities. Conclusions: Several experiments on datasets with different characteristics confirm that PicXAA-R is one of the fastest algorithms for structural alignment of multiple RNAs and that it consistently yields accurate alignment results, especially for datasets with locally similar sequences. PicXAA-R source code is freely available at http://www.ece.tamu.edu/~bjyoon/picxaa/.
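
As a rough illustration of what a probabilistic consistency transformation does, the sketch below re-estimates the alignment probabilities between two sequences x and y using intermediate sequences z. It follows the widely used ProbCons-style formulation, which is an assumption here; the specific transformations used by PicXAA-R, including those for base-pairing probabilities, are not reproduced. The names `matmul` and `consistency_transform` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def consistency_transform(
    p_xy: Matrix,
    p_xz_list: Sequence[Matrix],
    p_zy_list: Sequence[Matrix],
) -> Matrix:
    """Average the direct x~y posterior with the paths x~z~y.

    The cases z = x and z = y each contribute p_xy itself, which is why
    the direct term is counted twice.
    """
    n_seqs = len(p_xz_list) + 2  # x, y and every intermediate z
    rows, cols = len(p_xy), len(p_xy[0])
    total = [[2.0 * p_xy[i][j] for j in range(cols)] for i in range(rows)]
    for p_xz, p_zy in zip(p_xz_list, p_zy_list):
        via_z = matmul(p_xz, p_zy)
        for i in range(rows):
            for j in range(cols):
                total[i][j] += via_z[i][j]
    return [[value / n_seqs for value in row] for row in total]
```

A pair of positions that is also supported through a third sequence ends up with a higher transformed probability than its direct pairwise estimate alone would give.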

Texas A&M Repository

Meta-Alignment with Crumble and Prune: Partitioning very large alignment problems for performance and parallelization

Author: A Siepel
A Siepel
AS Schwartz
B Paten
B Paten
B Rhead
Benedict Paten
C Lee
CN Dewey
David Haussler
DF Feng
G Myers
I Lumb
J Ma
JE Stajich
JS Pedersen
K Katoh
K Katoh
K Kryukov
K Liu
K Reinert
KM Roskin
Krishna M Roskin
M Blanchette
M Hasegawa
M Waterman
N Bray
P Di Tommaso
RC Edgar
RK Bradley
S Griffiths-Jones
S Schwartz
T Kim
U Tönges
W Gentzsch
WJ Kent
WJ Kent
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Continuing research into the global multiple sequence alignment problem has resulted in more sophisticated and principled alignment methods. Unfortunately these new algorithms often require large amounts of time and memory to run, making it nearly impossible to run these algorithms on large datasets. As a solution, we present two general methods, Crumble and Prune, for breaking a phylogenetic alignment problem into smaller, more tractable sub-problems. We call Crumble and Prune meta-alignment methods because they use existing alignment algorithms and can be used with many current alignment programs. Crumble breaks long alignment problems into shorter sub-problems. Prune divides the phylogenetic tree into a collection of smaller trees to reduce the number of sequences in each alignment problem. These methods are orthogonal: they can be applied together to provide better scaling in terms of sequence length and in sequence depth. Both methods partition the problem such that many of the sub-problems can be solved independently. The results are then combined to form a solution to the full alignment problem. Results: Crumble and Prune each provide a significant performance improvement with little loss of accuracy. In some cases, a gain in accuracy was observed. Crumble and Prune were tested on real and simulated data. Furthermore, we have implemented a system called Job-tree that allows hierarchical sub-problems to be solved in parallel on a compute cluster, significantly shortening the run-time. Conclusions: These methods enabled us to solve gigabase alignment problems. These methods could enable a new generation of biologically realistic alignment algorithms to be applied to real world, large scale alignment problems.
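
The idea of dividing a phylogenetic tree into smaller trees can be sketched as below. This is only an illustration of the partitioning concept under stated assumptions: the tree is a nested tuple of leaf names, each group is capped by a hypothetical `max_leaves` parameter, and the function names are invented for this example. It is not the Prune algorithm itself, and it omits how sub-alignments are recombined.

```python
from typing import List, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> List[str]:
    """Return the leaf names of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def partition_tree(tree: Tree, max_leaves: int) -> List[List[str]]:
    """Split the tree top-down into subtrees with at most `max_leaves`
    leaves each; every group is a smaller alignment sub-problem."""
    if max_leaves < 1:
        raise ValueError("max_leaves must be at least 1")
    names = leaves(tree)
    if isinstance(tree, str) or len(names) <= max_leaves:
        return [names]
    groups: List[List[str]] = []
    for child in tree:
        groups.extend(partition_tree(child, max_leaves))
    return groups


if __name__ == "__main__":
    example = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
    print(partition_tree(example, max_leaves=3))
```

Because the groups share no sequences, each one could be handed to an existing aligner independently, which is the property the abstract relies on for parallel execution.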

Springer - Publisher Connector