Search CORE

12 research outputs found

DIALIGN-TX and multiple protein alignment using secondary structure information at GOBICS

Author: A. R. Subramanian
B. Morgenstern
Brudno
Do
E. Corel
Edgar
Feng
Heringa
Lenhof
Montgomerie
Morgenstern
P. Meinicke
Pohler
R. Steinkamp
S. Hiran
Subramanian
Taylor
Thompson
Wong
Publication venue: Oxford University Press
Publication date: 01/01/2010

We introduce web interfaces for two recent extensions of the multiple-alignment program DIALIGN. DIALIGN-TX combines the greedy heuristic previously used in DIALIGN with a more traditional ‘progressive’ approach for improved performance on locally and globally related sequence sets. In addition, we offer a version of DIALIGN that uses predicted protein secondary structures together with primary sequence information to construct multiple protein alignments. Both programs are available through the ‘Göttingen Bioinformatics Compute Server’ (GOBICS).

CiteSeerX

Crossref

PubMed Central

Segment-based multiple sequence alignment

Author: Emde A.-K.
Notredame C.
Rausch T.
Reinert K.
Weese D.
Publication date: 01/01/2008

Motivation: Many multiple sequence alignment tools have been developed in the past, progressing either in speed or alignment accuracy. Given the importance and widespread use of alignment tools, progress in both categories is a contribution to the community and has driven research in the field so far. Results: We introduce a graph-based extension to the consistency-based, progressive alignment strategy. We apply the consistency notion to segments instead of single characters. The main problem we solve in this context is to define segments of the sequences in such a way that a graph-based alignment is possible. We implemented the algorithm using the SeqAn library and report results on amino acid and DNA sequences. The benefit of our approach is threefold: (1) sequences with conserved blocks can be rapidly aligned, (2) the implementation is conceptually easy, generic and fast and (3) the consistency idea can be extended to align multiple genomic sequences. Availability: The segment-based multiple sequence alignment tool can be downloaded from http://www.seqan.de/projects/msa.html. A novel version of T-Coffee interfaced with the tool is available from http://www.tcoffee.org. The usage of the tool is described in the documentation of both. Contact: [email protected]

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment

Author: Kaufmann Michael
Morgenstern Burkhard
Subramanian Amarendran R
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: DIALIGN-T is a reimplementation of the multiple-alignment program DIALIGN. Due to several algorithmic improvements, it produces significantly better alignments on locally and globally related sequence sets than previous versions of DIALIGN. However, like the original implementation of the program, DIALIGN-T uses a straightforward greedy approach to assemble multiple alignments from local pairwise sequence similarities. Such greedy approaches may be vulnerable to spurious random similarities and can therefore lead to suboptimal results. In this paper, we present DIALIGN-TX, a substantial improvement of DIALIGN-T that combines our previous greedy algorithm with a progressive alignment approach. Results: Our new heuristic produces significantly better alignments, especially on globally related sequences, without excessively increasing CPU time and memory consumption. The new method is based on a guide tree; to detect possible spurious sequence similarities, it employs a vertex-cover approximation on a conflict graph. We performed benchmarking tests on a large set of nucleic acid and protein sequences. For protein benchmarks we used the benchmark database BALIBASE 3 and an updated release of the database IRMBASE 2 for assessing the quality on globally and locally related sequences, respectively. For alignment of nucleic acid sequences, we used BRAliBase II for global alignment and a newly developed database of locally related sequences called DIRMBASE 1. IRMBASE 2 and DIRMBASE 1 are constructed by implanting highly conserved motifs at random positions in long unalignable sequences. Conclusion: On BALIBASE 3, our new program performs significantly better than the previous program DIALIGN-T and outperforms the popular global aligner CLUSTAL W, though it is still outperformed by programs that focus on global alignment like MAFFT, MUSCLE and T-COFFEE. On the locally related test sets in IRMBASE 2 and DIRMBASE 1, our method outperforms all other programs, while MAFFT E-INSi is the only method that comes close to the performance of DIALIGN-TX.
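
The abstract above mentions a vertex-cover approximation on a conflict graph but does not say which approximation is used. As an illustration only, the following Python sketch shows the classic maximal-matching 2-approximation for vertex cover. The fragment names (`f1`, `f2`, ...) and the interpretation of an edge as "these two fragments cannot both be in the alignment" are assumptions made for this example, not details taken from DIALIGN-TX.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover at most twice the optimal size.

    Scans the edges once; whenever an edge is not yet covered,
    both of its endpoints are added to the cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover


# Hypothetical conflict graph: nodes are fragments, an edge joins
# two fragments that are assumed to be mutually inconsistent.
conflicts = [("f1", "f2"), ("f2", "f3"), ("f4", "f5")]
cover = approx_vertex_cover(conflicts)
```

In a setting like this, the fragments in the cover would be the candidates for removal, since dropping them leaves no conflicting pair behind. The factor-2 guarantee holds because the chosen edges form a matching, and any cover must contain at least one endpoint of each matched edge.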

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

SinicView: A visualization environment for comparisons of multiple nucleotide sequence alignment tools

Author: Chen Shiang-Heng
Chou Meng-Yuan
Hsieh Mu-Fen
Lee DT
Lin Laurent
Peng Chin-Lin
Shiao Tze-Chang
Shih Arthur Chun-Chieh
Wong Chun-Yi
Wu Yu-Wei
Publication venue: BioMed Central
Publication date: 01/03/2006

BACKGROUND: Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indeed not homologous or are just the result of inappropriate alignment tools or scoring systems. Although several systematic evaluations of multiple sequence alignment (MSA) programs have been proposed, they may not provide a standard-bearer for most biologists because the poorly aligned regions in these evaluations are never discussed. Thus, a tool that allows simultaneous cross-comparison of the alignment results obtained by different tools could help a biologist evaluate their correctness and accuracy. RESULTS: In this paper, we present a versatile alignment visualization system called SinicView (for Sequence-aligning INnovative and Interactive Comparison VIEWer), which allows the user to efficiently compare and evaluate assorted nucleotide alignment results obtained by different tools. SinicView calculates similarity of the alignment outputs under a fixed window using the sum-of-pairs method and provides scoring profiles of each set of aligned sequences. The user can visually compare alignment results either in graphic scoring profiles or in plain text format of the aligned nucleotides along with the annotation information. We illustrate the capabilities of our visualization system by comparing alignment results obtained by MLAGAN, MAVID, and MULTIZ.
CONCLUSION: With SinicView, users can use their own data sequences to compare various alignment tools or scoring systems and select the most suitable one to perform alignment in the initial stage of sequence analysis.
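
The abstract describes a sum-of-pairs similarity computed under a fixed window to produce scoring profiles. A minimal Python sketch of that idea follows. The scoring values (match 1, mismatch 0, gap 0), the sliding step of one column, and the function names are assumptions chosen for illustration; the abstract does not specify SinicView's actual parameters.

```python
from itertools import combinations


def column_sp(column, match=1, mismatch=0, gap=0):
    """Sum-of-pairs score of one alignment column (assumed scoring scheme)."""
    score = 0
    for a, b in combinations(column, 2):
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


def sp_profile(alignment, window):
    """Mean per-column sum-of-pairs score for each window start position."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    col_scores = [column_sp([row[i] for row in alignment]) for i in range(length)]
    return [
        sum(col_scores[start:start + window]) / window
        for start in range(length - window + 1)
    ]


# Toy alignment of three nucleotide sequences, with '-' marking a gap.
toy_alignment = ["ACGT", "ACGA", "A-GT"]
profile = sp_profile(toy_alignment, window=2)
```

Plotting one such profile per alignment tool along the same coordinate axis would give the kind of side-by-side comparison the abstract describes, with low-scoring windows flagging regions where the alignment is weak.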

Springer - Publisher Connector

Directory of Open Access Journals

National Chung Hsing University Institutional Repository

PubMed Central

An exact solution for the segment-to-segment multiple sequence alignment problem

Author: Lenhof H.-P.
Morgenstern B.
Reinert K.
Publication date: 01/01/1999

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)

MPG.PuRe

An exact solution for the segment-to-segment multiple sequence alignment problem

Author: B Morgenstern
H. Lenhof
K Reinert
Publication venue: 'Oxford University Press (OUP)'

Crossref

Integrated multiple sequence alignment

Author: Sammeth Michael
Publication venue: Bielefeld University
Publication date: 01/01/2005

Sammeth M. Integrated multiple sequence alignment. Bielefeld (Germany): Bielefeld University; 2005. The thesis presents enhancements for automated and manual multiple sequence alignment: existing alignment algorithms are made more easily accessible and new algorithms are designed for difficult cases. Firstly, we introduce the QAlign framework, a graphical user interface for multiple sequence alignment. It comprises several state-of-the-art algorithms and exposes their parameters through convenient dialogs. An alignment viewer with guided editing functionality can also highlight or print regions of the alignment. Phylogenetic features are also provided, e.g., distance-based tree reconstruction methods, corrections for multiple substitutions and a tree viewer. The modular concept and the platform-independent implementation make the framework easy to extend. Further, we develop a constrained version of the divide-and-conquer alignment such that it can be restricted by anchors found earlier with local alignments. It can be shown that this method shares attributes of both local and global aligners, in the quality of results as well as in the computation time. We further modify the local alignment step to work on bipartite (or even multipartite) sets for sequences where repeats overshadow valuable sequence information. In the end, a technique is established that can accurately align sequences containing possibly repeated motifs. Finally, another algorithm is presented that compares tandem repeat sequences by aligning them with respect to their possible repeat histories. We describe an evolutionary model including tandem duplications and excisions, and give an exact algorithm to compare two sequences under this model.

Publications at Bielefeld University

Integration of genomic data to study genome evolution in plants

Author: Proost Sebastian
Publication venue: Ghent University. Faculty of Sciences
Publication date: 01/01/2012

Ghent University Academic Bibliography

A polyhedral approach to sequence alignment problems

Author: Reinert Knut
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 23/09/2004

We study two problems in sequence alignment both from a theoretical and a practical point of view. For the first time in sequence alignment, we use tools from combinatorial optimization to develop branch-and-cut algorithms that solve these problems efficiently. The Generalized Maximum Trace formulation captures several forms of multiple sequence alignment problems in a common framework, among them the original formulation of Maximum Trace. The Structural Maximum Trace Problem captures the comparison of RNA molecules on the basis of their primary sequence and their secondary structure. For both problems we derive a characterization in terms of graphs which we use to reformulate the problems in terms of integer linear programs. We then study the polytopes (or convex hulls of all feasible solutions) associated with the integer linear program for both problems. For each polytope we derive several classes of facet-defining inequalities and show that for some of these classes the corresponding separation problem can be solved in polynomial time. This leads to a polynomial time algorithm for pairwise sequence alignment that is not based on dynamic programming. Moreover, for multiple sequences the branch-and-cut algorithms for both sequence alignment problems are able to solve to optimality instances that are beyond the range of present dynamic programming approaches.

Universaar

Acronym

A polyhedral approach to sequence alignment problems

Author: Reinert Knut
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/1999

CiteSeerX

MPG.PuRe