12 research outputs found

A fast algorithm for the constrained multiple sequence alignment problem

Author: Arslan Abdullah N.
He Dan
Ling Alan C. H.
Publication date: 01/01/2006

Given n strings S1, S2, ..., Sn and a pattern string P, the constrained multiple sequence alignment (CMSA) problem is to find an optimal multiple alignment of S1, S2, ..., Sn such that the alignment contains P. That is, the alignment matrix contains a sequence of columns, each composed entirely of the symbol P[k] for every k, where P[k] is the kth symbol of P, 1 ≤ k ≤ |P|, and in this sequence the column containing P[i] appears before the column containing P[j] for all i < j. The problem is motivated by the comparison of multiple sequences that share a common structure or sequence pattern. There are O(2^n s1 s2 ... sn r)-time dynamic programming algorithms for the problem, where s1, s2, ..., sn and r are the lengths of the input strings and of the pattern string, respectively. These algorithms are of limited practical use when the number of sequences is large or the sequences are long, because of the impractically long running time they require. We present a new algorithm whose worst-case time complexity is also O(2^n s1 s2 ... sn r), but which avoids redundant computations performed by existing dynamic programming solutions. Experiments on both randomly generated strings and real data show that this algorithm is much faster than the existing algorithms. We present an analysis that explains the speed-up our algorithm obtains in our experiments over the naive dynamic programming algorithm for constrained multiple sequence alignment of protein sequences. The speed-up is more significant when the pattern is long or n is large. For example, in the case of constrained pairwise sequence alignment (the CMSA problem with n = 2), when the pattern is sufficiently long for strings S1 and S2, the asymptotic time complexity is observed to be O(s1 s2) instead of O(s1 s2 r). The main ideas of our algorithm can also be used in other constrained sequence alignment problems.

University of Szeged
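
For the pairwise case (n = 2) mentioned in the abstract above, the naive O(s1 s2 r) dynamic program can be sketched as follows. This is an illustrative baseline only, not the faster algorithm of the paper; the function name `constrained_pairwise_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example, and only the optimal score is returned, not the alignment itself.

```python
NEG_INF = float("-inf")


def constrained_pairwise_score(s1, s2, p, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of s1 and s2 whose alignment contains p.

    "Contains p" means there are columns (p[0], p[0]), (p[1], p[1]), ...
    appearing in that order. Returns -inf if no such alignment exists.
    The scoring scheme is a hypothetical choice for this sketch.
    """
    n1, n2, r = len(s1), len(s2), len(p)
    # d[k][i][j]: best score of aligning s1[:i] with s2[:j]
    # such that the alignment already contains p[:k].
    d = [[[NEG_INF] * (n2 + 1) for _ in range(n1 + 1)] for _ in range(r + 1)]
    d[0][0][0] = 0
    for k in range(r + 1):
        for i in range(n1 + 1):
            for j in range(n2 + 1):
                if i == 0 and j == 0:
                    continue  # base cases are already set
                best = NEG_INF
                if i > 0:  # s1[i-1] aligned with a gap
                    best = max(best, d[k][i - 1][j] + gap)
                if j > 0:  # s2[j-1] aligned with a gap
                    best = max(best, d[k][i][j - 1] + gap)
                if i > 0 and j > 0:
                    same = s1[i - 1] == s2[j - 1]
                    sub = match if same else mismatch
                    # ordinary column, pattern progress unchanged
                    best = max(best, d[k][i - 1][j - 1] + sub)
                    # column that supplies pattern symbol p[k-1]
                    if k > 0 and same and s1[i - 1] == p[k - 1]:
                        best = max(best, d[k - 1][i - 1][j - 1] + sub)
                d[k][i][j] = best
    return d[r][n1][n2]
```

The table has (r + 1)(s1 + 1)(s2 + 1) cells with constant work per cell, which is where the O(s1 s2 r) bound for n = 2 comes from. An infeasible constraint, such as pattern "AB" for the strings "AB" and "BA", yields a score of negative infinity.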

Constrained Longest Common Subsequence Computing Algorithms in Practice

Author: Deorowicz Sebastian
Obstój Joanna
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012

The problem of finding a constrained longest common subsequence (CLCS) of sequences A and B with respect to a sequence P was introduced recently. The goal is to find a longest common subsequence C of A and B such that P is a subsequence of C. Several algorithms solve the CLCS problem, but no real experimental comparison of them exists. The paper has two aims. First, we propose an improvement to the algorithms of Chin et al. and Deorowicz, based on the entry-exit points technique of He and Arslan. Second, we experimentally compare the existing algorithms for solving the CLCS problem.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)
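
The CLCS definition in the abstract above can be illustrated with a straightforward cubic-time dynamic program. This is a minimal sketch of the textbook-style recurrence over prefixes of A, B and P, not the improved algorithms the paper proposes or compares; the function name `clcs_length` is a hypothetical choice, and the function returns only the length (or `None` when no common subsequence contains P).

```python
NEG_INF = float("-inf")


def clcs_length(a, b, p):
    """Length of a longest common subsequence of a and b that has p
    as a subsequence, or None if no such common subsequence exists."""
    n, m, r = len(a), len(b), len(p)
    # t[k][i][j]: length of the longest common subsequence of a[:i] and
    # b[:j] that contains p[:k] as a subsequence (-inf if impossible).
    t = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(r + 1)]
    # With no constraint (k = 0) every cell starts at 0, so the borders
    # (empty prefix of a or b) hold the correct value.
    for i in range(n + 1):
        for j in range(m + 1):
            t[0][i][j] = 0
    for k in range(r + 1):
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # skip the last symbol of a[:i] or of b[:j]
                best = max(t[k][i - 1][j], t[k][i][j - 1])
                if a[i - 1] == b[j - 1]:
                    # use the common symbol without advancing in p
                    best = max(best, t[k][i - 1][j - 1] + 1)
                    # use the common symbol to cover p[k-1]
                    if k > 0 and a[i - 1] == p[k - 1]:
                        best = max(best, t[k - 1][i - 1][j - 1] + 1)
                t[k][i][j] = best
    result = t[r][n][m]
    return None if result == NEG_INF else result
```

With an empty P the function reduces to the ordinary longest common subsequence length. The running time and space are proportional to |A| · |B| · (|P| + 1).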

A Parallel GPU-Designed Algorithm for the Constrained Multiple Sequence Alignment Problem

Author: Adam Gudyś
Sebastian Deorowicz
Publication date: 24/04/2020

Modern graphics processing units (GPUs) offer much more computational power than modern CPUs, so it is natural that GPUs are often used to solve computationally intensive problems. One task of great importance in bioinformatics is sequence alignment. We investigate a variant, introduced a few years ago, in which an additional requirement is imposed on the alignment. We propose a parallel version of the Center-Star algorithm that computes the constrained multiple sequence alignment on the GPU. The speedup obtained over the serial CPU counterpart is in the range [20, 200].

CiteSeerX

Acta Cybernetica : Volume 17. Number 4.

Publication date: 01/01/2006

University of Szeged

Efficient constrained multiple sequence alignment with performance guarantee

Author: Chin FYL
Ho NL
Lam TW
Wong PWH
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2005

The constrained multiple sequence alignment problem is to align a set of sequences of maximum length n subject to a given constrained sequence, which arises from some knowledge of the structure of the sequences. This paper presents new algorithms for this problem, which are more efficient in terms of time and space (memory) than the previous algorithms [15], and with a worst-case guarantee on the quality of the alignment. Saving the space requirement by a quadratic factor is particularly significant, as the previous O(n^4)-space algorithm has limited application due to its huge memory requirement. Experiments on real data sets confirm that our new algorithms show improvements in both alignment quality and resource requirements. © Imperial College Press.

HKU Scholars Hub

Efficient constrained multiple sequence alignment with performance guarantee.

Author: Chan MY
Chin FY
Ho NL
Lam TW
Wong PW
Publication date: 01/01/2003

The Constrained Multiple Sequence Alignment problem is to align a set of sequences subject to a given constrained sequence, which arises from some knowledge of the structure of the sequences. This paper presents new algorithms for this problem, which are more efficient in terms of time and space (memory) than the previous algorithms [14], and with a worst-case guarantee on the quality of the alignment. Saving the space requirement by a quadratic factor is particularly significant, as the previous O(n^4)-space algorithm has limited application due to its huge memory requirement. Experiments on real data sets confirm that our new algorithms show improvements in both alignment quality and resource requirements.

HKU Scholars Hub

Efficient Constrained Multiple Sequence Alignment with Performance Guarantee

Author: Francis Y. L. Chin
N. L. Ho
T. W. Lam
Prudence W. H. Wong
M. Y. Chan

The Constrained Multiple Sequence Alignment problem is to align a set of sequences subject to a given constrained sequence, which arises from some knowledge of the structure of the sequences. This paper presents new algorithms for this problem, which are more efficient in terms of time and space (memory) than the previous algorithms [14], and with a worst-case guarantee on the quality of the alignment. Saving the space requirement by a quadratic factor is particularly significant, as the previous O(n^4)-space algorithm has limited application due to its huge memory requirement. Experiments on real data sets confirm that our new algorithms show improvements in both alignment quality and resource requirements.

CiteSeerX

EFFICIENT CONSTRAINED MULTIPLE SEQUENCE ALIGNMENT WITH PERFORMANCE GUARANTEE

Author: Francis Y. L. Chin
N. L. Ho
T. W. Lam
Prudence W. H. Wong
Publication venue: 'World Scientific Pub Co Pte Lt'

Crossref