
65,367 research outputs found

Pairwise alignment incorporating dipeptide covariation

Author: Altschul, Bailey, Bishop, Brenner, Cline, Crooks, Doolittle, Frith, Fukami-Kobayashi, G. E. Crooks, Goldman, Gonnet, Henikoff, Jung, Karplus, Lin, Muller, Murzin, Park, Pearson, R. E. Green, Rodionov, S. E. Brenner, Sander, Smith, Thorne, Topham, Weiss, Zachariah
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/07/2005

Motivation: Standard algorithms for pairwise protein sequence alignment make the simplifying assumption that amino acid substitutions at neighboring sites are uncorrelated. This assumption allows implementation of fast algorithms for pairwise sequence alignment, but it ignores information that could conceivably increase the power of remote homolog detection. We examine the validity of this assumption by constructing extended substitution matrices that encapsulate the observed correlations between neighboring sites, by developing an efficient and rigorous algorithm for pairwise protein sequence alignment that incorporates these local substitution correlations, and by assessing the ability of this algorithm to detect remote homologies. Results: Our analysis indicates that local correlations between substitutions are not strong on average. Furthermore, incorporating local substitution correlations into pairwise alignment did not lead to a statistically significant improvement in remote homology detection. Therefore, the standard assumption that individual residues within protein sequences evolve independently of neighboring positions appears to be an efficient and appropriate approximation.

arXiv.org e-Print Archive

Crossref

Multiple sequence alignment based on set covers

Author: A. Bahr, B. Manthey, B. Morgenstern, C. Notredame, D. Gusfield, G. Vogt, J.D. Thompson, K. Katoh, O. Gotoh, P. Zhao, R.E. Green, R.F. Smith, S. Henikoff, T. Müller, T.P. Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2004

We introduce a new heuristic for the multiple alignment of a set of sequences. The heuristic is based on a set cover of the residue alphabet of the sequences, and also on the determination of a significant set of blocks comprising subsequences of the sequences to be aligned. These blocks are obtained with the aid of a new data structure, called a suffix-set tree, which is constructed from the input sequences with the guidance of the residue-alphabet set cover and generalizes the well-known suffix tree of the sequence set. We provide performance results on selected BAliBASE amino-acid sequences and compare them with those yielded by some prominent approaches.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Optimally fast incremental Manhattan plane embedding and planar tight span construction

Author: David Eppstein
Publication date: 01/01/2011

We describe a data structure, a rectangular complex, that can be used to represent hyperconvex metric spaces that have the same topology (although not necessarily the same distance function) as subsets of the plane. We show how to use this data structure to construct the tight span of a metric space given as an n × n distance matrix, when the tight span is homeomorphic to a subset of the plane, in time O(n^2), and to add a single point to a planar tight span in time O(n). As an application of this construction, we show how to test whether a given finite metric space embeds isometrically into the Manhattan plane in time O(n^2), and add a single point to the space and re-test whether it has such an embedding in time O(n). Comment: 39 pages, 15 figures

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Journal of Computational Geometry (JoCG - Carleton University, Computational Geometry Lab)

Higher accuracy protein Multiple Sequence Alignment by Stochastic Algorithm

Author: Alpana Dey, Justin Jose, Krishna Kant, M. S. Jeevitesh, Narayan Behera
Publication date: 03/03/2010

Multiple Sequence Alignment gives insight into evolutionary, structural and functional relationships among proteins. Here, a novel Protein Alignment by Stochastic Algorithm (PASA) is developed. Evolutionary operators of a genetic algorithm, namely mutation and selection, are utilized in combining the output of two of the most important sequence alignment programs and then developing an optimized new algorithm. Efficiency of protein alignments is evaluated in terms of the Total Column score, which is equal to the number of correctly aligned columns between a test alignment and the reference alignment divided by the total number of columns in the reference alignment. The PASA optimizer achieves, on average, significantly better alignments than the well-known individual bioinformatics tools. This PASA is statistically the most accurate protein alignment method today. It can have potential applications in drug discovery processes in the biotechnology industry.

Nature Precedings
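
The Total Column score defined in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the PASA paper: the function names, the column representation (a tuple of residue indices, with `None` for a gap), and the toy alignments are all assumptions made for illustration.

```python
def alignment_columns(rows, gap="-"):
    """Convert aligned rows into columns of residue indices (None marks a gap).

    Hypothetical helper: the column encoding is an assumption, not the paper's.
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    counters = [0] * len(rows)  # next residue index for each sequence
    columns = []
    for j in range(len(rows[0])):
        column = []
        for i, row in enumerate(rows):
            if row[j] == gap:
                column.append(None)
            else:
                column.append(counters[i])
                counters[i] += 1
        columns.append(tuple(column))
    return columns


def total_column_score(test_rows, reference_rows):
    """Correctly aligned columns divided by the number of reference columns."""
    reference = alignment_columns(reference_rows)
    if not reference:
        raise ValueError("reference alignment has no columns")
    test = set(alignment_columns(test_rows))
    correct = sum(1 for column in reference if column in test)
    return correct / len(reference)


# Toy example (made-up sequences): three of the five reference columns match.
reference_alignment = ["AC-GT", "ACTGT"]
test_alignment = ["ACG-T", "ACTGT"]
score = total_column_score(test_alignment, reference_alignment)
```

A column counts as correct here only if every sequence contributes the same residue (or a gap) as in the reference column; benchmark suites may apply further conventions, such as scoring only core regions, that this sketch omits.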

The Binary Space Partitioning-Tree Process

Author: Xuhui Fan, Bin Li, Scott Anthony Sisson
Publication date: 20/04/1988

The Mondrian process represents an elegant and powerful approach for space partition modelling. However, as it restricts the partitions to be axis-aligned, its modelling flexibility is limited. In this work, we propose a self-consistent Binary Space Partitioning (BSP)-Tree process to generalize the Mondrian process. The BSP-Tree process is an almost surely right continuous Markov jump process that allows uniformly distributed oblique cuts in a two-dimensional convex polygon. The BSP-Tree process can also be extended using a non-uniform probability measure to generate direction differentiated cuts. The process is also self-consistent, maintaining distributional invariance under a restricted subdomain. We use Conditional-Sequential Monte Carlo for inference using the tree structure as the high-dimensional variable. The BSP-Tree process's performance on synthetic data partitioning and relational modelling demonstrates clear inferential improvements over the standard Mondrian process and other related methods.

arXiv.org e-Print Archive

Trinity College

Quantum Hamiltonian reduction of W-algebras and category O

Author: Stephen Morgan
Publication date: 01/11/2014

W-algebras are a class of non-commutative algebras related to the classical universal enveloping algebras. They can be defined as a subquotient of U(g) related to a choice of nilpotent element e and compatible nilpotent subalgebra m. The definition is a quantum analogue of the classical construction of Hamiltonian reduction. We define a quantum version of Hamiltonian reduction by stages and use it to construct intermediate reductions between different W-algebras U(g,e) in type A. This allows us to express the W-algebra U(g,e') as a subquotient of U(g,e) for nilpotent elements e' covering e. It also produces a collection of (U(g,e),U(g,e'))-bimodules analogous to the generalised Gel'fand-Graev modules used in the classical definition of the W-algebra; these can be used to obtain adjoint functors between the corresponding module categories. The category of modules over a W-algebra has a full subcategory defined in a parallel fashion to that of the Bernstein-Gel'fand-Gel'fand (BGG) category O; this version of category O(e) for W-algebras is equivalent to an infinitesimal block of O by an argument of Miličić and Soergel. We therefore construct analogues of the translation functors between the different blocks of O, in this case being functors between the categories O(e) for different W-algebras U(g,e). This follows an argument of Losev, and realises the category O(e') as equivalent to a full subcategory of the category O(e) where e' is greater than e in the refinement ordering. Comment: University of Toronto PhD thesis, defended July 2014, 57 pages

arXiv.org e-Print Archive

University of Toronto Research Repository

A Two-Phase Dynamic Programming Algorithm Tool for DNA Sequences

Author: S. Fatumo, O. J. Oyelade
Publication date: 01/08/2013

Sequence alignment has to do with the arrangement of DNA, RNA, and protein sequences to identify areas of similarity. Technically, it involves the arrangement of the primary sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. If two sequences in an alignment share a common ancestor, mismatches can be interpreted as mutations, and gaps as insertions. Such information becomes of great use in vital areas such as the study of diseases, genomics and generally in the biological sciences. Thus, sequence alignment presents not just an exciting field of study, but a field of great importance to mankind. In this light, we extensively studied about seventy (70) existing sequence alignment tools available to us. Most of these tools are not user friendly and cannot be used by biologists. The few tools that attempted both Local and Global algorithms are not readily and freely available. We therefore implemented a sequence alignment tool (CU-Aligner) in an understandable, user-friendly and portable way, with click-of-a-button simplicity. This is done utilizing the Needleman-Wunsch and Smith-Waterman algorithms for global and local alignments, respectively, with a primary focus on DNA sequences. Our aligner is implemented in the Java language in both application and applet mode and has run efficiently on all Windows operating systems.

Covenant University Repository
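
The two dynamic programming algorithms named in the abstract above, Needleman-Wunsch (global) and Smith-Waterman (local), can be sketched as follows. This is an illustrative Python sketch, not CU-Aligner's Java code: the function names, the linear gap penalty, and the default scoring values are assumptions, and only the optimal score is computed (no traceback).

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score with a linear gap penalty (assumed scoring)."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Optimal local alignment score: cells are floored at zero, best cell wins."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Toy DNA inputs (made up for illustration).
global_score = needleman_wunsch_score("ACGT", "AGT")
local_score = smith_waterman_score("TTTACGTTT", "GGACGGG")
```

Both functions keep only two rows of the dynamic programming table, which is enough for the score; recovering the alignment itself would require storing the full table or traceback pointers.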