Pairwise alignment incorporating dipeptide covariation

Author: Altschul
Bailey
Bishop
Brenner
Cline
Crooks
DOOLITTLE
Frith
Fukami-Kobayashi
G. E. Crooks
Goldman
Gonnet
Henikoff
Jung
Karplus
Lin
Muller
Murzin
Park
Pearson
R. E. Green
RODIONOV
S. E. Brenner
Sander
Smith
Thorne
Topham
Weiss
Zachariah
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/07/2005

Motivation: Standard algorithms for pairwise protein sequence alignment make the simplifying assumption that amino acid substitutions at neighboring sites are uncorrelated. This assumption allows implementation of fast algorithms for pairwise sequence alignment, but it ignores information that could conceivably increase the power of remote homolog detection. We examine the validity of this assumption by constructing extended substitution matrices that encapsulate the observed correlations between neighboring sites, by developing an efficient and rigorous algorithm for pairwise protein sequence alignment that incorporates these local substitution correlations, and by assessing the ability of this algorithm to detect remote homologies. Results: Our analysis indicates that local correlations between substitutions are not strong on average. Furthermore, incorporating local substitution correlations into pairwise alignment did not lead to a statistically significant improvement in remote homology detection. Therefore, the standard assumption that individual residues within protein sequences evolve independently of neighboring positions appears to be an efficient and appropriate approximation.
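As a point of reference for the site-independence assumption mentioned in this abstract, the following is a minimal sketch of a standard local alignment score in which every aligned pair of residues is scored on its own, with no term linking neighboring sites. It is not the paper's dipeptide-aware algorithm. The function name `local_alignment_score` and the match, mismatch, and gap values are illustrative assumptions; a real protein aligner would use a substitution matrix such as BLOSUM and affine gap costs.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with per-site scoring (illustrative values).

    Each cell depends only on the residue pair (a[i-1], b[j-1]), which is
    exactly the "neighboring sites are uncorrelated" assumption.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            pair_score = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                         # start a new local alignment
                prev[j - 1] + pair_score,  # align a[i-1] with b[j-1]
                prev[j] + gap,             # gap in b
                curr[j - 1] + gap,         # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(local_alignment_score("ACGT", "ACT"))
```

Because the score is a sum of independent per-site terms, the recurrence needs only the previous row, which is what makes the standard algorithm fast.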

arXiv.org e-Print Archive

Crossref

Laboratory Bounds on Electron Lorentz Violation

Author: Brett Altschul
F. A. Aharonian
R. Assmann
S. Herrmann
Publication venue: 'American Physical Society (APS)'
Publication date: 17/05/2010

Violations of Lorentz boost symmetry in the electron and photon sectors can be constrained by studying several different high-energy phenomena. Although they may not lead to the strongest bounds numerically, measurements made in terrestrial laboratories produce the most reliable results. Laboratory bounds can be based on observations of synchrotron radiation, as well as the observed absence of vacuum Cerenkov radiation. Using measurements of synchrotron energy losses at LEP and the survival of TeV photons, we place new bounds on the three electron Lorentz violation coefficients c_(TJ), at the 3 x 10^(-13) to 6 x 10^(-15) levels.

arXiv.org e-Print Archive

Crossref

Scholar Commons - Institutional Repository of the University of South Carolina

Back-translation for discovering distant protein homologies

Author: A. Pedersen
B. Oostra
C. Kosiol
J. Leluk
J. Raes
K. Okamura
L. Arvestad
L. Delaye
M. Clamp
M. Pellegrini
P. Harrison
P. Lio
R. Blake
S. Altschul
Y. Hahn
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, homology detection becomes difficult even at the DNA level. To cope with this situation, we propose a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences that have the best-scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. This allows us to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples. Comment: The 9th International Workshop on Algorithms in Bioinformatics (WABI), Philadelphia, United States of America (2009)

arXiv.org e-Print Archive

CiteSeerX

HAL - Lille 3

Crossref

INRIA a CCSD electronic archive server

HAL: Hyper Article en Ligne

Clustering with shallow trees

Author: A Braunstein
A Flaxman
Altschul S F
L Foini
M Bailly-Bechet
R Zecchina
S Bradde
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009

We propose a new method for hierarchical clustering based on the optimisation of a cost function over trees of limited depth, and we derive a message-passing method that solves it efficiently. The method and algorithm can be interpreted as a natural interpolation between two well-known approaches, namely single linkage and the recently presented Affinity Propagation. We use this general scheme to analyze three structured biological/medical datasets (human population based on genetic information, proteins based on sequences, and verbal autopsies) and show that the interpolation technique provides new insight.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Genetic Correlations in Mutation Processes

Author: A. S. Lapedes
B. G. Giraud
E. Ben-Naim
H. E. Stanley
H. Kishino
M. S. Waterman
R. Durbin
R. Levine
S. Altschul
T. E. Harris
W. Li
Publication venue: 'American Physical Society (APS)'
Publication date: 10/12/1998

We study the role of phylogenetic trees in correlations in mutation processes. Generally, correlations decay exponentially with the generation number. We find that two distinct regimes of behavior exist. For mutation rates smaller than a critical rate, the underlying tree morphology is almost irrelevant, while mutation rates higher than this critical rate lead to strong tree-dependent correlations. We show analytically that identical critical behavior underlies all multiple-point correlations. This behavior generally characterizes branching processes undergoing mutation.

arXiv.org e-Print Archive

Crossref

Non-local on-shell field redefinition for the SME

Author: B. Altschul
C. Itzykson
D. Mattingly
D. F. Phillips
M. S. Berger
N. E. Mavromatos
Ralf Lehnert
V. A. Kostelecký
Publication venue: 'American Physical Society (APS)'
Publication date: 22/09/2006

This work instigates a study of non-local field mappings within the Lorentz- and CPT-violating Standard-Model Extension (SME). An example of such a mapping is constructed explicitly, and the conditions for the existence of its inverse are investigated. It is demonstrated that the associated field redefinition can remove b-type Lorentz violation from free SME fermions in certain situations. These results are employed to obtain explicit expressions for the corresponding Lorentz-breaking momentum-space eigenspinors and their orthogonality relations.

arXiv.org e-Print Archive

Crossref

CERN Document Server

Simplified amino acid alphabets based on deviation of conditional probability from random background

Author: A. Godzik
A.G. Murzin
C.E. Schafmeister
D.S. Riddle
Di Liu
H.S. Chan
J. Wang
Ji Qi
K.W. Plaxco
L.R. Murphy
M. Munson
S. Henikoff
S. Miyazawa
S.E. Brenner
S.F. Altschul
Wei-Mou Zheng
Xin Liu
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2002

The primitive data for deducing the Miyazawa-Jernigan contact energy or BLOSUM score matrix consists of pair frequency counts. Each amino acid corresponds to a conditional probability distribution. Based on the deviation of such conditional probability from random background, a scheme for reduction of the amino acid alphabet is proposed. A clear discrepancy is observed between the reduced alphabets obtained from the raw Miyazawa-Jernigan and BLOSUM residue pair counts. Taking the homologous sequence database SCOP40 as a test set, we detect homology with the obtained coarse-grained substitution matrices. The reduced alphabets are verified to preserve well the information contained in the original 20-letter alphabet.

arXiv.org e-Print Archive

Crossref

CERN Document Server

Bethe Ansatz in the Bernoulli Matching Model of Random Sequence Alignment

Author: A. M. Vershik
D. Gusfield
D. Sankoff
J. M. Hammersley
Kirone Mallick
M. Ablowitz
M. S. Waterman
R. Dubrin
R. J. Baxter
R. Wagner
S. F. Altschul
S. M. Ulam
Satya N. Majumdar
Sergei Nechaev
V. Dancik
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2008

For the Bernoulli Matching model of the sequence alignment problem we apply the Bethe ansatz technique via an exact mapping to the 5-vertex model on a square lattice. Considering the terrace-like representation of the sequence alignment problem, we reproduce by the Bethe ansatz the results for the averaged length of the Longest Common Subsequence in the Bernoulli approximation. In addition, we compute the average number of nucleation centers of the terraces.
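To make the quantity in this abstract concrete, here is a minimal numerical sketch of the Longest Common Subsequence recurrence in which each cell of the comparison grid is declared a match independently with probability `p`, which is the usual reading of the Bernoulli approximation. It is a plain dynamic program, not the Bethe ansatz calculation the paper performs. The function name `bernoulli_matching_lcs` and the parameters `n`, `m`, `p`, and `seed` are assumptions introduced for illustration.

```python
import random


def bernoulli_matching_lcs(n, m, p, seed=0):
    """LCS-style length on an n x m grid with independent random matches.

    Assumption: cell (i, j) is a match with probability p, independently of
    all other cells, instead of being derived from two actual sequences.
    """
    rng = random.Random(seed)
    prev = [0] * (m + 1)
    for _ in range(n):
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            is_match = 1 if rng.random() < p else 0
            curr[j] = max(prev[j], curr[j - 1], prev[j - 1] + is_match)
        prev = curr
    return prev[m]


# Average over a few seeds to estimate the mean length for one setting.
lengths = [bernoulli_matching_lcs(200, 200, p=0.25, seed=s) for s in range(5)]
print(sum(lengths) / len(lengths))
```

Averaging over many seeds and grid sizes gives an empirical estimate that can be compared against analytical results for the averaged length.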

arXiv.org e-Print Archive

Crossref

HAL-CEA

HAL: Hyper Article en Ligne

Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies

Author: A Kraskov
A Milosavljević
G Navarro
J Felsenstein
J Lake
J Rissanen
J Thompson
J Varre
Konrad Scheffler
L Allison
M Brudno
M Cao
M Li
M Mahoney
M Nei
M Steel
Maya Paczuski
N Bray
N Saitou
Orion Penner
P Buneman
P Lockhart
P Viola
Peter Grassberger
R Cilibrasi
R Durbin
S Altschul
S McGinnis
S Vinga
T Cover
T Lassmann
W Press
X Chen
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 19/08/2010

Existing sequence alignment algorithms use heuristic scoring schemes which cannot be used as objective distance metrics. Therefore one relies on measures like the p- or log-det distances, or makes explicit, and often simplistic, assumptions about sequence evolution. Information theory provides an alternative, in the form of mutual information (MI) which is, in principle, an objective and model-independent similarity measure. MI can be estimated by concatenating and zipping sequences, thereby yielding the "normalized compression distance". So far this has produced promising results, but with uncontrolled errors. We describe a simple approach to get robust estimates of MI from global pairwise alignments. Using standard alignment algorithms, this gives for animal mitochondrial DNA estimates that are strikingly close to estimates obtained from the alignment-free methods mentioned above. Our main result uses algorithmic (Kolmogorov) information theory, but we show that similar results can also be obtained from Shannon theory. Because it is not additive, normalized compression distance is not an optimal metric for phylogenetics, but we propose a simple modification that overcomes the issue of additivity. We test several versions of our MI-based distance measures on a large number of randomly chosen quartets and demonstrate that they all perform better than traditional measures like the Kimura or log-det (resp. paralinear) distances. Even a simplified version based on single-letter Shannon entropies, which can be easily incorporated in existing software packages, gave superior results throughout the entire animal kingdom. But we see the main virtue of our approach in a more general way. For example, it can also help to judge the relative merits of different alignment algorithms, by estimating the significance of specific alignments.
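The "concatenating and zipping" estimate mentioned in this abstract can be sketched as follows. This is a minimal illustration of the normalized compression distance in its commonly cited form, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed length. Using `zlib` as the compressor is an assumption made here for convenience, and the helper names `compressed_size` and `ncd` are hypothetical. It does not implement the alignment-based MI estimates or the additivity fix that the paper proposes.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # "concatenate and zip"
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic DNA-like sequences, generated only for illustration.
rng = random.Random(0)
seq_a = "".join(rng.choice("ACGT") for _ in range(2000)).encode()
seq_b = "".join(rng.choice("ACGT") for _ in range(2000)).encode()

print(ncd(seq_a, seq_a), ncd(seq_a, seq_b))
```

A sequence compared with itself should score much lower than two unrelated random sequences, since the compressor can reuse the first copy when encoding the second. Real compressors only approximate Kolmogorov complexity, which is one source of the uncontrolled errors the abstract refers to.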

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

IMT Institutional Repository

The Francis Crick Institute