Search CORE

13 research outputs found

Inverse parametric sequence alignment

Author: Sun Fangting
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2002
Field of study

We consider the inverse parametric sequence alignment problem, where a sequence alignment is given and the task is to determine parameter values such that the given alignment is optimal at that parameter setting. We describe a O(mn logn)-time algorithm for inverse global alignment without gap penalties and a O(mn login) time algorithm for global alignment with gap penalties, where m, n (n[Less than or equal to symbol]m) are the lengths of input strings. Finally, we discuss the local alignment problem and future work

Digital Repository @ Iowa State University (ISU)

Viterbi Sequences and Polytopes

Author: Kuo Eric H.
Publication venue: 'Elsevier BV'
Publication date: 13/01/2005
Field of study

A Viterbi path of length n of a discrete Markov chain is a sequence of n+1 states that has the greatest probability of ocurring in the Markov chain. We divide the space of all Markov chains into Viterbi regions in which two Markov chains are in the same region if they have the same set of Viterbi paths. The Viterbi paths of regions of positive measure are called Viterbi sequences. Our main results are (1) each Viterbi sequence can be divided into a prefix, periodic interior, and suffix, and (2) as n increases to infinity (and the number of states remains fixed), the number of Viterbi regions remains bounded. The Viterbi regions correspond to the vertices of a Newton polytope of a polynomial whose terms are the probabilities of sequences of length n. We characterize Viterbi sequences and polytopes for two- and three-state Markov chains.Comment: 15 pages, 2 figures, to appear in Journal of Symbolic Computatio

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Parametric Inference for Biological Sequence Analysis

Author: B. Sturmfels
Gardiner-Garden
Gusfield
L. Pachter
Lander
Waterman
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2004
Field of study

One of the major successes in computational biology has been the unification, using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied towards these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems associated with different statistical models. This paper introduces the \emph{polytope propagation algorithm} for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.Comment: 15 pages, 4 figures. See also companion paper "Tropical Geometry of Statistical Models" (q-bio.QM/0311009

arXiv.org e-Print Archive

Parametric Alignment of Drosophila Genomes

Author: Dewey Colin
Huggins Peter
Pachter Lior
Sturmfels Bernd
Woods Kevin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2005
Field of study

The classic algorithms of Needleman--Wunsch and Smith--Waterman find a maximum a posteriori probability alignment for a pair hidden Markov model (PHMM). In order to process large genomes that have undergone complex genome rearrangements, almost all existing whole genome alignment methods apply fast heuristics to divide genomes into small pieces which are suitable for Needleman--Wunsch alignment. In these alignment methods, it is standard practice to fix the parameters and to produce a single alignment for subsequent analysis by biologists. Our main result is the construction of a whole genome parametric alignment of Drosophila melanogaster and Drosophila pseudoobscura. Parametric alignment resolves the issue of robustness to changes in parameters by finding all optimal alignments for all possible parameters in a PHMM. Our alignment draws on existing heuristics for dividing whole genomes into small pieces for alignment, and it relies on advances we have made in computing convex polytopes that allow us to parametrically align non-coding regions using biologically realistic models. We demonstrate the utility of our parametric alignment for biological inference by showing that cis-regulatory elements are more conserved between Drosophila melanogaster and Drosophila pseudoobscura than previously thought. We also show how whole genome parametric alignment can be used to quantitatively assess the dependence of branch length estimates on alignment parameters. The alignment polytopes, software, and supplementary material can be downloaded at http://bio.math.berkeley.edu/parametric/.Comment: 19 pages, 3 figure

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Caltech Authors

Parameters for accurate genome alignment

Author: A Morgulis
A Morgulis
A Schwartz
A Stark
B Paten
CH Yuh
CN Dewey
D Gusfield
D Karolchik
D States
DA Pollard
E Kim
EH Margulies
F Chiaromonte
G Benson
G Lunter
G Lunter
I Holmes
J Ruan
J Wang
JC Wootton
JE Janecka
JO Kriegs
JT Reese
KD Pruitt
KM Wong
LA Newberg
LE Carvalho
M Brudno
M Hamada
Martin C Frith
MC Frith
Michiaki Hamada
MS Waterman
Paul Horton
PP Gardner
R Durbin
RC Friedman
RK Bradley
S Karlin
S Kumar
S Miyazawa
S Schwartz
S Sheetlin
SF Altschul
SF Altschul
TJ Treangen
W Huang
WJ Kent
WJ Kent
YK Yu
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background Genome sequence alignments form the basis of much research. Genome alignment depends on various mundane but critical choices, such as how to mask repeats and which score parameters to use. Surprisingly, there has been no large-scale assessment of these choices using real genomic data. Moreover, rigorous procedures to control the rate of spurious alignment have not been employed. Results We have assessed 495 combinations of score parameters for alignment of animal, plant, and fungal genomes. As our gold-standard of accuracy, we used genome alignments implied by multiple alignments of proteins and of structural RNAs. We found the HOXD scoring schemes underlying alignments in the UCSC genome database to be far from optimal, and suggest better parameters. Higher values of the X-drop parameter are not always better. E-values accurately indicate the rate of spurious alignment, but only if tandem repeats are masked in a non-standard way. Finally, we show that γ-centroid (probabilistic) alignment can find highly reliable subsets of aligned bases. Conclusions These results enable more accurate genome alignment, with reliability measures for local alignments and for individual aligned bases. This study was made possible by our new software, LAST, which can align vertebrate genomes in a few hours <url>http://last.cbrc.jp/</url>.</p

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Simultaneous Identification of Duplications, Losses, and Lateral Gene Transfers

Author: Fei Deng
Lusheng Wang
Zhi-Zhong Chen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

Inverse Parametric Sequence Alignment

Author: A. Apostolico
B. Grunbaum
D. Gusfield
D. Gusfield
D. Gusfield
M. S. Waterman
M. S. Waterman
M. Vingron
N. Megiddo
N. Megiddo
W. Fitch
Publication venue
Publication date: 01/01/2002
Field of study

We consider the inverse parametric sequence alignment problem, where a sequence alignment is given and the task is to determine parameter values such that the given alignment is optimal at that parameter setting. We describe a O(rnn log m)-time algorithm for inverse global alignment without gaps and a O(rnn log 2 rn) time algorithm for global alignment with gaps, where rn, ( _ rn) are the lengths of input strings. We then discuss algorithms for local alignment

CiteSeerX

Crossref

Inverse parametric sequence alignment

Author: Sun Fangting
Publication venue
Publication date: 01/01/2002
Field of study

Digital Repository @ Iowa State University (ISU)