7 research outputs found

Summation test for gap penalties and strong law of the local alignment score

Author: Chan Hock Peng
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 12/05/2005

A summation test is proposed to determine admissible types of gap penalties for logarithmic growth of the local alignment score. We also define a converging sequence of log moment generating functions that provide the constants associated with the large deviation rate and logarithmic strong law of the local alignment score and the asymptotic number of matches in the optimal local alignment. Comment: Published at http://dx.doi.org/10.1214/105051605000000061 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
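To make the quantity in this abstract concrete, here is a minimal Python sketch of a local alignment score computed under a caller-supplied gap penalty function. It is not taken from the paper: the function name `local_alignment_score`, the match and mismatch scores, and the example penalty are assumptions for illustration, and the recurrence is the textbook cubic-time one for general gap penalties. The summation test itself is not reproduced here.

```python
def local_alignment_score(a, b, gap_penalty, match=1, mismatch=-1):
    """Best local alignment score of a and b.

    gap_penalty(k) is the (positive) cost of one gap of length k >= 1.
    All scoring values are illustrative assumptions, not from the paper.
    """
    n, m = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [0, H[i - 1][j - 1] + s]
            # a gap of length k that consumes k letters of a
            candidates += [H[i - k][j] - gap_penalty(k) for k in range(1, i + 1)]
            # a gap of length k that consumes k letters of b
            candidates += [H[i][j - k] - gap_penalty(k) for k in range(1, j + 1)]
            H[i][j] = max(candidates)
            best = max(best, H[i][j])
    return best


# Hypothetical affine-style penalty: 1 to open plus 0.5 per gapped letter.
score = local_alignment_score("AAAATTTT", "AAAACTTTT", lambda k: 1 + 0.5 * k)
```

Because the penalty is an arbitrary function of the gap length, the same sketch can be used to compare how the score of random sequences grows under different penalty shapes, which is the kind of question the abstract addresses.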

arXiv.org e-Print Archive

Crossref

Comparative Analysis of Cyclic Sequences: Viroids and other Small Circular RNAs

Authors: Hofacker Ivo L.; Mosig Axel; Stadler Peter F.
Publication date: 25/10/2018

The analysis of small circular sequences requires specialized tools. While the differences between linear and circular sequences can be neglected in the case of long molecules such as bacterial genomes since in practice all analysis is performed in sequence windows, this is not true for viroids and related sequences which are usually only a few hundred basepairs long. In this contribution we present basic algorithms and corresponding software for circular RNAs. In particular, we discuss the problem of pairwise and multiple cyclic sequence alignments with affine gap costs, and an extension of a recent approach to circular RNA folding to the computation of consensus structures
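As a rough illustration of the pairwise cyclic alignment problem mentioned above, the following Python sketch scores two circular sequences by trying every rotation of one of them against a standard three-matrix global alignment with affine gap costs. This is a brute-force baseline under assumed cost values, not the authors' algorithm or software. The names `affine_global_cost` and `cyclic_alignment_cost` are hypothetical, and a gap that crosses the chosen cut point is charged as two gaps.

```python
INF = float("inf")


def affine_global_cost(a, b, mismatch=1, gap_open=2, gap_extend=1):
    """Minimum global alignment cost; a gap of length k costs gap_open + k * gap_extend."""
    n, m = len(a), len(b)
    # M: last column pairs two letters; X: letter of a against a gap;
    # Y: letter of b against a gap.
    M = [[INF] * (m + 1) for _ in range(n + 1)]
    X = [[INF] * (m + 1) for _ in range(n + 1)]
    Y = [[INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = sub + min(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = gap_extend + min(M[i - 1][j] + gap_open, X[i - 1][j],
                                       Y[i - 1][j] + gap_open)
            Y[i][j] = gap_extend + min(M[i][j - 1] + gap_open, Y[i][j - 1],
                                       X[i][j - 1] + gap_open)
    return min(M[n][m], X[n][m], Y[n][m])


def cyclic_alignment_cost(a, b, **costs):
    """Best cost over all rotations of b, with a cut at its position 0."""
    if not b:
        return affine_global_cost(a, b, **costs)
    return min(affine_global_cost(a, b[r:] + b[:r], **costs) for r in range(len(b)))
```

For example, `cyclic_alignment_cost("ACGU", "GUAC")` finds the rotation that makes the two strings identical, whereas a linear alignment of the same strings has a positive cost. Trying all rotations multiplies the running time by the sequence length, which is acceptable only for short molecules such as the viroids the abstract has in mind.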

Qucosa - Publikationsserver der Universität Leipzig

Alignments of mitochondrial genome arrangements: Applications to metazoan phylogeny

Authors: Fritzsch Guido; Schlegel Martin; Stadler Peter F.
Publication date: 07/01/2019

Mitochondrial genomes provide a valuable dataset for phylogenetic studies, in particular of metazoan phylogeny because of the extensive taxon sample that is available. Beyond the traditional sequence-based analysis it is possible to extract phylogenetic information from the gene order. Here we present a novel approach utilizing these data based on cyclic list alignments of the gene orders. A progressive alignment approach is used to combine pairwise list alignments into a multiple alignment of gene orders. Parsimony methods are used to reconstruct phylogenetic trees, ancestral gene orders, and consensus patterns in a straightforward approach. We apply this method to study the phylogeny of protostomes based exclusively on mitochondrial genome arrangements. We, furthermore, demonstrate that our approach is also applicable to the much larger genomes of chloroplasts

Qucosa - Publikationsserver der Universität Leipzig

Progressive Multiple Sequence Alignments from Triplets

Authors: Kruspe Matthias; Stadler Peter F.
Publication date: 14/12/2018

Motivation: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps, since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. Idea: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by stepwise replacement of triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the “once a gap, always a gap” problem of progressive alignment procedures. Results: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools, even though our software does not include heuristics for context-dependent (mis)match scores.
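The exact three-way dynamic programming step described in this abstract can be sketched as follows. This is a minimal Python illustration that assumes a sum-of-pairs score with a linear gap cost; it is not the aln3nn implementation, and the function name `three_way_score` and all scoring values are hypothetical.

```python
from itertools import product


def three_way_score(a, b, c, match=1, mismatch=-1, gap=-1):
    """Optimal sum-of-pairs score of a global alignment of three sequences."""

    def pair(x, y):
        # Score one pair of symbols within a column; None stands for a gap.
        if x is None and y is None:
            return 0
        if x is None or y is None:
            return gap
        return match if x == y else mismatch

    la, lb, lc = len(a), len(b), len(c)
    D = {(0, 0, 0): 0}
    # product() yields cells in lexicographic order, so every predecessor
    # of a cell has been filled in before the cell itself is visited.
    for i, j, k in product(range(la + 1), range(lb + 1), range(lc + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = float("-inf")
        # Seven column types: each sequence contributes a letter or a gap,
        # excluding the all-gap column.
        for di, dj, dk in product((0, 1), repeat=3):
            if (di, dj, dk) == (0, 0, 0):
                continue
            pi, pj, pk = i - di, j - dj, k - dk
            if pi < 0 or pj < 0 or pk < 0:
                continue
            x = a[pi] if di else None
            y = b[pj] if dj else None
            z = c[pk] if dk else None
            column = pair(x, y) + pair(x, z) + pair(y, z)
            best = max(best, D[(pi, pj, pk)] + column)
        D[(i, j, k)] = best
    return D[(la, lb, lc)]
```

The table has one cell per triple of prefix lengths, so time and memory grow with the product of the three sequence lengths. That cost is why exact three-way alignment is practical for moderate lengths and why the abstract embeds it in a progressive scheme for larger inputs.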

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Qucosa - Publikationsserver der Universität Leipzig

A Sequence Alignment Algorithm with an Arbitrary Gap Penalty Function

Authors: Olsen R.; Saul F.A.
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Unveiling the Molecular Mechanisms Regulating the Activation of the ErbB Family Receptors at Atomic Resolution through Molecular Modeling and Simulations

Author: Shih Andrew
Publication venue: ScholarlyCommons
Publication date: 01/01/2011

The EGFR/ErbB/HER family of kinases contains four homologous receptor tyrosine kinases that are important regulatory elements in key signaling pathways. To elucidate the atomistic mechanisms of dimerization-dependent activation in the ErbB family, we have performed molecular dynamics simulations of the intracellular kinase domains of the four members of the ErbB family (those with known kinase activity), namely EGFR, ErbB2 (HER2) and ErbB4 (HER4) as well as ErbB3 (HER3), an assumed pseudokinase, in different molecular contexts: monomer vs. dimer, wildtype vs. mutant. Using bioinformatics and fluctuation analyses of the molecular dynamics trajectories, we relate sequence similarities to correspondence of specific bond-interaction networks and collective dynamical modes. We find that in the active conformation of the ErbB kinases (except ErbB3), key subdomain motions are coordinated through conserved hydrophilic interactions: activating bond-networks consisting of hydrogen bonds and salt bridges. The inactive conformations also demonstrate conserved bonding patterns (albeit less extensive) that sequester key residues and disrupt the activating bond network. Both conformational states have distinct hydrophobic advantages through context-specific hydrophobic interactions. The inactive ErbB3 kinase domain also shows coordinated motions similar to the active conformations, in line with recent evidence that ErbB3 is a weakly active kinase, though the coordination seems to arise from hydrophobic interactions rather than hydrophilic ones. We show that the functional (activating) asymmetric kinase dimer interface forces a corresponding change in the hydrophobic and hydrophilic interactions that characterize the inactivating interaction network, resulting in motion of the αC-helix through allostery. Several of the clinically identified activating kinase mutations of EGFR act in a similar fashion to disrupt the inactivating interaction network. 
Our molecular dynamics study reveals that the asymmetric dimer interface helps progress the ErbB family through the activation pathway using both hydrophilic and hydrophobic interactions. There is a fundamental difference in the sequence of events in EGFR activation compared with that described for the Src kinase Hck

ScholarlyCommons@Penn

Studying Evolutionary Change: Transdisciplinary Advances in Understanding and Measuring Evolution

Author: Retzlaff Nancy
Publication date: 20/04/2020

Evolutionary processes can be found in almost any historical, i.e. evolving, system that erroneously copies from the past. Well-studied examples originate not only in evolutionary biology but also in historical linguistics. Yet an approach that would bind together studies of such evolving systems is still elusive. This thesis is an attempt to narrow this gap to some extent. An evolving system can be described using characters that identify their changing features. While the problem of a proper choice of characters is beyond the scope of this thesis and remains in the hands of experts, we concern ourselves with some theoretical as well as data-driven approaches. Having a well-chosen set of characters describing a system of different entities such as homologous genes, i.e. genes of same origin in different species, we can build a phylogenetic tree. Consider the special case of gene clusters containing paralogous genes, i.e. genes of same origin within a species usually located closely, such as the well-known HOX cluster. These are formed by stepwise duplication of its members, often involving unequal crossing over forming hybrid genes. Gene conversion and possibly other mechanisms of concerted evolution further obfuscate phylogenetic relationships. Hence, it is very difficult or even impossible to disentangle the detailed history of gene duplications in gene clusters. Expanding gene clusters that use unequal crossing over as proposed by Walter Gehring leads to distinctive patterns of genetic distances. We show that this special class of distances still helps in extracting phylogenetic information from the data. Disregarding genome rearrangements, we find that the shortest Hamiltonian path then coincides with the ordering of paralogous genes in a cluster.
This observation can be used to detect ancient genomic rearrangements of gene clusters and to distinguish gene clusters whose evolution was dominated by unequal crossing over within genes from those that expanded through other mechanisms. While the evolution of DNA or protein sequences is well studied and can be formally described, we find that this does not hold for other systems such as language evolution. This is due to a lack of detectable mechanisms that drive the evolutionary processes in other fields. Hence, it is hard to quantify distances between entities, e.g. languages, and therefore the characters describing them. Starting out with distortions of distances, we first see that poor choices of the distance measure can lead to incorrect phylogenies. Given that phylogenetic inference requires additive metrics, we can infer the correct phylogeny from a distance matrix D if there is a monotonic, subadditive function ζ such that ζ^−1(D) is additive. We compute the metric-preserving transformation ζ as the solution of an optimization problem. This result shows that the problem of phylogeny reconstruction is well defined even if a detailed mechanistic model of the evolutionary process is missing. Yet, this does not hinder studies of language evolution using automated tools. As the number of large digital corpora available increased, so did the possibilities to study them automatically. The obvious parallels between historical linguistics and phylogenetics lead to many studies adapting bioinformatics tools to fit linguistic purposes. Here, we use jAlign to calculate bigram alignments, i.e. an alignment algorithm that operates with regard to adjacency of letters. Its performance is tested in different cognate recognition tasks. One major obstacle in using pairwise alignments is the systematic errors they make, such as underestimating gaps and misplacing them.
Applying multiple sequence alignments instead of a pairwise algorithm implicitly includes more evolutionary information and thus can overcome the problem of correct gap placement. They can be seen as a generalization of the string-to-string edit problem to more than two strings. With the steady increase in computational power, exact dynamic programming solutions have become feasible in practice also for 3- and 4-way alignments. For the pairwise (2-way) case, there is a clear distinction between local and global alignments. As more sequences are considered, this distinction, which can in fact be made independently for both ends of each sequence, gives rise to a rich set of partially local alignment problems. So far these have remained largely unexplored. Thus, a general formal framework that gives rise to a classification of partially local alignment problems is introduced. It leads to a generic scheme that guides the principled design of exact dynamic programming solutions for particular partially local alignment problems
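The idea that locality can be chosen independently for each end of each sequence can already be illustrated in the pairwise case. The Python sketch below is an assumption-laden illustration, not the formal framework of the thesis: the function name `partially_local_score`, the flag layout, and the scoring values are hypothetical. Each of the four sequence ends is either global (unaligned letters are charged as gaps) or local (a prefix or suffix may be left unaligned at no cost).

```python
def partially_local_score(a, b, free_start=(False, False), free_end=(False, False),
                          match=1, mismatch=-1, gap=-1):
    """Pairwise alignment score with per-end locality flags.

    free_start[0] / free_end[0] refer to sequence a, index 1 to sequence b.
    All flags False gives a global alignment; all True gives a local one.
    """
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        # Skipping a prefix of a is free only if the start of a is local.
        D[i][0] = 0 if free_start[0] else D[i - 1][0] + gap
    for j in range(1, m + 1):
        D[0][j] = 0 if free_start[1] else D[0][j - 1] + gap
    # If both starts are local, an alignment may begin at any cell.
    restart = free_start[0] and free_start[1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            best = max(D[i - 1][j - 1] + s, D[i - 1][j] + gap, D[i][j - 1] + gap)
            D[i][j] = max(best, 0) if restart else best
    # The end flags decide which cells may terminate the alignment.
    candidates = [D[n][m]]
    if free_end[0]:
        candidates += [D[i][m] for i in range(n + 1)]
    if free_end[1]:
        candidates += [D[n][j] for j in range(m + 1)]
    if free_end[0] and free_end[1]:
        candidates += [value for row in D for value in row]
    return max(candidates)
```

With two sequences there are sixteen flag combinations, of which global, local, and the overlap and fitting variants are the familiar ones. With three or four sequences the number of combinations grows quickly, which is the set of problems the abstract describes as largely unexplored.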

Qucosa - Publikationsserver der Universität Leipzig