Understanding Probabilistic Programs
We present two views of probabilistic programs and their relationship. An operational interpretation as well as a weakest pre-condition semantics are provided for an elementary probabilistic guarded command language. Our study treats important features such as sampling, conditioning, loop divergence, and non-determinism.
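As a rough illustration of what conditioning means operationally, the sketch below enumerates every run of a tiny program exactly, discards runs that violate an observation, and renormalises. The program itself (two fair coin flips, observing that at least one is heads) and the function name are assumptions made for illustration; they are not taken from the paper and do not reproduce its guarded command language or its weakest pre-condition semantics.

```python
from fractions import Fraction
from itertools import product


def posterior_both_heads():
    """Exact posterior of 'both heads' given 'at least one heads'."""
    accepted = Fraction(0)  # total probability of runs passing the observation
    both = Fraction(0)      # probability of passing runs where both coins are heads
    for c1, c2 in product([True, False], repeat=2):
        p = Fraction(1, 4)  # each outcome of two fair coins has probability 1/4
        if not (c1 or c2):  # hypothetical observation: at least one heads
            continue        # a run violating the observation is discarded
        accepted += p
        if c1 and c2:
            both += p
    return both / accepted  # renormalise over the accepted runs


print(posterior_both_heads())  # prints 1/3
```

Renormalising by the accepted mass is one common reading of conditioning; how rejected and diverging runs are treated is exactly the kind of design question a formal semantics has to settle.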
Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction
Janssen S, Schudoma C, Steger G, Giegerich R. Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction. BMC Bioinformatics. 2011;12(1):429.
BACKGROUND: Many bioinformatics tools for RNA secondary structure analysis are based on a thermodynamic model of RNA folding. They predict a single, "optimal" structure by free energy minimization, they enumerate near-optimal structures, they compute base pair probabilities and dot plots, representative structures of different abstract shapes, or Boltzmann probabilities of structures and shapes. Although all programs refer to the same physical model, they implement it with considerable variation for different tasks, and little is known about the effects of heuristic assumptions and model simplifications used by the programs on the outcome of the analysis.
RESULTS: We extract four different models of the thermodynamic folding space which underlie the programs RNAfold, RNAshapes, and RNAsubopt. Their differences lie within the details of the energy model and the granularity of the folding space. We implement probabilistic shape analysis for all models, and introduce the shape probability shift as a robust measure of model similarity. Using four data sets derived from experimentally solved structures, we provide a quantitative evaluation of the model differences.
CONCLUSIONS: We find that search space granularity affects the computed shape probabilities less than the over- or underapproximation of free energy by a simplified energy model. Still, the approximations perform similarly enough to implementations of the full model to justify their continued use in settings where computational constraints call for simpler algorithms. On the side, we observe that the rarely used level 2 shapes, which predict the complete arrangement of helices, multiloops, internal loops and bulges, include the "true" shape in a rather small number of predicted high probability shapes. This calls for an investigation of new strategies to extract high probability members from the (very large) level 2 shape space of an RNA sequence. We provide implementations of all four models, written in a declarative style that makes them easy to modify. Based on our study, future work on thermodynamic RNA folding may make a choice of model based on our empirical data. It can take our implementations as a starting point for further program development.
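The idea of Boltzmann probabilities of shapes can be sketched as follows: each structure gets a weight exp(-E/RT), and a shape's probability is the summed weight of its structures divided by the total weight. This is a minimal sketch, not the paper's implementation. The function name, the toy shape strings, the energies, and the value of RT (gas constant in kcal/(mol K) times 310.15 K) are all assumptions chosen for illustration.

```python
from collections import defaultdict
from math import exp

# Assumed thermal energy in kcal/mol: gas constant times 37 degrees C in kelvin.
RT = 0.0019872 * 310.15


def shape_probabilities(structures):
    """Map each shape to its Boltzmann probability.

    structures: iterable of (shape, free_energy_kcal_per_mol) pairs,
    one pair per structure in the (toy) folding space.
    """
    weights = defaultdict(float)
    for shape, energy in structures:
        weights[shape] += exp(-energy / RT)  # Boltzmann weight of one structure
    partition = sum(weights.values())        # partition function over all structures
    return {shape: w / partition for shape, w in weights.items()}


# Hypothetical toy folding space: three structures falling into two shapes.
toy = [("[]", -3.0), ("[]", -2.5), ("[][]", -1.0)]
probs = shape_probabilities(toy)
```

In this toy input the shape `[]` collects the two lowest-energy structures, so it receives most of the probability mass.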
Query Complexity of Inversion Minimization on Trees
We consider the following computational problem: Given a rooted tree and a
ranking of its leaves, what is the minimum number of inversions of the leaves
that can be attained by ordering the tree? This variation of the problem of
counting inversions in arrays originated in mathematical psychology, with the
evaluation of the Mann--Whitney statistic for detecting differences between
distributions as a special case.
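The problem statement can be made concrete with a small brute-force sketch. It assumes a tree encoded as nested tuples whose leaves are integer ranks; that encoding and the function names are hypothetical. It relies on the observation that the number of inversions between two sibling subtrees depends only on which of them comes first, so each node's children can be ordered independently. It is exponential in the degree and unrelated to the query-complexity bounds studied in the paper.

```python
from itertools import permutations


def leaves(tree):
    """Return the leaf ranks of a nested-tuple tree, left to right."""
    if isinstance(tree, int):
        return [tree]
    return [rank for child in tree for rank in leaves(child)]


def cross_inversions(left, right):
    """Count pairs (a, b) with a in left, b in right and a > b."""
    return sum(1 for a in left for b in right if a > b)


def min_inversions(tree):
    """Minimum number of leaf inversions over all orderings of the tree."""
    if isinstance(tree, int):
        return 0
    inside = sum(min_inversions(child) for child in tree)
    sets = [leaves(child) for child in tree]
    k = len(sets)
    cost = [[cross_inversions(sets[i], sets[j]) for j in range(k)]
            for i in range(k)]
    # Try every order of the children; cost[a][b] is paid when a precedes b.
    best = min(
        sum(cost[order[i]][order[j]]
            for i in range(k) for j in range(i + 1, k))
        for order in permutations(range(k))
    )
    return inside + best


print(min_inversions(((3, 1), (2, 4))))  # prints 1 (leaf order 1, 3, 2, 4)
```

A root with two children that are each a flat set of leaves, for example `((1, 4), (2, 3))`, corresponds to the two-sample setting mentioned above.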
We study the complexity of the problem in the comparison-query model, used
for problems like sorting and selection. For many types of trees with $n$
leaves, we establish lower bounds close to the strongest known in the model,
namely the lower bound of $\log_2(n!)$ for sorting $n$ items. We show:
(a) queries are needed whenever
the tree has a subtree that contains a fraction of the leaves. This
implies a lower bound of for trees
of degree .
(b) queries are needed in case the tree is binary.
(c) queries are needed for certain classes of
trees of degree , including perfect trees with even .
The lower bounds are obtained by developing two novel techniques for a
generic problem in the comparison-query model and applying them to
inversion minimization on trees. Both techniques can be described in terms of
the Cayley graph of the symmetric group with adjacent-rank transpositions as
the generating set. Consider the subgraph consisting of the edges between
vertices with the same value under . We show that the size of any decision
tree for must be at least:
(i) the number of connected components of this subgraph, and
(ii) the factorial of the average degree of the complementary subgraph,
divided by .
Lower bounds on query complexity then follow by taking the base-2 logarithm.
Comment: 54 pages, 18 figures, full version of paper appearing in the
Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms
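Technique (i) can be tried out by brute force on very small instances. The sketch below builds the Cayley graph of the symmetric group with adjacent-rank transpositions, keeps only the edges whose endpoints have the same value under a given function, and counts the connected components with a union-find. The function names and the two example value functions (sorting and finding the minimum) are assumptions for illustration. The sketch enumerates all n! rankings and is not the paper's proof technique.

```python
from itertools import permutations
from math import log2


def count_components(n, value):
    """Components of the subgraph of same-valued, rank-adjacent rankings.

    A ranking is a tuple whose i-th entry is the rank (1..n) of item i.
    """
    verts = list(permutations(range(1, n + 1)))
    parent = {v: v for v in verts}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v in verts:
        for r in range(1, n):
            # Adjacent-rank transposition: exchange ranks r and r + 1.
            w = tuple(r + 1 if x == r else r if x == r + 1 else x for x in v)
            if value(v) == value(w):
                parent[find(v)] = find(w)
    return len({find(v) for v in verts})


# Sorting: every ranking has its own value, so no edge survives.
sorting = count_components(3, lambda ranking: ranking)
# Finding the minimum: the value is the position of the item with rank 1.
minimum = count_components(3, lambda ranking: ranking.index(1))
print(sorting, log2(sorting))   # 6 components, about 2.58 queries
print(minimum, log2(minimum))   # 3 components, about 1.58 queries
```

For sorting, the component count is n!, which recovers the familiar $\log_2(n!)$ bound after taking the base-2 logarithm.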
Torsion volume forms
We introduce volume forms on mapping stacks in derived algebraic geometry
using a parametrized version of the Reidemeister-Turaev torsion. In the case of
derived loop stacks we describe this volume form in terms of the Todd class. In
the case of mapping stacks from surfaces, we compare it to the symplectic
volume form. As an application of these ideas, we construct canonical
orientation data for cohomological DT invariants of closed oriented
3-manifolds.
Comment: 58 pages
Category Theoretic Models of Data Refinement
We give an account of the use of category theory in modelling data refinement over the past twenty years. We start with Tony Hoare's formulation of data refinement in category theoretic terms and explain how the category theory may be made precise in generality and with elegance, using the notion of structure respecting lax transformation, for a first order imperative language. We then study two main alternatives for extending that category theoretic analysis in order to account for higher order languages. The first is given by adjoint simulations; the second is given by the notion of lax logical relation. These provide techniques that can be used for a combined language, such as an imperative language with procedure passing.
18 page(s)