1,104 research outputs found

    Permutation Complexity via Duality between Values and Orderings

    We study the permutation complexity of finite-state stationary stochastic processes based on a duality between values and orderings between values. First, we establish a duality between the set of all words of a fixed length and the set of all permutations of the same length. Second, on this basis, we give an elementary alternative proof of the equality between the permutation entropy rate and the entropy rate for finite-state stationary stochastic processes, first proved in [Amigo, J.M., Kennel, M. B., Kocarev, L., 2005. Physica D 210, 77-95]. Third, we show that further information on the relationship between the structure of values and the structure of orderings for finite-state stationary stochastic processes beyond the entropy rate can be obtained from the established duality. In particular, we prove that the permutation excess entropy is equal to the excess entropy, which is a measure of global correlation present in a stationary stochastic process, for finite-state stationary ergodic Markov processes. Comment: 26 pages
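
As a rough illustration of the values-to-orderings map described above, the following Python sketch sends a word to the permutation that sorts it and estimates a block permutation entropy from a finite sample. The function names and the tie-breaking rule (equal values keep their left-to-right order) are assumptions made for this sketch; the paper's exact conventions may differ.

```python
from collections import Counter
from math import log2


def ordinal_pattern(word):
    """Return the permutation of indices that sorts `word`.

    Ties are broken by position (an assumption of this sketch):
    sorted() is stable, so equal values keep their left-to-right order.
    """
    return tuple(sorted(range(len(word)), key=lambda i: word[i]))


def permutation_entropy(series, length):
    """Empirical Shannon entropy (in bits) of ordinal patterns of `length`."""
    # Slide a window of the given length over the series.
    windows = [series[i:i + length] for i in range(len(series) - length + 1)]
    counts = Counter(ordinal_pattern(w) for w in windows)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(ordinal_pattern((2, 1, 2)))
    print(permutation_entropy([1, 2, 1, 2, 1], 2))
```

Dividing the block entropy by the window length and letting the length grow would give an estimate of the permutation entropy rate; this sketch only computes the finite-length quantity.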

    Permutation Complexity and Coupling Measures in Hidden Markov Models

    In [Haruna, T. and Nakajima, K., 2011. Physica D 240, 1370-1377], the authors introduced the duality between values (words) and orderings (permutations) as a basis to discuss the relationship between information theoretic measures for finite-alphabet stationary stochastic processes and their permutation analogues. It has been used to give a simple proof of the equality between the entropy rate and the permutation entropy rate for any finite-alphabet stationary stochastic process and to show some results on the excess entropy and the transfer entropy for finite-alphabet stationary ergodic Markov processes. In this paper, we extend our previous results to hidden Markov models and show the equalities between various information theoretic complexity and coupling measures and their permutation analogues. In particular, we show the following two results within the realm of hidden Markov models with ergodic internal processes: the two permutation analogues of the transfer entropy, the symbolic transfer entropy and the transfer entropy on rank vectors, are both equivalent to the transfer entropy if they are considered as the rates, and the directed information theory can be captured by the permutation entropy approach. Comment: 26 pages

    Thermodynamic Analysis of Interacting Nucleic Acid Strands

    Motivated by the analysis of natural and engineered DNA and RNA systems, we present the first algorithm for calculating the partition function of an unpseudoknotted complex of multiple interacting nucleic acid strands. This dynamic program is based on a rigorous extension of secondary structure models to the multistranded case, addressing representation and distinguishability issues that do not arise for single-stranded structures. We then derive the form of the partition function for a fixed volume containing a dilute solution of nucleic acid complexes. This expression can be evaluated explicitly for small numbers of strands, allowing the calculation of the equilibrium population distribution for each species of complex. Alternatively, for large systems (e.g., a test tube), we show that the unique complex concentrations corresponding to thermodynamic equilibrium can be obtained by solving a convex programming problem. Partition function and concentration information can then be used to calculate equilibrium base-pairing observables. The underlying physics and mathematical formulation of these problems lead to an interesting blend of approaches, including ideas from graph theory, group theory, dynamic programming, combinatorics, convex optimization, and Lagrange duality.

    Convex Relaxations for Permutation Problems

    Seriation seeks to reconstruct a linear order between variables using unsorted, pairwise similarity information. It has direct applications in, for example, archeology and shotgun gene sequencing. We write seriation as an optimization problem by proving the equivalence between the seriation and combinatorial 2-SUM problems on similarity matrices (2-SUM is a quadratic minimization problem over permutations). The seriation problem can be solved exactly by a spectral algorithm in the noiseless case, and we derive several convex relaxations for 2-SUM to improve the robustness of seriation solutions in noisy settings. These convex relaxations also allow us to impose structural constraints on the solution, and hence to solve semi-supervised seriation problems. We derive new approximation bounds for some of these relaxations and present numerical experiments on archeological data, Markov chains and DNA assembly from shotgun gene sequencing data. Comment: Final journal version, a few typos and references fixed
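
To make the 2-SUM formulation mentioned above concrete, here is a small Python sketch that evaluates the quadratic objective and minimizes it by exhaustive search over permutations. It is only a brute-force illustration for tiny inputs, not the spectral algorithm or the convex relaxations of the paper; the function names and the toy similarity matrix are assumptions of this sketch.

```python
from itertools import permutations


def two_sum(similarity, position):
    """2-SUM objective: sum over i, j of A[i][j] * (pos[i] - pos[j])**2."""
    n = len(similarity)
    return sum(similarity[i][j] * (position[i] - position[j]) ** 2
               for i in range(n) for j in range(n))


def brute_force_seriation(similarity):
    """Return the items listed in the order that minimizes 2-SUM.

    Tries all n! position assignments, so it is usable only for very small n.
    """
    n = len(similarity)
    best = min(permutations(range(n)),
               key=lambda pos: two_sum(similarity, pos))
    # best[item] is the position of that item; list the items by position.
    return sorted(range(n), key=lambda item: best[item])


if __name__ == "__main__":
    # Toy data: items 2, 0, 1 lie on a line in that order, and the
    # similarity of two items is 3 minus their distance along the line.
    A = [[3, 2, 2],
         [2, 3, 1],
         [2, 1, 3]]
    print(brute_force_seriation(A))
```

The objective is unchanged when the order is reversed, so the recovered order is determined only up to reversal.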

    The Complexity of Order Type Isomorphism

    The order type of a point set in $\mathbb{R}^d$ maps each $(d+1)$-tuple of points to its orientation (e.g., clockwise or counterclockwise in $\mathbb{R}^2$). Two point sets $X$ and $Y$ have the same order type if there exists a mapping $f$ from $X$ to $Y$ for which every $(d+1)$-tuple $(a_1,a_2,\ldots,a_{d+1})$ of $X$ and the corresponding tuple $(f(a_1),f(a_2),\ldots,f(a_{d+1}))$ in $Y$ have the same orientation. In this paper we investigate the complexity of determining whether two point sets have the same order type. We provide an $O(n^d)$ algorithm for this task, thereby improving upon the $O(n^{\lfloor 3d/2 \rfloor})$ algorithm of Goodman and Pollack (1983). The algorithm uses only order type queries and also works for abstract order types (or acyclic oriented matroids). Our algorithm is optimal, both in the abstract setting and for realizable point sets if the algorithm only uses order type queries. Comment: Preliminary version of paper to appear at ACM-SIAM Symposium on Discrete Algorithms (SODA14)
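
The definition above can be illustrated for the planar case with a short Python sketch: an orientation predicate for point triples, and a check that two labeled point sets agree on every triple when the i-th point of one is mapped to the i-th point of the other. This verifies one fixed mapping only; it does not search for a mapping and is not the $O(n^d)$ algorithm of the paper. The function names are assumptions of this sketch.

```python
from itertools import combinations


def orientation(a, b, c):
    """Sign of the turn a -> b -> c in the plane.

    Returns +1 for counterclockwise, -1 for clockwise, 0 for collinear.
    """
    det = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (det > 0) - (det < 0)


def same_order_type_under_labeling(xs, ys):
    """Check that the index-wise mapping xs[i] -> ys[i] preserves
    the orientation of every triple of points."""
    if len(xs) != len(ys):
        return False
    return all(
        orientation(xs[i], xs[j], xs[k]) == orientation(ys[i], ys[j], ys[k])
        for i, j, k in combinations(range(len(xs)), 3)
    )


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    scaled = [(5, 5), (7, 5), (7, 7), (5, 7)]       # scaled and translated
    mirrored = [(0, 0), (-1, 0), (-1, 1), (0, 1)]   # reflected in the y-axis
    print(same_order_type_under_labeling(square, scaled))
    print(same_order_type_under_labeling(square, mirrored))
```

With integer coordinates the determinant is exact; floating-point inputs would need a robust predicate.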

    A Dynamic Data Structure to Efficiently Find the Points below a Line and Estimate Their Number

    A basic question in computational geometry is how to find the relationship between a set of points and a line in a real plane. In this paper, we present multidimensional data structures for N points that allow answering the following queries for any given input line: (1) estimate in O(log N) time the number of points below the line; (2) return in O(log N + k) time the k ≤ N points that are below the line; and (3) return in O(log N) time the point that is closest to the line. We illustrate the utility of this computational question with GIS applications in air defense and traffic control.
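
For reference, query (2) above can be answered by a plain linear scan, which is the baseline that such a data structure is meant to beat. The Python sketch below is that O(N) baseline only, not the structure from the paper; the function name and the slope-intercept representation of the line are assumptions of this sketch.

```python
def points_below_line(points, slope, intercept):
    """Return the points strictly below the line y = slope * x + intercept.

    Linear scan over all N points: O(N) per query.
    """
    return [(x, y) for x, y in points if y < slope * x + intercept]


if __name__ == "__main__":
    pts = [(0, 0), (1, 3), (2, 1)]
    below = points_below_line(pts, slope=1, intercept=1)
    print(below, len(below))
```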