    Structural and Computational Existence Results for Multidimensional Subshifts

    Symbolic dynamics is a branch of mathematics that studies the structure of infinite sequences of symbols, or in the multidimensional case, infinite grids of symbols. Classes of such sequences and grids defined by collections of forbidden patterns are called subshifts, and subshifts of finite type are defined by finitely many forbidden patterns. The simplest examples of multidimensional subshifts are sets of Wang tilings, infinite arrangements of square tiles with colored edges, where adjacent edges must have the same color. Multidimensional symbolic dynamics has strong connections to computability theory, since most of the basic properties of subshifts cannot be recognized by computer programs, but are instead characterized by some higher-level notion of computability. This dissertation focuses on the structure of multidimensional subshifts, and the ways in which it relates to their computational properties. In the first part, we study the subpattern posets and Cantor-Bendixson ranks of countable subshifts of finite type, which can be seen as measures of their structural complexity. We show, by explicitly constructing subshifts with the desired properties, that both notions are essentially restricted only by computability conditions. In the second part of the dissertation, we study different methods of defining (classes of) multidimensional subshifts, and how they relate to each other and to existing methods. We present definitions that use monadic second-order logic, a more restricted kind of logical quantification called quantifier extension, and multi-headed finite state machines. Two of the definitions give rise to hierarchies of subshift classes, which are a priori infinite, but which we show to collapse into finitely many levels.
The quantifier extension provides insight into the somewhat mysterious class of multidimensional sofic subshifts, since we prove a characterization for the class of subshifts that can extend a sofic subshift into a nonsofic one.
Symbolic dynamics is a field of mathematics that studies the properties of infinitely long sequences of symbols or, in the multidimensional case, infinitely large grids of symbols. Subshifts are collections of such sequences or grids defined by forbidding some set of finite patterns, and subshifts of finite type are obtained by forbidding only finitely many patterns. Wang tilings are the simplest example of multidimensional subshifts. They are tilings formed from colored squares in which all adjacent edges must have the same color. Multidimensional symbolic dynamics is strongly connected to computability theory, since many basic properties of subshifts cannot be recognized by computer programs, but only by higher-level computational models. In my dissertation I study the structure of multidimensional subshifts and its relation to their computational properties. In the first part I focus on certain structural properties of subshifts of finite type: the order formed by finite patterns and the Cantor-Bendixson rank. By constructing subshifts of the desired kind, I show that both properties are essentially restricted by computational conditions. In the second part of the dissertation I study different ways of defining multidimensional subshifts, and how these ways compare to each other and to known classes of subshifts. I consider definitions based on second-order logic, a restricted form of logical quantification called quantifier extension, and multi-headed finite automata.
Two of these three definitions come with separate hierarchies of subshifts, which I prove to collapse to finite height. The study of quantifier extension also sheds light on the structure of the so-called sofic subshifts, which is not yet well understood: in that chapter I determine exactly which subshifts can extend a sofic subshift into a nonsofic one.
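
The edge-matching rule for Wang tilings mentioned above can be illustrated on a finite rectangular patch. The following is a minimal sketch, not taken from the dissertation: the names `Tile` and `is_valid_patch` and the color strings are assumptions made for illustration. A valid finite patch does not by itself show that the tile set admits an infinite tiling.

```python
from typing import NamedTuple, Sequence


class Tile(NamedTuple):
    """A Wang tile: one color per edge (hypothetical representation)."""
    north: str
    east: str
    south: str
    west: str


def is_valid_patch(patch: Sequence[Sequence[Tile]]) -> bool:
    """Return True if all adjacent edges in a rectangular patch match."""
    rows = len(patch)
    for r in range(rows):
        cols = len(patch[r])
        for c in range(cols):
            tile = patch[r][c]
            # Right neighbour: this east edge must equal its west edge.
            if c + 1 < cols and tile.east != patch[r][c + 1].west:
                return False
            # Neighbour below: this south edge must equal its north edge.
            if r + 1 < rows and tile.south != patch[r + 1][c].north:
                return False
    return True


a = Tile("red", "blue", "red", "blue")
b = Tile("red", "green", "red", "blue")
c = Tile("red", "blue", "red", "green")

good = [[b, c], [b, c]]  # b.east == c.west, all vertical edges are red
bad = [[a, c], [a, a]]   # a.east is blue but c.west is green
```

Here `is_valid_patch(good)` holds and `is_valid_patch(bad)` does not. Each tile constraint is a finite forbidden pattern, which is why sets of Wang tilings are subshifts of finite type.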

    Graph algorithms for the haplotyping problem

    Evidence from investigations of genetic differences among human beings shows that genetic diseases are often the results of genetic mutations. The most common form of these mutations is the single nucleotide polymorphism (SNP). A complete map of all SNPs in the human genome will be extremely valuable for studying the relationships between specific haplotypes and specific genetic diseases. Some recent discoveries show that the DNA sequence of human beings can be partitioned into long blocks where genetic recombination has been rare. Inferring both haplotypes from chromosome sequences is therefore a biologically meaningful research topic, and one that raises compound mathematical and computational problems. We are interested in the algorithmic implications of inferring haplotypes from long blocks of DNA that have not undergone recombination in populations. This assumption justifies a model of haplotype evolution: haplotypes in a population evolve along a coalescent which, under the standard population-genetic assumption of infinite sites, is a perfect phylogeny when viewed as a rooted tree. The Perfect Phylogeny Haplotyping (PPH) Problem was introduced by Daniel Gusfield in 2002. A nearly linear-time solution to the PPH problem, running in O(nm α(nm)) time, where α is the extremely slowly growing inverse Ackermann function, has been given. However, it is very complex and difficult to implement. So far, even the best practical solution to the PPH problem has a worst-case running time of O(nm²). D. Gusfield conjectured that a linear-time (O(nm)) solution to the PPH problem should be possible. We resolve Gusfield's conjecture by introducing a linear-time algorithm for the PPH problem. Different kinds of posets for haplotype matrices and genotype matrices are designed, and the relationships between them are studied.
Since redundant calculations can be avoided by the transitivity of the partial ordering in posets, we design a linear-time (O(nm)) algorithm for the PPH problem that provides all the possible solutions for an input. The algorithm is fully implemented, and the simulation shows that it is much faster than previous methods.
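
To make the perfect-phylogeny condition concrete, here is a minimal sketch of the classical four-gamete test on a binary haplotype matrix (rows are haplotypes, columns are SNP sites). It is not the linear-time algorithm described above: it only checks already-phased haplotypes, it runs in O(nm²) time, and the function name is an assumption made for illustration. The test relies on the standard fact that a binary matrix fits a perfect phylogeny (with unspecified root) exactly when no pair of columns shows all four combinations 00, 01, 10, 11.

```python
from itertools import combinations
from typing import Sequence


def passes_four_gamete_test(haplotypes: Sequence[Sequence[int]]) -> bool:
    """Return True if no pair of sites exhibits all four gametes."""
    if not haplotypes:
        return True
    num_sites = len(haplotypes[0])
    for i, j in combinations(range(num_sites), 2):
        # Collect the distinct (site i, site j) value pairs over all rows.
        gametes = {(row[i], row[j]) for row in haplotypes}
        if len(gametes) == 4:
            return False
    return True


compatible = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
conflicting = [
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
]
```

`compatible` passes the test, while `conflicting` fails it because its two sites show all four gametes. The PPH problem itself is harder: it asks for a phasing of unphased genotypes whose resulting haplotypes satisfy this condition.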

    Twin-width I: tractable FO model checking

    Inspired by a width invariant defined on permutations by Guillemot and Marx [SODA '14], we introduce the notion of twin-width on graphs and on matrices. Proper minor-closed classes, bounded rank-width graphs, map graphs, $K_t$-free unit $d$-dimensional ball graphs, posets with antichains of bounded size, and proper subclasses of dimension-2 posets all have bounded twin-width. On all these classes (except map graphs without geometric embedding) we show how to compute in polynomial time a sequence of $d$-contractions, witness that the twin-width is at most $d$. We show that FO model checking, that is deciding if a given first-order formula $\phi$ evaluates to true for a given binary structure $G$ on a domain $D$, is FPT in $|\phi|$ on classes of bounded twin-width, provided the witness is given. More precisely, being given a $d$-contraction sequence for $G$, our algorithm runs in time $f(d,|\phi|) \cdot |D|$ where $f$ is a computable but non-elementary function. We also prove that bounded twin-width is preserved by FO interpretations and transductions (allowing operations such as squaring or complementing a graph). This unifies and significantly extends the knowledge on fixed-parameter tractability of FO model checking on non-monotone classes, such as the FPT algorithm on bounded-width posets by Gajarský et al. [FOCS '15].
Comment: 49 pages, 9 figures
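
The role of a contraction sequence as a witness can be sketched in code. The snippet below assumes the usual trigraph definition: when two vertices are contracted, a third vertex keeps a black edge to the merged vertex only if it had black edges to both, and gets a red edge if it was adjacent to at least one of them in any other way. The width of the sequence is the largest red degree seen along the way. The function name and input format are assumptions for illustration; the code only evaluates a given sequence and does not search for a good one.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Vertex = Hashable


def contraction_sequence_width(
    vertices: Iterable[Vertex],
    edges: Iterable[Tuple[Vertex, Vertex]],
    contractions: Iterable[Tuple[Vertex, Vertex]],
) -> int:
    """Return the maximum red degree over a given contraction sequence."""
    vertex_list = list(vertices)
    black: Dict[Vertex, Set[Vertex]] = {v: set() for v in vertex_list}
    red: Dict[Vertex, Set[Vertex]] = {v: set() for v in vertex_list}
    for u, v in edges:
        black[u].add(v)
        black[v].add(u)

    width = 0
    for u, v in contractions:  # v is merged into u
        others = [x for x in black if x != u and x != v]
        adjacent = black[u] | black[v] | red[u] | red[v]
        # Black only when x had black edges to both contracted vertices.
        new_black = {x for x in others if x in black[u] and x in black[v]}
        # Any other adjacency to u or v becomes a red edge.
        new_red = {x for x in others if x in adjacent} - new_black
        for x in others:
            black[x] -= {u, v}
            red[x] -= {u, v}
            if x in new_black:
                black[x].add(u)
            elif x in new_red:
                red[x].add(u)
        black[u], red[u] = new_black, new_red
        del black[v], red[v]
        width = max(width, max(len(nbrs) for nbrs in red.values()))
    return width


path_edges = [(0, 1), (1, 2), (2, 3)]
good_order = [(0, 2), (0, 1), (0, 3)]
poor_order = [(0, 3), (0, 1), (0, 2)]
```

On the path 0-1-2-3, `good_order` has width 1 and `poor_order` has width 2, so the first sequence witnesses that the twin-width of this path is at most 1.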

    Continuous reductions on the Scott domain and Decomposability Conjecture

    Causal Fourier Analysis on Directed Acyclic Graphs and Posets

    We present a novel form of Fourier analysis, and associated signal processing concepts, for signals (or data) indexed by edge-weighted directed acyclic graphs (DAGs). This means that our Fourier basis yields an eigendecomposition of a suitable notion of shift and convolution operators that we define. DAGs are the common model to capture causal relationships between data values and in this case our proposed Fourier analysis relates data with its causes under a linearity assumption that we define. The definition of the Fourier transform requires the transitive closure of the weighted DAG for which several forms are possible depending on the interpretation of the edge weights. Examples include level of influence, distance, or pollution distribution. Our framework is different from prior GSP: it is specific to DAGs and leverages, and extends, the classical theory of Moebius inversion from combinatorics. For a prototypical application we consider DAGs modeling dynamic networks in which edges change over time. Specifically, we model the spread of an infection on such a DAG obtained from real-world contact tracing data and learn the infection signal from samples assuming sparsity in the Fourier domain.
Comment: 13 pages, 11 figures
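
The link between causes, transitive closure, and Moebius inversion can be sketched in a simplified, unweighted setting. The code below assumes that the signal at a node is the sum of the cause values at all nodes that reach it (including itself), and recovers the causes by classical Moebius inversion on the reachability order. This is an illustrative simplification: the paper works with edge-weighted closures, and all function names here are hypothetical. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable


def reachability(
    preds: Dict[Node, Set[Node]],
) -> Tuple[List[Node], Dict[Node, Set[Node]]]:
    """Topological order and, per node, the set of nodes that reach it."""
    order = list(TopologicalSorter(preds).static_order())
    reach: Dict[Node, Set[Node]] = {}
    for v in order:  # predecessors always come before v
        reach[v] = {v}
        for p in preds.get(v, ()):
            reach[v] |= reach[p]
    return order, reach


def zeta_transform(preds, causes):
    """Signal at v = sum of causes over all u with a path u -> ... -> v."""
    order, reach = reachability(preds)
    return {v: sum(causes[u] for u in reach[v]) for v in order}


def moebius_inversion(preds, signal):
    """Recover the causes from the signal, in topological order."""
    order, reach = reachability(preds)
    causes = {}
    for v in order:
        causes[v] = signal[v] - sum(causes[u] for u in reach[v] if u != v)
    return causes


# Diamond DAG a -> b, a -> c, b -> d, c -> d, given as node -> predecessors.
preds = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
causes = {"a": 1, "b": 0, "c": 2, "d": 0}
signal = zeta_transform(preds, causes)
recovered = moebius_inversion(preds, signal)
```

In this example only two of the four cause values are nonzero, which is the kind of sparsity in the transformed domain that the abstract exploits when learning a signal from samples.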

    Some Take-Away Games on Discrete Structures

    The game of Subset Take-Away is an impartial combinatorial game posed by David Gale in 1974. The game can be played on various discrete structures, including but not limited to graphs, hypergraphs, polygonal complexes, and partially ordered sets. While a universal winning strategy has yet to be found, results have been found in certain cases. In 2003 R. Riehemann focused on Subset Take-Away on bipartite graphs and produced a complete game analysis by studying nim-values. In this work, we extend the notion of Take-Away on a bipartite graph to Take-Away on particular hypergraphs, namely oddly-uniform hypergraphs and evenly-uniform hypergraphs whose vertices satisfy a particular coloring condition. On both structures we provide a complete game analysis via nim-values. From here, we consider different discrete structures and slight variations of the rules for Take-Away to produce some interesting results. Under certain conditions, polygonal complexes exhibit a sequence of nim-values which are unbounded but have a well-behaved pattern. Under other conditions, the nim-value of a polygonal complex is bounded and predictable based on information about the complex itself. We introduce a Take-Away variant which we call “Take-As-Much-As-You-Want”, and we show that, again, nim-values can grow without bound, but fortunately they are very easily described for a given graph based on the total number of vertices and edges of the graph. Finally, we consider Take-Away on a specific type of partially ordered set which we call a rank-complete poset. We have results, again via an analysis of nim-values, for Take-Away on a rank-complete poset for both ordinary play and misère play.
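
Nim-values of an impartial game are computed with the minimum excludant (mex) of the values of all positions reachable in one move. The sketch below shows that recursion on a deliberately simple stand-in game, a subtraction game in which a move removes 1, 2, or 3 tokens from a heap. It is not Subset Take-Away or any variant studied in this work; the game and the function names are assumptions chosen to keep the example short.

```python
from functools import lru_cache
from typing import Iterable


def mex(values: Iterable[int]) -> int:
    """Smallest non-negative integer not contained in values."""
    seen = set(values)
    n = 0
    while n in seen:
        n += 1
    return n


@lru_cache(maxsize=None)
def nim_value(heap: int) -> int:
    """Nim-value of a heap in the stand-in subtraction game {1, 2, 3}."""
    moves = [heap - k for k in (1, 2, 3) if heap - k >= 0]
    return mex(nim_value(nxt) for nxt in moves)


values = [nim_value(n) for n in range(8)]
```

For this stand-in game the values repeat with period 4. Under ordinary play, a position is lost for the player to move exactly when its nim-value is 0.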

    Recoloring Interval Graphs with Limited Recourse Budget

    We consider the problem of coloring an interval graph dynamically. Intervals arrive one after the other and have to be colored immediately such that no two intervals of the same color overlap. In each step only a limited number of intervals may be recolored to maintain a proper coloring (thus interpolating between the well-studied online and offline settings). The number of allowed recolorings per step is the so-called recourse budget. Our main aim is to prove both upper and lower bounds on the required recourse budget for interval graphs, given a bound on the allowed number of colors. For general interval graphs with n vertices and chromatic number k it is known that some recoloring is needed even if we have 2k colors available. We give an algorithm that maintains a 2k-coloring with an amortized recourse budget of 𝒪(log n). For maintaining a k-coloring with k ≤ n, we give an amortized upper bound of 𝒪(k ⋅ k! ⋅ √n), and a lower bound of Ω(k) for k ∈ 𝒪(√n), which can be as large as Ω(√n). For unit interval graphs it is known that some recoloring is needed even if we have k+1 colors available. We give an algorithm that maintains a (k+1)-coloring with at most 𝒪(k²) recolorings per step in the worst case. We also give a lower bound of Ω(log n) on the amortized recourse budget needed to maintain a k-coloring. Additionally, for general interval graphs we show that if one does not insist on maintaining an explicit coloring, one can have a k-coloring algorithm which does not incur a factor of 𝒪(k ⋅ k! ⋅ √n) in the running time. For this we provide a data structure, which allows for adding intervals in 𝒪(k² log³ n) amortized time per update and querying for the color of a particular interval in 𝒪(log n) time. Between any two updates, the data structure answers consistently with some optimal coloring.
The data structure maintains the coloring implicitly, so the notion of recourse budget does not apply to it.
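
For reference, the offline end of the spectrum mentioned above can be sketched as follows. When all intervals are known in advance, processing them by left endpoint and reusing the smallest released color gives a proper coloring with k colors, where k is the maximum number of pairwise overlapping intervals. This is the standard offline greedy, not one of the recourse-bounded algorithms from the abstract. The function name and the half-open interval convention `[start, end)` are assumptions for illustration.

```python
import heapq
from typing import List, Sequence, Tuple


def color_intervals_offline(intervals: Sequence[Tuple[int, int]]) -> List[int]:
    """Color half-open intervals [start, end) with as few colors as possible."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i])
    colors = [0] * len(intervals)
    active: List[Tuple[int, int]] = []  # min-heap of (end, color)
    free: List[int] = []                # min-heap of released colors
    num_colors = 0
    for i in order:
        start, end = intervals[i]
        # Release the colors of intervals that ended before this one starts.
        while active and active[0][0] <= start:
            _, released = heapq.heappop(active)
            heapq.heappush(free, released)
        if free:
            color = heapq.heappop(free)
        else:
            color = num_colors
            num_colors += 1
        colors[i] = color
        heapq.heappush(active, (end, color))
    return colors


intervals = [(0, 3), (1, 4), (2, 5), (3, 6), (5, 7)]
coloring = color_intervals_offline(intervals)
```

In the dynamic setting the intervals are not sorted in advance, which is why an algorithm that must color each arrival immediately may need recolorings to stay within the same number of colors.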