764 research outputs found

    Counting patterns in strings and graphs

    We study problems related to finding and counting patterns in strings and graphs. In the string regime, we are interested in counting how many substrings of a text $T$ are at Hamming (or edit) distance at most $k$ to a pattern $P$. Among others, we are interested in the fully-compressed setting, where both $T$ and $P$ are given in a compressed representation. For both distance measures, we give the first algorithm that runs in (almost) linear time in the size of the compressed representations. We obtain the algorithms from new and tight structural insights into the solution structure of the problems. In the graph regime, we study problems related to counting homomorphisms between graphs. In particular, we study the parameterized complexity of the problem #IndSub($\Phi$), where we are to count all $k$-vertex induced subgraphs of a graph that satisfy the property $\Phi$. Based on a theory of Lovász and of Curticapean, Dell, and Marx, we express #IndSub($\Phi$) as a linear combination of graph homomorphism numbers to obtain #W[1]-hardness and almost tight conditional lower bounds for properties $\Phi$ that are monotone or that depend only on the number of edges of a graph. Thereby, we prove a conjecture by Jerrum and Meeks. In addition, we investigate the parameterized complexity of the problem #Hom($\mathcal{H} \to \mathcal{G}$) for graph classes $\mathcal{H}$ and $\mathcal{G}$. In particular, we show that for any problem $P$ in the class #W[1], there are classes $\mathcal{H}_P$ and $\mathcal{G}_P$ such that $P$ is equivalent to #Hom($\mathcal{H}_P \to \mathcal{G}_P$).
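
To make the problem statement of #IndSub concrete, the following is a minimal brute-force sketch in Python. It is not the method of the thesis: it simply enumerates all $k$-vertex subsets, which takes time roughly $n^k$. The function names, the edge-list graph representation, and the two example properties are assumptions made for illustration only.

```python
from itertools import combinations


def count_induced_subgraphs(n, edges, k, prop):
    """Count k-vertex subsets of {0, ..., n-1} whose induced subgraph satisfies prop.

    prop is a hypothetical callback taking (k, induced_edges), where
    induced_edges lists the graph edges with both endpoints in the subset.
    """
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for subset in combinations(range(n), k):
        induced = [
            (u, v)
            for u, v in combinations(subset, 2)
            if frozenset((u, v)) in edge_set
        ]
        if prop(k, induced):
            count += 1
    return count


def is_edgeless(k, induced_edges):
    """Property that depends only on the number of edges: no edges at all."""
    return len(induced_edges) == 0


def is_clique(k, induced_edges):
    """Property: all k*(k-1)/2 possible edges are present."""
    return len(induced_edges) == k * (k - 1) // 2


# A cycle on four vertices, used as a small example input.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

With this sketch, `count_induced_subgraphs(4, cycle4, 2, is_edgeless)` counts the two non-adjacent vertex pairs of the 4-cycle, and the same call with `is_clique` counts its four edges.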

    Table of Contents and Editorial Board

    The Effects of Using Word Sorts in Combination with iPad Spelling Applications on Spelling Acquisition of a Student with a Specific Learning Disability

    This research studied the effects of using word sorts in combination with iPad spelling applications on the spelling acquisition of a third-grade student with a Specific Learning Disability. The researcher measured the effects on spelling acquisition using pre- and post-assessments. The informal assessments included a Words Their Way spelling assessment, a nonsense word assessment from Teaching Phonics and Word Study in Intermediate Grades, and a questionnaire. The student engaged in word sorts prior to completing a spelling application on the iPad. The strategies implemented were associated with gains in spelling skills and had a positive effect on the student's attitude towards spelling. The research also showed positive effects on the student's reading of nonsense words. More research applying these techniques to students with and without disabilities should be conducted to further explore these approaches.

    On Near-Linear-Time Algorithms for Dense Subset Sum

    In the Subset Sum problem we are given a set of $n$ positive integers $X$ and a target $t$ and are asked whether some subset of $X$ sums to $t$. Natural parameters for this problem that have been studied in the literature are $n$ and $t$ as well as the maximum input number $\mathrm{mx}_X$ and the sum of all input numbers $\Sigma_X$. In this paper we study the dense case of Subset Sum, where all these parameters are polynomial in $n$. In this regime, standard pseudo-polynomial algorithms solve Subset Sum in polynomial time $n^{O(1)}$. Our main question is: When can dense Subset Sum be solved in near-linear time $\tilde{O}(n)$? We provide an essentially complete dichotomy by designing improved algorithms and proving conditional lower bounds, thereby determining essentially all settings of the parameters $n, t, \mathrm{mx}_X, \Sigma_X$ for which dense Subset Sum is in time $\tilde{O}(n)$. For notational convenience we assume without loss of generality that $t \ge \mathrm{mx}_X$ (as larger numbers can be ignored) and $t \le \Sigma_X/2$ (using symmetry). Then our dichotomy reads as follows:
    - By reviving and improving an additive-combinatorics-based approach by Galil and Margalit [SICOMP'91], we show that Subset Sum is in near-linear time $\tilde{O}(n)$ if $t \gg \mathrm{mx}_X \Sigma_X/n^2$.
    - We prove a matching conditional lower bound: If Subset Sum is in near-linear time for any setting with $t \ll \mathrm{mx}_X \Sigma_X/n^2$, then the Strong Exponential Time Hypothesis and the Strong k-Sum Hypothesis fail.
    We also generalize our algorithm from sets to multi-sets, albeit with non-matching upper and lower bounds.
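
As a point of reference for the "standard pseudo-polynomial algorithms" mentioned above, here is a minimal Python sketch of the textbook bitset dynamic program, together with the two normalizations stated in the abstract (ignoring numbers larger than $t$, and using symmetry to ensure $t \le \Sigma_X/2$). This is not the near-linear-time algorithm of the paper; the function names `subset_sum` and `normalize` are assumptions chosen for illustration.

```python
def subset_sum(X, t):
    """Decide whether some subset of the positive integers X sums to t.

    Textbook bitset DP: bit s of `reachable` is set exactly when some
    subset of the numbers processed so far sums to s.
    """
    if t < 0:
        return False
    reachable = 1                # only the empty sum 0 is reachable at first
    mask = (1 << (t + 1)) - 1    # keep only sums in the range 0..t
    for x in X:
        reachable |= (reachable << x) & mask
    return bool((reachable >> t) & 1)


def normalize(X, t):
    """Apply the two without-loss-of-generality steps from the abstract.

    1. Numbers larger than t can never be part of a solution, so drop them.
    2. A subset sums to t exactly when its complement sums to total - t,
       so the target can be replaced by the smaller of the two.
    """
    X = [x for x in X if x <= t]
    total = sum(X)
    if 2 * t > total:
        t = total - t            # may become negative, meaning "no solution"
    return X, t
```

For example, with `X = [3, 5, 9, 14]` the target 17 is reachable (3 + 14), and `normalize(X, 17)` maps it to the equivalent target 14.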

    Clique-Based Lower Bounds for Parsing Tree-Adjoining Grammars

    up to lower order factors

    Faster Approximate Pattern Matching: A Unified Approach

    Approximate pattern matching is a natural and well-studied problem on strings: Given a text $T$, a pattern $P$, and a threshold $k$, find (the starting positions of) all substrings of $T$ that are at distance at most $k$ from $P$. We consider the two most fundamental string metrics: the Hamming distance and the edit distance. Under the Hamming distance, we search for substrings of $T$ that have at most $k$ mismatches with $P$, while under the edit distance, we search for substrings of $T$ that can be transformed to $P$ with at most $k$ edits. Exact occurrences of $P$ in $T$ have a very simple structure: If we assume for simplicity that $|T| \le 3|P|/2$ and trim $T$ so that $P$ occurs both as a prefix and as a suffix of $T$, then both $P$ and $T$ are periodic with a common period. However, an analogous characterization for the structure of occurrences with up to $k$ mismatches was proved only recently by Bringmann et al. [SODA'19]: Either there are $O(k^2)$ $k$-mismatch occurrences of $P$ in $T$, or both $P$ and $T$ are at Hamming distance $O(k)$ from strings with a common period $O(m/k)$. We tighten this characterization by showing that there are $O(k)$ $k$-mismatch occurrences in the case when the pattern is not (approximately) periodic, and we lift it to the edit distance setting, where we tightly bound the number of $k$-edit occurrences by $O(k^2)$ in the non-periodic case. Our proofs are constructive and let us obtain a unified framework for approximate pattern matching for both considered distances. We showcase the generality of our framework with results for the fully-compressed setting (where $T$ and $P$ are given as a straight-line program) and for the dynamic setting (where we extend a data structure of Gawrychowski et al. [SODA'18]). Comment: 74 pages, 7 figures, FOCS'2
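
The Hamming-distance variant of the problem can be illustrated with a naive Python sketch that checks every alignment of $P$ against $T$. It runs in time $O(|T| \cdot |P|)$ and only illustrates the problem definition; it is not the framework described in the abstract. The function name `k_mismatch_occurrences` is an assumption for illustration.

```python
def k_mismatch_occurrences(text, pattern, k):
    """Return all starting positions i such that text[i:i+len(pattern)]
    has at most k mismatches with pattern (naive check of every alignment).
    """
    m = len(pattern)
    result = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break        # this alignment already exceeds the threshold
        if mismatches <= k:
            result.append(i)
    return result
```

For instance, searching for `"abc"` in `"abcabdabc"` with `k = 1` reports positions 0, 3, and 6, whereas `k = 0` reports only the exact occurrences at 0 and 6.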

    Detecting and counting small subgraphs, and evaluating a parameterized Tutte polynomial: lower bounds via toroidal grids and Cayley graph expanders

    Given a graph property $\Phi$, we consider the problem $\mathtt{EdgeSub}(\Phi)$, where the input is a pair of a graph $G$ and a positive integer $k$, and the task is to decide whether $G$ contains a $k$-edge subgraph that satisfies $\Phi$. Specifically, we study the parameterized complexity of $\mathtt{EdgeSub}(\Phi)$ and of its counting problem $\#\mathtt{EdgeSub}(\Phi)$ with respect to both approximate and exact counting. We obtain a complete picture for minor-closed properties $\Phi$: the decision problem $\mathtt{EdgeSub}(\Phi)$ always admits an FPT algorithm and the counting problem $\#\mathtt{EdgeSub}(\Phi)$ always admits an FPTRAS. For exact counting, we present an exhaustive and explicit criterion on the property $\Phi$ which, if satisfied, yields fixed-parameter tractability and otherwise $\#\mathsf{W[1]}$-hardness. Additionally, most of our hardness results come with an almost tight conditional lower bound under the so-called Exponential Time Hypothesis, ruling out algorithms for $\#\mathtt{EdgeSub}(\Phi)$ that run in time $f(k)\cdot|G|^{o(k/\log k)}$ for any computable function $f$. As a main technical result, we gain a complete understanding of the coefficients of toroidal grids and selected Cayley graph expanders in the homomorphism basis of $\#\mathtt{EdgeSub}(\Phi)$. This allows us to establish hardness of exact counting using the Complexity Monotonicity framework due to Curticapean, Dell and Marx (STOC'17). Our methods can also be applied to a parameterized variant of the Tutte Polynomial $T^k_G$ of a graph $G$, to which many known combinatorial interpretations of values of the (classical) Tutte Polynomial can be extended. As an example, $T^k_G(2,1)$ corresponds to the number of $k$-forests in the graph $G$. Our techniques allow us to completely understand the parameterized complexity of computing the evaluation of $T^k_G$ at every pair of rational coordinates $(x,y)$.

    Faster Pattern Matching under Edit Distance

    We consider the approximate pattern matching problem under the edit distance. Given a text $T$ of length $n$, a pattern $P$ of length $m$, and a threshold $k$, the task is to find the starting positions of all substrings of $T$ that can be transformed to $P$ with at most $k$ edits. More than 20 years ago, Cole and Hariharan [SODA'98, J. Comput.'02] gave an $\mathcal{O}(n+k^4 \cdot n/m)$-time algorithm for this classic problem, and this runtime has not been improved since. Here, we present an algorithm that runs in time $\mathcal{O}(n+k^{3.5}\sqrt{\log m \log k} \cdot n/m)$, thus breaking through this long-standing barrier. In the case where $n^{1/4+\varepsilon} \leq k \leq n^{2/5-\varepsilon}$ for some arbitrarily small positive constant $\varepsilon$, our algorithm improves over the state of the art by polynomial factors: it is polynomially faster than both the algorithm of Cole and Hariharan and the classic $\mathcal{O}(kn)$-time algorithm of Landau and Vishkin [STOC'86, J. Algorithms'89]. We observe that the bottleneck case of the alternative $\mathcal{O}(n+k^4 \cdot n/m)$-time algorithm of Charalampopoulos, Kociumaka, and Wellnitz [FOCS'20] is when the text and the pattern are (almost) periodic. Our new algorithm reduces this case to a new dynamic problem (Dynamic Puzzle Matching), which we solve by building on tools developed by Tiskin [SODA'10, Algorithmica'15] for the so-called seaweed monoid of permutation matrices. Our algorithm relies only on a small set of primitive operations on strings and thus also applies to the fully-compressed setting (where text and pattern are given as straight-line programs) and to the dynamic setting (where we maintain a collection of strings under creation, splitting, and concatenation), improving over the state of the art.
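
For orientation, the problem definition above can be illustrated with the classical $\mathcal{O}(nm)$-time dynamic program usually attributed to Sellers, sketched here in Python. This is a swapped-in textbook baseline, not the algorithm of the paper and not one of the algorithms it cites. The dynamic program naturally reports end positions, so the sketch runs it on the reversed strings to obtain starting positions. Both function names are assumptions for illustration.

```python
def k_edit_end_positions(text, pattern, k):
    """Return all j such that some substring of text ending just before
    index j (that is, text[i:j] for some i) is within edit distance k of pattern.

    Column-wise DP: col[i] holds the minimum edit distance between
    pattern[:i] and any substring of text ending at the current position.
    """
    m = len(pattern)
    col = list(range(m + 1))          # position 0: only the empty substring
    ends = [0] if m <= k else []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]            # value for (i-1, j-1)
        col[0] = 0                    # a match may start at any text position
        for i in range(1, m + 1):
            tmp = col[i]              # value for (i, j-1)
            col[i] = min(
                prev_diag + (pattern[i - 1] != c),  # match or substitution
                tmp + 1,                            # extra character in text
                col[i - 1] + 1,                     # missing pattern character
            )
            prev_diag = tmp
        if col[m] <= k:
            ends.append(j)
    return ends


def k_edit_start_positions(text, pattern, k):
    """Starting positions of substrings within edit distance k of pattern,
    obtained by running the end-position DP on the reversed strings."""
    n = len(text)
    ends_rev = k_edit_end_positions(text[::-1], pattern[::-1], k)
    return sorted(n - j for j in ends_rev)
```

For example, with text `"abcdefg"`, pattern `"cde"`, and `k = 1`, the starting positions are 1, 2, and 3 (witnessed by `"bcde"`, `"cde"`, and `"de"`).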