Pattern avoidance in binary trees
This paper considers the enumeration of trees avoiding a contiguous pattern.
We provide an algorithm for computing the generating function that counts
n-leaf binary trees avoiding a given binary tree pattern t. Equipped with this
counting mechanism, we study the analogue of Wilf equivalence in which two tree
patterns are equivalent if the respective n-leaf trees that avoid them are
equinumerous. We investigate the equivalence classes combinatorially. Toward
establishing bijective proofs of tree pattern equivalence, we develop a general
method of restructuring trees that conjecturally succeeds in producing an
explicit bijection for each pair of equivalent tree patterns.
Comment: 19 pages, many images; published version
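A small brute-force sketch (my own illustration, not the paper's generating-function algorithm) makes the avoidance notion and the equivalence concrete: encode an n-leaf binary tree as nested tuples, test contiguous containment of a pattern, and count avoiders.

```python
# Illustrative brute force: a leaf is (), an internal node is (left, right).
def trees(n):
    """Generate all binary trees with n leaves."""
    if n == 1:
        yield ()
        return
    for k in range(1, n):
        for left in trees(k):
            for right in trees(n - k):
                yield (left, right)

def matches_at(t, p):
    """Pattern p matches at the root of t; a pattern leaf matches any subtree."""
    if p == ():
        return True
    if t == ():
        return False
    return matches_at(t[0], p[0]) and matches_at(t[1], p[1])

def contains(t, p):
    """True if p matches at some node of t (contiguous containment)."""
    if matches_at(t, p):
        return True
    return t != () and (contains(t[0], p) or contains(t[1], p))

def avoiders(n, p):
    """Number of n-leaf binary trees avoiding pattern p."""
    return sum(1 for t in trees(n) if not contains(t, p))
```

For example, the two 3-leaf patterns (the left and right combs) are mirror images, so they are trivially equivalent in the above sense: each is avoided by exactly one n-leaf tree, the opposite comb.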
The complexity of counting poset and permutation patterns
We introduce a notion of pattern occurrence that generalizes both classical
permutation patterns as well as poset containment. Many questions about pattern
statistics and avoidance generalize naturally to this setting, and we focus on
functional complexity problems -- particularly those that arise by constraining
the order dimensions of the pattern and text posets. We show that counting the
number of induced, injective occurrences among dimension 2 posets is #P-hard;
enumerating the linear extensions that occur in realizers of dimension 2 posets
can be done in polynomial time, while for unconstrained dimension it is
GI-complete; counting not necessarily induced, injective occurrences among
dimension 2 posets is #P-hard; counting injective or not necessarily injective
occurrences of an arbitrary pattern in a dimension 1 text is #P-hard, although
it is in FP if the pattern poset is constrained to have bounded intrinsic
width; and counting injective occurrences of a dimension 1 pattern in an
arbitrary text is #P-hard, while it is in FP for bounded dimension texts. This
framework easily leads to a number of open questions, chief among which are (1)
is it #P-hard to count the number of occurrences of a dimension 2 pattern in a
dimension 1 text, and (2) is it #P-hard to count the number of texts which
avoid a given pattern?
Comment: 15 pages
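As a point of reference for the dimension-1 case: a dimension-1 poset is a linear order, so a dimension-1 text is in effect a permutation, and an injective occurrence of a permutation pattern is an index subset whose values are order-isomorphic to the pattern. The brute force below (my own illustration; exponential in the pattern length, consistent with the #P-hardness results above) fixes that semantics.

```python
from itertools import combinations

def count_occurrences(pattern, text):
    """Count injective occurrences of permutation `pattern` in permutation
    `text`: index subsets whose values are order-isomorphic to the pattern."""
    k = len(pattern)
    # rank[i] is the index of the i-th smallest pattern value
    rank = sorted(range(k), key=lambda i: pattern[i])
    def matches(idxs):
        vals = [text[i] for i in idxs]
        return sorted(range(k), key=lambda i: vals[i]) == rank
    return sum(1 for idxs in combinations(range(len(text)), k) if matches(idxs))
```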
Internal Pattern Matching Queries in a Text and Applications
We consider several types of internal queries: questions about subwords of a
text. As the main tool we develop an optimal data structure for the problem
called here internal pattern matching. This data structure provides
constant-time answers to queries about occurrences of one subword x in
another subword y of a given text, assuming that |y| = O(|x|),
which allows for a constant-space representation of all occurrences. This
problem can be viewed as a natural extension of the well-studied pattern
matching problem. The data structure has linear size and admits a linear-time
construction algorithm.
Using the solution to the internal pattern matching problem, we obtain very
efficient data structures answering queries about: primitivity of subwords,
periods of subwords, general substring compression, and cyclic equivalence of
two subwords. All these results improve upon the best previously known
counterparts. The linear construction time of our data structure also allows
us to improve the algorithm for finding δ-subrepetitions in a text (a more
general version of maximal repetitions, also called runs). For any fixed
δ > 0 we obtain the first linear-time algorithm, which matches the linear
time complexity of the algorithm computing runs. Our data structure has already
been used as a part of the efficient solutions for subword suffix rank &
selection, as well as substring compression using Burrows-Wheeler transform
composed with run-length encoding.
Comment: 31 pages, 9 figures; accepted to SODA 201
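A naive reference implementation (my own, for illustration only) fixes the semantics of an internal pattern matching query: report, in text coordinates, where the subword text[i1:j1] occurs inside the subword text[i2:j2]. The paper's data structure answers such queries in constant time after linear-time preprocessing; the key structural fact is that when |y| < 2|x|, the occurrences of x in y form a single arithmetic progression (a consequence of the periodicity lemma), which is the constant-space representation the abstract refers to.

```python
def internal_occurrences(text, i1, j1, i2, j2):
    """Naive internal pattern matching: starting positions (in text
    coordinates) of text[i1:j1] inside text[i2:j2]. O(|y|*|x|) time."""
    x = text[i1:j1]
    y = text[i2:j2]
    m = len(x)
    return [i2 + s for s in range(len(y) - m + 1) if y[s:s + m] == x]
```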
Reverse-Safe Data Structures for Text Indexing
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O(n) and answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method in data analysis applications gives insignificant or no data utility loss. Finally, we show how our technique can be extended to support applications under a realistic adversary model.
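A toy model (my own simplification, not the paper's construction) conveys the z-reverse-safe idea: treat the "answers" as the occurrence counts of every pattern of length at most d, and count how many texts over the alphabet are indistinguishable from the original under those answers. That count is the z for which this answer set is z-reverse-safe; note how it shrinks as d grows, which is the utility/privacy trade-off the abstract balances.

```python
from itertools import product
from collections import Counter

def answer_profile(text, d):
    """Occurrence-count answers for all patterns of length <= d, modelled
    as one substring-count multiset per pattern length."""
    return tuple(
        frozenset(Counter(text[i:i + k] for i in range(len(text) - k + 1)).items())
        for k in range(1, d + 1)
    )

def reversal_safety(text, d, alphabet="ab"):
    """Number z of texts (including `text` itself) indistinguishable from
    `text` by occurrence-count queries of length <= d. Exponential; toy sizes only."""
    target = answer_profile(text, d)
    return sum(1 for t in product(alphabet, repeat=len(text))
               if answer_profile("".join(t), d) == target)
```

For instance, "abab" is 6-reverse-safe under length-1 answers (any string with two a's and two b's matches) but only 1-reverse-safe once length-2 answers are exposed.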
Universal Compressed Text Indexing
The rise of repetitive datasets has lately generated a lot of interest in
compressed self-indexes based on dictionary compression, a rich and
heterogeneous family that exploits text repetitions in different ways. For each
such compression scheme, several different indexing solutions have been
proposed in the last two decades. To date, the fastest indexes for repetitive
texts are based on the run-length compressed Burrows-Wheeler transform and on
the Compact Directed Acyclic Word Graph. The most space-efficient indexes, on
the other hand, are based on the Lempel-Ziv parsing and on grammar compression.
Indexes for more universal schemes such as collage systems and macro schemes
have not yet been proposed. Very recently, Kempa and Prezza [STOC 2018] showed
that all dictionary compressors can be interpreted as approximation algorithms
for the smallest string attractor, that is, a set of text positions capturing
all distinct substrings. Starting from this observation, in this paper we
develop the first universal compressed self-index, that is, the first indexing
data structure based on string attractors, which can therefore be built on top
of any dictionary-compressed text representation. Let γ be the size of a
string attractor for a text of length n. Our index takes O(γ log(n/γ))
words of space and supports locating the occ occurrences of any pattern of
length m in O(m log n + occ log^ε n) time, for any constant
ε > 0. This is, in particular, the first index
for general macro schemes and collage systems. Our result shows that the
relation between indexing and compression is much deeper than what was
previously thought: the simple property standing at the core of all dictionary
compressors is sufficient to support fast indexed queries.
Comment: Fixed with reviewer's comment
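The string-attractor property can be checked directly from the definition given above: every distinct substring must have at least one occurrence that crosses an attractor position. The cubic-time checker below is my own illustration for toy inputs, not part of the index.

```python
def is_attractor(text, positions):
    """True if `positions` is a string attractor of `text`: every distinct
    substring has an occurrence [i, j) containing some attractor position."""
    pos = set(positions)
    n = len(text)
    covered = {}
    for i in range(n):
        for j in range(i + 1, n + 1):
            hit = any(i <= p < j for p in pos)
            covered[text[i:j]] = covered.get(text[i:j], False) or hit
    return all(covered.values())
```

For example, {0, 1} is a string attractor of "abab" (so γ = 2 suffices there), while {0} alone is not, since no occurrence of "b" crosses position 0.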