    Sliding Window Property Testing for Regular Languages

    We study the problem of recognizing regular languages in a variant of the streaming model of computation, called the sliding window model. In this model, we are given a window size n and a stream of symbols. At each time instant, we must decide whether the suffix of length n of the current stream ("the active window") belongs to a given regular language. Recent works [Moses Ganardi et al., 2018; Moses Ganardi et al., 2016] showed that the space complexity of an optimal deterministic sliding window algorithm for this problem is either constant, logarithmic, or linear in the window size n, and provided natural language-theoretic characterizations of the space complexity classes. Subsequently, [Moses Ganardi et al., 2018] extended this result to randomized algorithms, showing that any such algorithm admits either constant, doubly logarithmic, logarithmic, or linear space complexity. In this work, we make an important step forward and combine the sliding window model with the property testing setting, which results in ultra-efficient algorithms for all regular languages. Informally, a sliding window property tester must accept the active window if it belongs to the language and reject it if it is far from the language. We show that for every regular language, there is a deterministic sliding window property tester that uses logarithmic space and a randomized sliding window property tester with two-sided error that uses constant space.
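The sliding window model described in this abstract can be illustrated with a naive baseline. The sketch below is not the paper's algorithm: it stores the whole active window (linear space, the most expensive of the complexity classes mentioned) and re-runs a DFA on it after every symbol. The function name, the dictionary encoding of the transition function, and the example DFA are all hypothetical, and the window is assumed to be the entire prefix until n symbols have arrived.

```python
from collections import deque


def sliding_window_membership(stream, n, delta, start, accepting):
    """After each symbol, yield whether the active window is accepted by the DFA.

    Naive baseline: keeps the last n symbols explicitly (linear space).
    """
    window = deque(maxlen=n)  # the oldest symbol is dropped automatically
    for symbol in stream:
        window.append(symbol)
        state = start
        for c in window:  # re-run the DFA over the whole window
            state = delta[(state, c)]
        yield state in accepting


# Hypothetical example DFA: words over {a, b} with an even number of a's.
EVEN_A = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}

answers = list(sliding_window_membership("abaab", 3, EVEN_A, 0, {0}))
print(answers)
```

The space-efficient algorithms and property testers in the abstract avoid storing the window in full; this baseline only fixes the input/output behaviour they must match or approximate.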

    Low-Latency Sliding Window Algorithms for Formal Languages

    Low-latency sliding window algorithms for regular and context-free languages are studied, where latency refers to the worst-case time spent for a single window update or query. For every regular language $L$ it is shown that there exists a constant-latency solution that supports adding and removing symbols independently on both ends of the window (the so-called two-way variable-size model). We prove that this result extends to all visibly pushdown languages. For deterministic 1-counter languages we present an $\mathcal{O}(\log n)$ latency sliding window algorithm for the two-way variable-size model, where $n$ refers to the window size. We complement these results with a conditional lower bound: there exists a fixed real-time deterministic context-free language $L$ such that, assuming the OMV (online matrix vector multiplication) conjecture, there is no sliding window algorithm for $L$ with latency $n^{1/2-\epsilon}$ for any $\epsilon>0$, even in the most restricted sliding window model (one-way fixed-size model). The above-mentioned results all refer to the unit-cost RAM model with logarithmic word size. For regular languages we also present a refined picture using word sizes $\mathcal{O}(1)$, $\mathcal{O}(\log\log n)$, and $\mathcal{O}(\log n)$. Comment: A short version will be presented at the conference FSTTCS 202

    Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages

    In this work, we revisit the problem of testing membership in regular languages, first studied by Alon et al. [Alon et al., 2001]. We develop a one-sided error property tester for regular languages under weighted edit distance that makes $\mathcal{O}(\varepsilon^{-1} \log(1/\varepsilon))$ non-adaptive queries, assuming that the language is described by an automaton of constant size. Moreover, we show a matching lower bound, essentially closing the problem for the edit distance. As an application, we improve the space bound of the current best streaming property testing algorithm for visibly pushdown languages from $\mathcal{O}(\varepsilon^{-4} \log^{6} n)$ to $\mathcal{O}(\varepsilon^{-3} \log^{5} n \log\log n)$, where $n$ is the size of the input. Finally, we provide an $\Omega(\max(\varepsilon^{-1}, \log n))$ lower bound on the memory necessary to test visibly pushdown languages in the streaming model, significantly narrowing the gap between the known bounds.

    Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources

    Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization. Comment: SIGMOD'1

    Visibly Pushdown Languages over Sliding Windows

    We investigate the class of visibly pushdown languages in the sliding window model. A sliding window algorithm for a language L receives a stream of symbols and has to decide at each time step whether the suffix of length n belongs to L or not. The window size n is either a fixed number (in the fixed-size model) or can be controlled by an adversary in a limited way (in the variable-size model). The main result of this paper states that for every visibly pushdown language the space complexity in the variable-size sliding window model is either constant, logarithmic, or linear in the window size. This extends previous results for regular languages.

    Subsequences in Bounded Ranges: Matching and Analysis Problems

    In this paper, we consider a variant of the classical algorithmic problem of checking whether a given word $v$ is a subsequence of another word $w$. More precisely, we consider the problem of deciding, given a number $p$ (defining a range-bound) and two words $v$ and $w$, whether there exists a factor $w[i:i+p-1]$ (or, in other words, a range of length $p$) of $w$ having $v$ as subsequence (i.e., $v$ occurs as a subsequence in the bounded range $w[i:i+p-1]$). We give matching upper and lower quadratic bounds for the time complexity of this problem. Further, we consider a series of algorithmic problems in this setting, in which, for given integers $k$, $p$ and a word $w$, we analyse the set $p$-$\mathrm{Subseq}_{k}(w)$ of all words of length $k$ which occur as subsequence of some factor of length $p$ of $w$. Among these, we consider the $k$-universality problem, the $k$-equivalence problem, as well as problems related to absent subsequences. Surprisingly, unlike the case of the classical model of subsequences in words where such problems have efficient solutions in general, we show that most of these problems become intractable in the new setting when subsequences in bounded ranges are considered. Finally, we provide an example of how some of our results can be applied to subsequence matching problems for circular words. Comment: Extended version of a paper which will appear in the proceedings of the 16th International Conference on Reachability Problems, RP 202
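The bounded-range matching problem stated in this abstract can be made concrete with a direct check. This is a minimal sketch, not the algorithm from the paper: it tries every factor of length p of w and tests greedily whether v is a subsequence of it, which takes time proportional to |w| times p. The function names are hypothetical, indices are 0-based (the abstract's factor w[i:i+p-1] corresponds to the Python slice w[i:i+p]), and it is assumed that the answer is negative when w has no factor of length p.

```python
def is_subsequence(v, u):
    """Greedy test: scan u once, matching the symbols of v in order."""
    remaining = iter(u)
    # `c in remaining` advances the iterator up to and including the match.
    return all(c in remaining for c in v)


def occurs_in_bounded_range(v, w, p):
    """Return True if some factor of length p of w has v as a subsequence."""
    return any(
        is_subsequence(v, w[i:i + p])
        for i in range(len(w) - p + 1)  # empty range when p > len(w)
    )


print(occurs_in_bounded_range("ab", "axxb", 3))  # factors: "axx", "xxb"
print(occurs_in_bounded_range("ab", "axxb", 4))  # factor: "axxb"
```

With p equal to the length of w this reduces to the classical subsequence test, which is the case the abstract contrasts against.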