331 research outputs found

    The Heaviest Induced Ancestors Problem Revisited

    We revisit the heaviest induced ancestors problem, which has several interesting applications in string matching. Let T_1 and T_2 be two weighted trees, where the weight W(u) of a node u in either of the two trees is more than the weight of u's parent. Additionally, the leaves in both trees are labeled and the labeling of the leaves in T_2 is a permutation of those in T_1. A node x in T_1 and a node y in T_2 are induced iff their subtrees have at least one common leaf label. A heaviest induced ancestor query HIA(u_1,u_2) is: given a node u_1 in T_1 and a node u_2 in T_2, output the pair (u_1^*,u_2^*) of induced nodes with the highest combined weight W(u_1^*) + W(u_2^*), such that u_1^* is an ancestor of u_1 and u_2^* is an ancestor of u_2. Let n be the number of nodes in both trees combined and epsilon > 0 be an arbitrarily small constant. Gagie et al. [CCCG'13] introduced this problem and proposed three solutions with the following space-time trade-offs:
    - an O(n log^2 n)-word data structure with O(log n log log n) query time
    - an O(n log n)-word data structure with O(log^2 n) query time
    - an O(n)-word data structure with O(log^{3+epsilon} n) query time.
    In this paper, we revisit this problem and present new data structures with improved bounds. Our results are as follows:
    - an O(n log n)-word data structure with O(log n log log n) query time
    - an O(n)-word data structure with O(log^2 n/log log n) query time.
    As a corollary, we also improve the LZ compressed index of Gagie et al. [CCCG'13] for answering longest common substring (LCS) queries. Additionally, we show that the LCS after one edit problem of size n [Amir et al., SPIRE'17] can also be reduced to the heaviest induced ancestors problem over two trees of n nodes in total. This yields a straightforward improvement over its current solution of O(n log^3 n) space and O(log^3 n) query time.
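The query definition above can be made concrete with a brute-force reference implementation. This is only a sketch for checking small cases, not any of the data structures from the abstract: it takes time proportional to the product of the two ancestor-chain lengths per query. The `Tree` class, the function name `hia_naive`, the example trees, and the convention that a node counts as its own ancestor are all assumptions made for illustration.

```python
class Tree:
    """Rooted weighted tree given by parent pointers (hypothetical helper)."""

    def __init__(self, parent, weight, leaf_label):
        self.parent = parent          # node -> parent, root maps to None
        self.weight = weight          # node -> W(node)
        # labels[v] = set of leaf labels found in the subtree rooted at v
        self.labels = {v: set() for v in parent}
        for leaf, label in leaf_label.items():
            v = leaf
            while v is not None:
                self.labels[v].add(label)
                v = parent[v]

    def ancestors(self, v):
        """Yield v, its parent, ..., up to the root (v counts as an ancestor)."""
        while v is not None:
            yield v
            v = self.parent[v]


def hia_naive(t1, u1, t2, u2):
    """Return (combined weight, a1, a2) for the heaviest induced ancestor pair."""
    best = None
    for a1 in t1.ancestors(u1):
        for a2 in t2.ancestors(u2):
            # a1 and a2 are induced iff their subtrees share a leaf label
            if t1.labels[a1] & t2.labels[a2]:
                total = t1.weight[a1] + t2.weight[a2]
                if best is None or total > best[0]:
                    best = (total, a1, a2)
    return best


# Small made-up instance; every node is heavier than its parent.
t1 = Tree(
    parent={"r": None, "a": "r", "b": "r", "l1": "a", "l2": "a", "l3": "b"},
    weight={"r": 0, "a": 2, "b": 3, "l1": 5, "l2": 4, "l3": 6},
    leaf_label={"l1": "x", "l2": "y", "l3": "z"},
)
t2 = Tree(
    parent={"s": None, "c": "s", "d": "s", "m1": "c", "m2": "c", "m3": "d"},
    weight={"s": 0, "c": 1, "d": 2, "m1": 3, "m2": 4, "m3": 5},
    leaf_label={"m1": "x", "m2": "z", "m3": "y"},
)

print(hia_naive(t1, "l1", t2, "m3"))  # (7, 'a', 'm3')
```

In this instance the leaves `l1` and `m3` carry different labels, so the best pair has to move up in T_1 to node `a`, whose subtree contains label `y`.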

    Range Shortest Unique Substring queries

    Let T[1, n] be a string of length n and T[i, j] be the substring of T starting at position i and ending at position j. A substring T[i, j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and in information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α, β], return a shortest substring T[i, j] of T with exactly one occurrence in [α, β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018].

    Panel on “Past and future of computer science theory”

    The twenty-ninth edition of the SEBD (Italian Symposium on Advanced Database Systems), held on 5-9 September 2021 in Pizzo (Calabria Region, Italy), included a joint seminar on “Reminiscence of TIDB 1981” with invited talks given by some of the participants in the Advanced Seminar on Theoretical Issues in Databases (TIDB), which took place in the same region exactly forty years earlier. The joint seminar concluded with a Panel on “The Past and the Future of Computer Science Theory” with the participation of four distinguished computer science theorists (Ronald Fagin, Georg Gottlob, Christos Papadimitriou and Moshe Vardi), who were interviewed by Giorgio Ausiello, Maurizio Lenzerini, Luigi Palopoli, Domenico Saccà and Francesco Scarcello. This paper reports the summaries of the four interviews.

    Polynomial Vector Addition Systems With States

    The reachability problem for vector addition systems is one of the most difficult and central problems in theoretical computer science. The problem is known to be decidable, but despite intense investigation during the last four decades, the exact complexity is still open. For some sub-classes, the complexity of the reachability problem is known. Structurally bounded vector addition systems, the class of vector addition systems with finite reachability sets from any initial configuration, is one of those classes. In fact, the reachability problem was shown to be polynomial-space complete for that class by Praveen and Lodaya in 2008. Surprisingly, extending this property to vector addition systems with states is open. In fact, there exist vector addition systems with states that are structurally bounded but with Ackermannian large sets of reachable configurations. It follows that the reachability problem for that class is between exponential space and Ackermannian. In this paper we introduce the class of polynomial vector addition systems with states, defined as the class of vector addition systems with states with size of reachable configurations bounded polynomially in the size of the initial ones. We prove that the reachability problem for polynomial vector addition systems is exponential-space complete. Additionally, we show that we can decide in polynomial time if a vector addition system with states is polynomial. This characterization introduces the notion of iteration scheme with potential applications to the reachability problem for general vector addition systems
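To make the reachability question concrete, here is a minimal sketch of explicit-state search for a vector addition system with states. It is a plain breadth-first exploration, not the iteration-scheme technique of the abstract, and it only terminates with a negative answer when the reachable set is finite; the function name `vass_reachable`, the transition encoding, and the `max_configs` safety cap are assumptions made for illustration.

```python
from collections import deque


def vass_reachable(transitions, source, target, max_configs=100_000):
    """Breadth-first search over configurations (state, counter vector).

    transitions: list of (state, delta, next_state) triples, where delta is a
    tuple of integers added to the counters. Counters must stay non-negative.
    Raises RuntimeError if more than max_configs configurations are explored.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        state, vec = queue.popleft()
        if (state, vec) == target:
            return True
        for p, delta, q in transitions:
            if p != state:
                continue
            nxt = tuple(x + d for x, d in zip(vec, delta))
            if any(x < 0 for x in nxt):
                continue  # transition is disabled: a counter would go negative
            cfg = (q, nxt)
            if cfg not in seen:
                if len(seen) >= max_configs:
                    raise RuntimeError("exploration cap reached")
                seen.add(cfg)
                queue.append(cfg)
    return False


# Made-up two-state system: move tokens from counter 0 to counter 1 in state
# p, switch to q, then drain counter 1. The reachable set is finite.
T = [
    ("p", (-1, 1), "p"),
    ("p", (0, 0), "q"),
    ("q", (0, -1), "q"),
]

print(vass_reachable(T, ("p", (2, 0)), ("q", (0, 1))))  # True
print(vass_reachable(T, ("p", (2, 0)), ("q", (3, 0))))  # False
```

The cap matters because a system with an infinite reachable set would otherwise loop forever when the target is unreachable.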

    Approximating Cumulative Pebbling Cost Is Unique Games Hard

    The cumulative pebbling complexity of a directed acyclic graph $G$ is defined as $\mathsf{cc}(G) = \min_P \sum_i |P_i|$, where the minimum is taken over all legal (parallel) black pebblings of $G$ and $|P_i|$ denotes the number of pebbles on the graph during round $i$. Intuitively, $\mathsf{cc}(G)$ captures the amortized Space-Time complexity of pebbling $m$ copies of $G$ in parallel. The cumulative pebbling complexity of a graph $G$ is of particular interest in the field of cryptography as $\mathsf{cc}(G)$ is tightly related to the amortized Area-Time complexity of the Data-Independent Memory-Hard Function (iMHF) $f_{G,H}$ [AS15] defined using a constant indegree directed acyclic graph (DAG) $G$ and a random oracle $H(\cdot)$. A secure iMHF should have amortized Space-Time complexity as high as possible, e.g., to deter a brute-force password attacker who wants to find $x$ such that $f_{G,H}(x) = h$. Thus, to analyze the (in)security of a candidate iMHF $f_{G,H}$, it is crucial to estimate the value $\mathsf{cc}(G)$, but currently, upper and lower bounds for leading iMHF candidates differ by several orders of magnitude. Blocki and Zhou recently showed that it is $\mathsf{NP}$-Hard to compute $\mathsf{cc}(G)$, but their techniques do not even rule out an efficient $(1+\varepsilon)$-approximation algorithm for any constant $\varepsilon > 0$. We show that for any constant $c > 0$, it is Unique Games hard to approximate $\mathsf{cc}(G)$ to within a factor of $c$. (See the paper for the full abstract.) Comment: 28 pages, updated figures and corrected typo
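The sum inside the definition of cc(G) is easy to evaluate for one fixed pebbling, and a small sketch may help. The code below checks that a given pebbling is legal and returns its cost, the sum of |P_i| over all rounds; it does not compute the minimum over all pebblings, which the abstract states is NP-hard. The legality rule used here (starting from an empty graph, a pebble may be placed on a node in a round only if all its parents carried pebbles in the previous round) is the usual parallel black pebbling rule and is an assumption, since the abstract does not spell it out. The function name `pebbling_cost` and the example graph are hypothetical.

```python
def pebbling_cost(parents, pebbling, targets):
    """Return sum_i |P_i| for a legal parallel black pebbling.

    parents:  dict mapping each node to an iterable of its parents in the DAG
    pebbling: list of sets P_1, ..., P_t (P_0 is taken to be empty)
    targets:  nodes that must carry a pebble in at least one round
    Raises ValueError if the pebbling is illegal or misses a target.
    """
    prev = set()
    ever_pebbled = set()
    cost = 0
    for rnd, current in enumerate(pebbling, start=1):
        current = set(current)
        for v in current - prev:  # pebbles newly placed in this round
            if not set(parents.get(v, ())) <= prev:
                raise ValueError(f"round {rnd}: node {v!r} placed illegally")
        cost += len(current)
        ever_pebbled |= current
        prev = current
    if not set(targets) <= ever_pebbled:
        raise ValueError("some target node was never pebbled")
    return cost


# Path graph 1 -> 2 -> 3 with node 3 as the only sink.
parents = {1: [], 2: [1], 3: [2]}

print(pebbling_cost(parents, [{1}, {2}, {3}], targets=[3]))     # 3
print(pebbling_cost(parents, [{1}, {1, 2}, {3}], targets=[3]))  # 4
```

Keeping the pebble on node 1 for an extra round is legal but wasteful, which is exactly the kind of difference the minimum in cc(G) ranges over.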

    Prediction based task scheduling in distributed computing


    Tree Buffers

    In runtime verification, the central problem is to decide if a given program execution violates a given property. In online runtime verification, a monitor observes a program’s execution as it happens. If the program being observed has hard real-time constraints, then the monitor inherits them. In the presence of hard real-time constraints it becomes a challenge to maintain enough information to produce error traces, should a property violation be observed. In this paper we introduce a data structure, called tree buffer, that solves this problem in the context of automata-based monitors: If the monitor itself respects hard real-time constraints, then enriching it by tree buffers makes it possible to provide error traces, which are essential for diagnosing defects. We show that tree buffers are also useful in other application domains. For example, they can be used to implement functionality of capturing groups in regular expressions. We prove optimal asymptotic bounds for our data structure, and validate them using empirical data from two sources: regular expression searching through Wikipedia, and runtime verification of execution traces obtained from the DaCapo test suite

    Efficient data structures for range shortest unique substring queries

    Let T[1, n] be a string of length n and T[i, j] be the substring of T starting at position i and ending at position j. A substring T[i, j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α, β], return a shortest substring T[i, j] of T with exactly one occurrence in [α, β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(√n log^ε n) query time, where ε > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
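The query semantics can be pinned down with a brute-force baseline. The sketch below is not the data structure from the abstract: it rescans the window on every query and is only meant for checking small inputs. It assumes that both the returned substring and the occurrences being counted lie entirely inside T[α, β], which is one reading of the query definition, and it uses 1-based inclusive positions as in the abstract. The function name `range_sus_naive` is hypothetical.

```python
def range_sus_naive(text, alpha, beta):
    """Return (i, j), 1-based inclusive, of a shortest substring of
    text[alpha..beta] that occurs exactly once inside that window.

    Ties are broken by the leftmost starting position. Returns None for an
    empty window.
    """
    window = text[alpha - 1:beta]
    m = len(window)
    for length in range(1, m + 1):
        # Record the start offsets of every substring of this length.
        starts_by_sub = {}
        for start in range(m - length + 1):
            sub = window[start:start + length]
            starts_by_sub.setdefault(sub, []).append(start)
        # Dicts preserve insertion order, so this scans left to right.
        for starts in starts_by_sub.values():
            if len(starts) == 1:
                i = alpha + starts[0]
                return (i, i + length - 1)
    return None


print(range_sus_naive("abaab", 1, 5))  # (2, 3), the substring "ba"
print(range_sus_naive("abaab", 2, 4))  # (2, 2), the substring "b"
```

The second query shows why the range variant differs from the global problem: "b" is a repeat in the whole string but unique inside the window [2, 4].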