661 research outputs found

    Path and Ancestor Queries over Trees with Multidimensional Weight Vectors

    We consider an ordinal tree T on n nodes, with each node assigned a d-dimensional weight vector w in {1,2,...,n}^d, where d in N is a constant. We study path queries as generalizations of the well-known orthogonal range queries, with one of the dimensions being tree topology rather than a linear order. Since in our definitions d counts only the dimensions of the weight vector and does not include the tree topology, a path query in a tree with d-dimensional weight vectors generalizes the corresponding (d+1)-dimensional orthogonal range query. We solve the ancestor dominance reporting problem, a direct generalization of the dominance reporting problem, in O(lg^{d-1} n + k) time and O(n lg^{d-2} n) words of space, where k is the size of the output, for d >= 2. We also achieve a tradeoff of O(n lg^{d-2+epsilon} n) words of space with query time O((lg^{d-1} n)/(lg lg n)^{d-2} + k) for the same problem when d >= 3. We solve the path successor problem in O(n lg^{d-1} n) words of space and O(lg^{d-1+epsilon} n) time for d >= 1 and an arbitrary constant epsilon > 0. We propose a solution to the path counting problem with O(n (lg n / lg lg n)^{d-1}) words of space and O((lg n / lg lg n)^d) query time, for d >= 1. Finally, we solve the path reporting problem in O(n lg^{d-1+epsilon} n) words of space and O((lg^{d-1} n)/(lg lg n)^{d-2} + k) query time, for d >= 2. These results match or nearly match the best tradeoffs of the respective range queries. We are also the first to solve path successor even for d = 1.
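To make the query type concrete, here is a minimal brute-force sketch of a path counting query. It is not the paper's data structure: it walks the path node by node, so it takes time linear in the path length. The abstract does not spell out the definition, so the reading used here (count the nodes on the u-to-v path whose weight vectors lie in an axis-aligned box) is an assumption, as are the names `path_nodes`, `path_count`, and the `parent`/`depth` array representation.

```python
from typing import List, Sequence, Tuple

Box = Sequence[Tuple[int, int]]  # one closed interval [lo, hi] per weight dimension


def path_nodes(parent: List[int], depth: List[int], u: int, v: int) -> List[int]:
    """Return the nodes on the tree path from u to v (both inclusive)."""
    left, right = [], []
    while u != v:
        # Always climb from the deeper endpoint; both meet at the lowest common ancestor.
        if depth[u] >= depth[v]:
            left.append(u)
            u = parent[u]
        else:
            right.append(v)
            v = parent[v]
    return left + [u] + right[::-1]


def path_count(parent: List[int], depth: List[int],
               weights: List[Tuple[int, ...]], u: int, v: int, box: Box) -> int:
    """Count nodes on the u-v path whose weight vector lies inside the box."""
    return sum(
        all(lo <= w <= hi for w, (lo, hi) in zip(weights[x], box))
        for x in path_nodes(parent, depth, u, v)
    )


# Hypothetical 5-node tree: node 0 is the root, 1 and 2 are its children,
# 3 and 4 are children of 1. Weight vectors have d = 2.
parent = [-1, 0, 0, 1, 1]
depth = [0, 1, 1, 2, 2]
weights = [(3, 3), (1, 4), (2, 2), (5, 1), (4, 4)]
```

With d = 1 and a tree that is a single chain, the same routine degenerates to a two-dimensional range count, which is the sense in which path queries generalize orthogonal range queries.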

    Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing

    This paper presents a general technique for optimally transforming any dynamic data structure that operates on atomic and indivisible keys by constant-time comparisons into a data structure that handles unbounded-length keys whose comparison cost is not a constant. Examples of these keys are strings, multi-dimensional points, multiple-precision numbers, multi-key data (e.g. records), XML paths, URL addresses, etc. The technique is more general than what has been done in previous work, as no particular exploitation of the underlying structure is required. The only requirement is that the insertion of a key must identify its predecessor or its successor. Using the proposed technique, an online suffix tree can be constructed in worst-case time O(log n) per input symbol (as opposed to amortized O(log n) time per symbol, achieved by previously known algorithms). To our knowledge, our algorithm is the first that achieves O(log n) worst-case time per input symbol. Searching for a pattern of length m in the resulting suffix tree takes O(min(m log |Σ|, m + log n) + tocc) time, where tocc is the number of occurrences of the pattern. The paper also describes more applications and shows how to obtain alternative methods for dealing with suffix sorting, dynamic lowest common ancestors and order maintenance.

    Data Structures for Categorical Path Counting Queries

    Range Updates and Range Sum Queries on Multidimensional Points with Monoid Weights

    Let P be a set of n points in ℝ^d where each point p ∈ P carries a weight drawn from a commutative monoid (ℳ, +, 0). Given a d-rectangle r_upd (i.e., an orthogonal rectangle in ℝ^d) and a value Δ ∈ ℳ, a range update adds Δ to the weight of every point p ∈ P ∩ r_upd; given a d-rectangle r_qry, a range sum query returns the total weight of the points in P ∩ r_qry. The goal is to store P in a structure to support updates and queries with attractive performance guarantees. We describe a structure of Õ(n) space that handles an update in Õ(T_upd) time and a query in Õ(T_qry) time for arbitrary functions T_upd(n) and T_qry(n) satisfying T_upd · T_qry = n. The result holds for any fixed dimensionality d ≥ 2. Our query-update tradeoff is tight up to a polylog factor subject to the OMv-conjecture.
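The two operations defined in this abstract can be illustrated with a naive baseline that scans every point, costing O(n) per update and per query. This is only a reference implementation of the problem statement, not the structure the paper describes. The class name `NaiveRangeStructure`, the parameter names `add`, `zero`, and `delta`, and the choice to start every weight at the monoid identity are assumptions made for illustration.

```python
from typing import Callable, Generic, List, Sequence, Tuple, TypeVar

W = TypeVar("W")
Point = Tuple[float, ...]
Rect = Sequence[Tuple[float, float]]  # one closed interval [lo, hi] per dimension


def contains(rect: Rect, p: Point) -> bool:
    """True if point p lies inside the axis-aligned rectangle rect."""
    return all(lo <= x <= hi for (lo, hi), x in zip(rect, p))


class NaiveRangeStructure(Generic[W]):
    """Brute-force range update / range sum over a commutative monoid."""

    def __init__(self, points: Sequence[Point], add: Callable[[W, W], W], zero: W):
        self.points: List[Point] = list(points)
        self.add = add      # the monoid operation (assumed commutative and associative)
        self.zero = zero    # the monoid identity
        # Assumption: every point starts with the identity as its weight.
        self.weights: List[W] = [zero] * len(self.points)

    def range_update(self, r_upd: Rect, delta: W) -> None:
        """Add delta to the weight of every point inside r_upd."""
        for i, p in enumerate(self.points):
            if contains(r_upd, p):
                self.weights[i] = self.add(self.weights[i], delta)

    def range_sum(self, r_qry: Rect) -> W:
        """Return the monoid sum of the weights of all points inside r_qry."""
        total = self.zero
        for p, w in zip(self.points, self.weights):
            if contains(r_qry, p):
                total = self.add(total, w)
        return total


# Usage with the monoid (integers, +, 0) on three hypothetical 2-D points.
s = NaiveRangeStructure([(1, 1), (2, 3), (5, 5)], add=lambda a, b: a + b, zero=0)
s.range_update([(0, 2), (0, 3)], 4)   # covers (1, 1) and (2, 3)
```

Because only `add` and `zero` are used, the same code works for any commutative monoid, for example `max` with identity `float("-inf")`; no subtraction or inverse is ever required.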

    On Differentially Private Counting on Trees

    We study the problem of performing counting queries at different levels in hierarchical structures while preserving individuals' privacy. Motivated by applications, we propose a new error measure for this problem by considering a combination of multiplicative and additive approximation to the query results. We examine known mechanisms in differential privacy (DP) and prove their optimality, under this measure, in the pure-DP setting. In the approximate-DP setting, we design new algorithms achieving significant improvements over known ones.

    XML Matchers: approaches and challenges

    Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in the Database and Artificial Intelligence research areas for many years. In the past, it was investigated mainly for classical database models (e.g., E/R schemas, relational databases, etc.). In recent years, however, the widespread adoption of XML across many application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, which aim at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called the XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template a baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers. Comment: 34 pages, 8 tables, 7 figures.