1,459 research outputs found
Managing Unbounded-Length Keys in Comparison-Driven Data Structures with Applications to On-Line Indexing
This paper presents a general technique for optimally transforming any
dynamic data structure that operates on atomic and indivisible keys by
constant-time comparisons, into a data structure that handles unbounded-length
keys whose comparison cost is not a constant. Examples of these keys are
strings, multi-dimensional points, multiple-precision numbers, multi-key data
(e.g., records), XML paths, URL addresses, etc. The technique is more general
than what has been done in previous work, as no particular exploitation of the
underlying structure is required. The only requirement is that the insertion
of a key must identify its predecessor or its successor.
Using the proposed technique, an online suffix tree can be constructed in
$O(\log n)$ worst-case time per input symbol (as opposed to amortized
$O(\log n)$ time per symbol, achieved by previously known algorithms). To our
knowledge, our algorithm is the first that achieves $O(\log n)$ worst-case time
per input symbol. Searching for a pattern of length $m$ in the resulting suffix
tree takes $O(\min(m \log |\Sigma|, m + \log n) + tocc)$ time, where $tocc$ is the
number of occurrences of the pattern. The paper also describes more
applications and shows how to obtain alternative methods for dealing with suffix
sorting, dynamic lowest common ancestors, and order maintenance.
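The requirement that an insertion identify the key's predecessor or successor can be illustrated with a minimal sketch. This is not the paper's technique: it uses a plain sorted Python list, and the names `lcp` and `insert_with_neighbors` are hypothetical. It only shows which information an insertion exposes: the two neighbors of the new key in sorted order and the longest common prefix (LCP) shared with each. In a sorted set of strings, no other key shares a longer prefix with the new key than one of these two neighbors does.

```python
import bisect


def lcp(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def insert_with_neighbors(keys: list, key: str):
    """Insert key into the sorted list keys (hypothetical helper).

    Returns (pred, succ, lcp_pred, lcp_succ), where pred and succ are the
    neighbors of key in sorted order (None if absent).
    """
    i = bisect.bisect_left(keys, key)
    pred = keys[i - 1] if i > 0 else None
    succ = keys[i] if i < len(keys) else None
    keys.insert(i, key)
    lcp_pred = lcp(key, pred) if pred is not None else 0
    lcp_succ = lcp(key, succ) if succ is not None else 0
    return pred, succ, lcp_pred, lcp_succ


keys = ["banana", "band", "can"]
result = insert_with_neighbors(keys, "bandit")
```

Here `result` holds the predecessor `"band"`, the successor `"can"`, and the two LCP lengths. Each character comparison in this sketch costs time proportional to the key length, which is exactly the cost the paper's transformation is designed to avoid.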
Data structures
We discuss data structures and their methods of analysis. In particular, we treat the unweighted and weighted dictionary problem, self-organizing data structures, persistent data structures, the union-find-split problem, priority queues, the nearest common ancestor problem, the selection and merging problem, and dynamization techniques. The methods of analysis are worst-case, average-case, and amortized.
Complexity of union-split-find problems
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 45-46). In this thesis, we investigate various interpretations of the Union-Split-Find problem, an extension of the classic Union-Find problem. In the Union-Split-Find problem, we maintain disjoint sets of ordered elements subject to the operations of constructing singleton sets, merging two sets together, splitting a set by partitioning it around a specified value, and finding the set that contains a given element. The different interpretations of this problem arise from the different assumptions made regarding when sets can be merged and any special properties the sets may have. We define and analyze the Interval, Cyclic, Ordered, and General Union-Split-Find problems. Previous work implies optimal solutions to the Interval and Ordered Union-Split-Find problems and an Ω(log n / log log n) lower bound for the Cyclic Union-Split-Find problem in the cell-probe model. We present a new data structure that achieves a matching upper bound of O(log n / log log n) for Cyclic Union-Split-Find in the word RAM model. For General Union-Split-Find, no o(n) bound is known. We present a data structure which has an Ω(log² n) amortized lower bound in the worst case, and which we conjecture has polylogarithmic amortized performance. This thesis is the product of joint work with Erik Demaine. by Katherine Jane Lai. M.Eng.
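The four operations named in the abstract can be made concrete with a deliberately naive sketch. This is not the thesis's data structure and makes no attempt at the stated bounds: union and split here take time linear in the set size or worse. The class name `NaiveUnionSplitFind`, its method names, and the convention that a split keeps elements below the pivot in the original set are all assumptions made for illustration.

```python
import bisect


class NaiveUnionSplitFind:
    """Naive General Union-Split-Find over comparable elements (sketch)."""

    def __init__(self):
        self._sets = {}    # set id -> sorted list of its elements
        self._owner = {}   # element -> id of the set containing it
        self._next_id = 0

    def make_set(self, x):
        """Create the singleton set {x} and return its id."""
        if x in self._owner:
            raise ValueError(f"{x!r} is already in a set")
        sid = self._next_id
        self._next_id += 1
        self._sets[sid] = [x]
        self._owner[x] = sid
        return sid

    def find(self, x):
        """Return the id of the set that contains x."""
        return self._owner[x]

    def union(self, a, b):
        """Merge sets a and b; return the id of the merged set."""
        if a == b:
            return a
        moved = self._sets.pop(b)
        for x in moved:
            self._owner[x] = a
        self._sets[a] = sorted(self._sets[a] + moved)
        return a

    def split(self, sid, value):
        """Partition set sid around value.

        Elements < value stay in sid; elements >= value move to a new set.
        Returns (sid, new_id), or (sid, None) if one side would be empty.
        """
        items = self._sets[sid]
        cut = bisect.bisect_left(items, value)
        if cut == 0 or cut == len(items):
            return sid, None
        new_id = self._next_id
        self._next_id += 1
        self._sets[sid], self._sets[new_id] = items[:cut], items[cut:]
        for x in self._sets[new_id]:
            self._owner[x] = new_id
        return sid, new_id

    def members(self, sid):
        """Return a sorted copy of the elements of set sid."""
        return list(self._sets[sid])


usf = NaiveUnionSplitFind()
ids = [usf.make_set(x) for x in (1, 5, 9, 3)]
whole = ids[0]
for other in ids[1:]:
    whole = usf.union(whole, other)
low, high = usf.split(whole, 5)
```

After the split, `low` contains 1 and 3 and `high` contains 5 and 9, so `find(1)` and `find(9)` return different ids.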
JanusAQP: Efficient Partition Tree Maintenance for Dynamic Approximate Query Processing
Approximate query processing over dynamic databases, i.e., under
insertions/deletions, has applications ranging from high-frequency trading to
internet-of-things analytics. We present JanusAQP, a new dynamic AQP system,
which supports SUM, COUNT, AVG, MIN, and MAX queries under insertions and
deletions to the dataset. JanusAQP extends static partition tree synopses,
which are hierarchical aggregations of datasets, into the dynamic setting. This
paper contributes new methods for: (1) efficient initialization of the data
synopsis in the presence of incoming data, (2) maintenance of the data synopsis
under insertions/deletions, and (3) re-optimization of the partitioning to
reduce the approximation error. JanusAQP reduces the error of a
state-of-the-art baseline by more than 60% using only 10% storage cost.
JanusAQP can process more than 100K updates per second in a single-node setting
and keep the query latency at the millisecond level.
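The idea of a partition tree synopsis, a hierarchy of aggregates that is maintained under insertions and deletions, can be sketched in a few lines. This is not JanusAQP's design: the fixed binary partitioning of a one-dimensional key range, the uniformity assumption inside a partially covered leaf, and the class name `PartitionNode` are all assumptions made for illustration. Only COUNT and SUM are kept per node; AVG follows as their ratio.

```python
class PartitionNode:
    """Node of a fixed binary partition tree over keys in [lo, hi)."""

    def __init__(self, lo, hi, depth):
        self.lo, self.hi = lo, hi
        self.count = 0      # number of tuples whose key falls in [lo, hi)
        self.total = 0.0    # sum of their values
        self.left = self.right = None
        if depth > 0:
            mid = (lo + hi) / 2
            self.left = PartitionNode(lo, mid, depth - 1)
            self.right = PartitionNode(mid, hi, depth - 1)

    def update(self, key, value, sign):
        """Apply an insertion (sign=+1) or deletion (sign=-1).

        Assumes lo <= key < hi; aggregates on the root-to-leaf path change.
        """
        self.count += sign
        self.total += sign * value
        if self.left is not None:
            child = self.left if key < self.left.hi else self.right
            child.update(key, value, sign)

    def estimate(self, qlo, qhi):
        """Estimate (COUNT, SUM) over keys in [qlo, qhi)."""
        if qhi <= self.lo or qlo >= self.hi:
            return 0.0, 0.0                      # no overlap
        if qlo <= self.lo and self.hi <= qhi:
            return float(self.count), self.total  # fully covered: exact
        if self.left is None:
            # Partially covered leaf: assume keys are spread uniformly.
            overlap = min(qhi, self.hi) - max(qlo, self.lo)
            frac = overlap / (self.hi - self.lo)
            return frac * self.count, frac * self.total
        lc, ls = self.left.estimate(qlo, qhi)
        rc, rs = self.right.estimate(qlo, qhi)
        return lc + rc, ls + rs


tree = PartitionNode(0, 8, 3)          # leaves cover unit-width key ranges
tree.update(1.5, 10.0, +1)
tree.update(2.5, 20.0, +1)
tree.update(6.2, 5.0, +1)
before = tree.estimate(0, 4)           # aligned with a node: exact
tree.update(2.5, 20.0, -1)             # delete one tuple
after = tree.estimate(0, 4)
partial = tree.estimate(6, 6.5)        # half of one leaf: approximate
```

Queries aligned with node boundaries are answered exactly from stored aggregates, while the query over half a leaf returns half of that leaf's count and sum, which is where the approximation error comes from. Re-optimizing the partitioning, as the abstract describes, would amount to choosing different node boundaries; this sketch keeps them fixed.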
Algorithms for self-healing networks
Many modern networks are reconfigurable, in the sense that the topology of the network can be changed by the nodes in the network. For example, peer-to-peer, wireless and ad-hoc networks are reconfigurable. More generally, many social networks, such as a company's organizational chart; infrastructure networks, such as an airline's transportation network; and biological networks, such as the human brain, are also reconfigurable. Modern reconfigurable networks have a complexity unprecedented in the history of engineering, resembling a dynamic and evolving living animal more than a structure of steel designed from a blueprint. Unfortunately, our mathematical and algorithmic tools have not yet developed enough to handle this complexity and fully exploit the flexibility of these networks. We believe that it is no longer possible to build networks that are scalable and never have node failures. Instead, these networks should be able to admit small, and maybe periodic, failures and still recover, like skin heals from a cut. This process, where the network can recover itself by maintaining key invariants in response to attack by a powerful adversary, is what we call self-healing. Here, we present several fast and provably good distributed algorithms for self-healing in reconfigurable dynamic networks. Each of these algorithms has different properties and a different set of guarantees and limitations. We also discuss future directions and theoretical questions we would like to answer.