Some results on tries with adaptive branching
We study a modification of digital trees (or tries) with adaptive multi-digit branching. Such tries can dynamically adjust degrees of their nodes by choosing the number of digits to be processed per lookup. While we do not specify any particular method for selecting the degrees of nodes, we assume that such selection can be accomplished by examining the number of strings remaining in each sub-tree, and/or estimating parameters of the input distribution. We call this class of digital trees adaptive multi-digit tries (or AMD-tries) and provide a preliminary analysis of their expected behavior in a memoryless model. We establish the following results: (1) there exist AMD-tries attaining a constant expected time of a successful search; (2) there exist AMD-tries consuming a linear (in the number of strings inserted) amount of space; (3) both constant search time and linear space usage can be attained if the (memoryless) source is symmetric. We accompany our analysis with a brief survey of several known types of adaptive trie structures, and show how our analysis extends (and/or complements) previous results.
Partial fillup and search time in LC tries
In 1993, Andersson and Nilsson introduced the level-compressed trie (LC trie
for short), in which a full subtree of a node is compressed to a single node
whose degree equals the size of the subtree. Recent experimental results indicated a
'dramatic improvement' when full subtrees are replaced by partially filled
subtrees. In this paper, we provide a theoretical justification of these
experimental results showing, among other things, a rather moderate improvement of
the search time over the original LC tries. For such an analysis, we assume
that n strings are generated independently by a binary memoryless source with p
denoting the probability of emitting a 1. We first prove that the so-called
alpha-fillup level (i.e., the largest level in a trie with alpha fraction of
nodes present at this level) is concentrated on two values with high
probability. We give these values explicitly up to O(1), and observe that the
value of alpha (strictly between 0 and 1) does not affect the leading term.
This result directly yields the typical depth (search time) in the alpha-LC
tries with p not equal to 1/2, which turns out to be C loglog n for an
explicitly given constant C (depending on p but not on alpha). This should be
compared with the recently found typical depth in the original LC tries, which is C'
loglog n for a larger constant C'. The search time in alpha-LC tries is thus
smaller but of the same order as in the original LC tries.
Comment: 13 pages
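The alpha-fillup level described in this abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the use of fixed-length strings, and the convention that a node exists at depth k whenever its parent prefix is shared by at least two strings are all assumptions made for illustration.

```python
import random
from collections import Counter


def generate_strings(n, p, length, seed=0):
    """Draw n binary strings from a memoryless source with P(bit = 1) = p.

    The fixed string length and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        "".join("1" if rng.random() < p else "0" for _ in range(length))
        for _ in range(n)
    ]


def nodes_at_level(strings, k):
    """Count trie nodes at depth k (the root is depth 0).

    Assumed convention: a node exists at depth k when its parent prefix
    (of length k - 1) is shared by at least two strings, i.e. the parent
    is an internal node that still has to branch.
    """
    if k == 0:
        return 1
    parent_counts = Counter(s[: k - 1] for s in strings)
    return len({s[:k] for s in strings if parent_counts[s[: k - 1]] >= 2})


def fillup_level(strings, alpha):
    """Largest depth k at which at least alpha * 2**k nodes are present."""
    max_len = min(len(s) for s in strings)
    best = 0
    for k in range(max_len + 1):
        if nodes_at_level(strings, k) >= alpha * 2**k:
            best = k
    return best


if __name__ == "__main__":
    sample = generate_strings(n=1000, p=0.3, length=64)
    for alpha in (0.25, 0.5, 1.0):
        print(alpha, fillup_level(sample, alpha))
```

Running the script prints the empirical fillup level for a few values of alpha on one random sample; the asymptotic claims of the abstract concern large n and are not reproduced by this toy.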
Fast Approximate Reconciliation of Set Differences
We present new, simple, efficient data structures for approximate reconciliation of set differences, a useful standalone primitive for peer-to-peer networks and a natural subroutine in methods for exact reconciliation. In the approximate reconciliation problem, peers A and B respectively have subsets of elements S_A and S_B of a large universe U. Peer A wishes to send a short message M to peer B with the goal that B should use M to determine as many elements in the set S_B − S_A as possible. To avoid the expense of round trip communication times, we focus on the situation where a single message M is sent.
We motivate the performance tradeoffs between message size, accuracy and computation time for this problem with a straightforward approach using Bloom filters. We then introduce approximate reconciliation trees, a more computationally efficient solution that combines techniques from Patricia tries, Merkle trees, and Bloom filters. We present an analysis of approximate reconciliation trees and provide experimental results comparing the various methods proposed for approximate reconciliation.
National Science Foundation (ANI-0093296, ANI-9986397, CCR-0118701, CCR-0121154); Alfred P. Sloan Research Fellowship
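The straightforward Bloom filter approach mentioned in this abstract might look roughly like the following sketch. It covers only that baseline, not the reconciliation trees, and it is not the authors' code: the class, the function names, the SHA-256 based hashing, and the filter parameters are all illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter (hypothetical helper, not from the paper)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def build_message(s_a, num_bits, num_hashes):
    """Peer A: summarize its set in a single message M (here, a Bloom filter)."""
    message = BloomFilter(num_bits, num_hashes)
    for item in s_a:
        message.add(item)
    return message


def approximate_difference(message, s_b):
    """Peer B: elements of its set that the filter says A does not hold.

    Every reported element is truly outside A's set; false positives of the
    filter can only cause some elements of the difference to be missed.
    """
    return {item for item in s_b if item not in message}


if __name__ == "__main__":
    s_a = set(range(0, 1000))
    s_b = set(range(500, 1500))
    found = approximate_difference(build_message(s_a, 8000, 5), s_b)
    print(len(found), "of", len(s_b - s_a), "differing elements identified")
```

A larger filter lowers the false-positive rate and so recovers more of the difference, at the cost of a longer message; this is the size versus accuracy tradeoff the abstract refers to.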
Symmetry-Based Search Space Reduction For Grid Maps
In this paper we explore a symmetry-based search space reduction technique
which can speed up optimal pathfinding on undirected uniform-cost grid maps by
up to 38 times. Our technique decomposes grid maps into a set of empty
rectangles, removing from each rectangle all interior nodes and possibly some
from along the perimeter. We then add a series of macro-edges between selected
pairs of remaining perimeter nodes to facilitate provably optimal traversal
through each rectangle. We also develop a novel online pruning technique to
further speed up search. Our algorithm is fast, memory efficient and retains
the same optimality and completeness guarantees as searching on an unmodified
grid map.
BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures
We introduce BriskStream, an in-memory data stream processing system (DSPS)
specifically designed for modern shared-memory multicore architectures.
BriskStream's key contribution is an execution plan optimization paradigm,
namely RLAS, which takes relative-location (i.e., NUMA distance) of each pair
of producer-consumer operators into consideration. We propose a branch and
bound based approach with three heuristics to resolve the resulting nontrivial
optimization problem. The experimental evaluations demonstrate that BriskStream
yields much higher throughput and better scalability than existing DSPSs on
multi-core architectures when processing different types of workloads.
Comment: To appear in SIGMOD'1