1,125 research outputs found
Memory-Adjustable Navigation Piles with Applications to Sorting and Convex Hulls
We consider space-bounded computations on a random-access machine (RAM) where
the input is given on a read-only random-access medium, the output is to be
produced to a write-only sequential-access medium, and the available workspace
allows random reads and writes but is of limited capacity. The length of the
input is elements, the length of the output is limited by the computation,
and the capacity of the workspace is bits for some predetermined
parameter . We present a state-of-the-art priority queue---called an
adjustable navigation pile---for this restricted RAM model. Under some
reasonable assumptions, our priority queue supports and
in worst-case time and in worst-case time for any . We show how to use this
data structure to sort elements and to compute the convex hull of
points in the two-dimensional Euclidean space in
worst-case time for any . Following a known lower bound for the
space-time product of any branching program for finding unique elements, both
our sorting and convex-hull algorithms are optimal. The adjustable navigation
pile has turned out to be useful when designing other space-efficient
algorithms, and we expect that it will find its way to yet other applications.Comment: 21 page
Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks
We show that randomization can lead to significant improvements for a few
fundamental problems in distributed tracking. Our basis is the {\em
count-tracking} problem, where there are players, each holding a counter
that gets incremented over time, and the goal is to track an
\eps-approximation of their sum continuously at all times,
using minimum communication. While the deterministic communication complexity
of the problem is \Theta(k/\eps \cdot \log N), where is the final value
of when the tracking finishes, we show that with randomization, the
communication cost can be reduced to \Theta(\sqrt{k}/\eps \cdot \log N). Our
algorithm is simple and uses only O(1) space at each player, while the lower
bound holds even assuming each player has infinite computing power. Then, we
extend our techniques to two related distributed tracking problems: {\em
frequency-tracking} and {\em rank-tracking}, and obtain similar improvements
over previous deterministic algorithms. Both problems are of central importance
in large data monitoring and analysis, and have been extensively studied in the
literature.Comment: 19 pages, 1 figur
Improved Algorithms for Time Decay Streams
In the time-decay model for data streams, elements of an underlying data set arrive sequentially with the recently arrived elements being more important. A common approach for handling large data sets is to maintain a coreset, a succinct summary of the processed data that allows approximate recovery of a predetermined query. We provide a general framework that takes any offline-coreset and gives a time-decay coreset for polynomial time decay functions.
We also consider the exponential time decay model for k-median clustering, where we provide a constant factor approximation algorithm that utilizes the online facility location algorithm. Our algorithm stores O(k log(h Delta)+h) points where h is the half-life of the decay function and Delta is the aspect ratio of the dataset. Our techniques extend to k-means clustering and M-estimators as well
- …