7 research outputs found
Efficient pebbling for list traversal synopses
We show how to support efficient back traversal in a unidirectional list,
using small memory and with essentially no slowdown in forward steps. Using
memory for a list of size , the 'th back-step from the
farthest point reached so far takes time in the worst case, while
the overhead per forward step is at most for arbitrary small
constant . An arbitrary sequence of forward and back steps is
allowed. A full trade-off between memory usage and time per back-step is
presented: vs. and vice versa. Our algorithms are based on a
novel pebbling technique which moves pebbles on a virtual binary, or -ary,
tree that can only be traversed in a pre-order fashion. The compact data
structures used by the pebbling algorithms, called list traversal synopses,
extend to general directed graphs, and have other interesting applications,
including memory efficient hash-chain implementation. Perhaps the most
surprising application is in showing that for any program, arbitrary rollback
steps can be efficiently supported with small overhead in memory, and marginal
overhead in its ordinary execution. More concretely: Let be a program that
runs for at most steps, using memory of size . Then, at the cost of
recording the input used by the program, and increasing the memory by a factor
of to , the program can be extended to support an
arbitrary sequence of forward execution and rollback steps: the 'th rollback
step takes time in the worst case, while forward steps take O(1)
time in the worst case, and amortized time per step.Comment: 27 page
Practice of streaming processing of dynamic graphs: concepts, models, and systems
http://arxiv.org/abs/1912.12740Published versio
Approximating Properties of Data Streams
In this dissertation, we present algorithms that approximate properties in the data stream model, where elements of an underlying data set arrive sequentially, but algorithms must use space sublinear in the size of the underlying data set. We first study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give algorithms to compute the k-periods of a string S using poly(k, log n) bits of space and we complement these results with comparable lower bounds. We then study the problem of identifying a longest substring of strings S and T of length n that forms a d-near-alignment under the edit distance, in the simultaneous streaming model. In this model, symbols of strings S and T are streamed at the same time and form a d-near-alignment if the distance between them in some given metric is at most d. We give several algorithms, including an exact one-pass algorithm that uses O(d2 + d log n) bits of space. We then consider the distinct elements and `p-heavy hitters problems in the sliding window model, where only the most recent n elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential (Datar et al., SODA 2002) and smooth histograms (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram along with a careful combination of existing techniques to track either the identity or frequency of a few specific items suffices to obtain algorithms for both distinct elements and `p-heavy hitters that is nearly optimal in both n and c. Finally, we consider the problem of estimating the maximum weighted matching of a graph whose edges are revealed in a streaming fashion. We develop a reduction from the maximum weighted matching problem to the maximum cardinality matching problem that only doubles the approximation factor of a streaming algorithm developed for the maximum cardinality matching problem. As an application, we obtain an estimator for the weight of a maximum weighted matching in bounded-arboricity graphs and in particular, a (48 + )-approximation estimator for the weight of a maximum weighted matching in planar graphs