Self-Improving Algorithms
We investigate ways in which an algorithm can improve its expected
performance by fine-tuning itself automatically with respect to an unknown
input distribution D. We assume here that D is of product type. More precisely,
suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1,
x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently
from some arbitrary, unknown distribution D_i. The goal is to design an
algorithm for these inputs so that eventually the expected running time will be
optimal for the input distribution D = D_1 * D_2 * ... * D_n.
We give such self-improving algorithms for two problems: (i) sorting a
sequence of numbers and (ii) computing the Delaunay triangulation of a planar
point set. Both algorithms achieve optimal expected limiting complexity. The
algorithms begin with a training phase during which they collect information
about the input distribution, followed by a stationary regime in which the
algorithms settle to their optimized incarnations.
Comment: 26 pages, 8 figures, preliminary versions appeared at SODA 2006 and SoCG 2008. Thorough revision to improve the presentation of the paper.
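The two-phase structure (a training phase, then a stationary regime) can be sketched with a toy sorter that learns approximate quantiles from early instances and then bucket-sorts later ones. This is a simplified illustration under invented names and parameters, not the paper's construction, which learns a separate search structure per coordinate and matches the entropy lower bound:

```python
from bisect import bisect_left

class SelfImprovingSorter:
    """Toy two-phase sorter: a training phase collects samples and learns
    approximate quantiles; the stationary regime bucket-sorts each instance
    using them. A hypothetical simplification of the self-improving scheme."""

    def __init__(self, n_buckets=8, training_rounds=10):
        self.n_buckets = n_buckets
        self.training_rounds = training_rounds
        self.samples = []
        self.boundaries = None  # learned bucket boundaries (quantiles)

    def sort(self, xs):
        if self.boundaries is None:
            # training phase: remember inputs, fall back to comparison sort
            self.samples.extend(xs)
            if len(self.samples) >= self.training_rounds * len(xs):
                self.samples.sort()
                step = max(1, len(self.samples) // self.n_buckets)
                self.boundaries = self.samples[step::step][: self.n_buckets - 1]
            return sorted(xs)
        # stationary regime: place items by learned quantiles, sort buckets
        buckets = [[] for _ in range(len(self.boundaries) + 1)]
        for x in xs:
            buckets[bisect_left(self.boundaries, x)].append(x)
        out = []
        for b in buckets:
            b.sort()
            out.extend(b)
        return out
```

On inputs whose distribution concentrates each coordinate, the stationary regime approaches linear time; the actual algorithm's guarantee is the stronger entropy-optimal bound, which this sketch does not attain.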
Self-improving Algorithms for Coordinate-wise Maxima
Computing the coordinate-wise maxima of a planar point set is a classic and
well-studied problem in computational geometry. We give an algorithm for this
problem in the \emph{self-improving setting}. We have (unknown) independent
distributions \cD_1, \cD_2, ..., \cD_n of planar points. An input point set
is generated by taking an independent sample from
each \cD_i, so the input distribution \cD is the product \prod_i \cD_i. A
self-improving algorithm repeatedly gets input sets from the distribution \cD
(which is \emph{a priori} unknown) and tries to optimize its running time for
\cD. Our algorithm uses the first few inputs to learn salient features of the
distribution, and then becomes an optimal algorithm for distribution \cD. Let
\OPT_\cD denote the expected depth of an \emph{optimal} linear comparison
tree computing the maxima for distribution \cD. Our algorithm eventually
achieves an expected running time of O(\OPT_\cD + n), even though it did not
know \cD to begin with.
Our result requires new tools to understand linear comparison trees for
computing maxima. We show how to convert general linear comparison trees to
very restricted versions, which can then be related to the running time of our
algorithm. An interesting feature of our algorithm is an interleaved search,
where the algorithm tries to determine the likeliest point to be maximal with
minimal computation. This allows the running time to be truly optimal for the
distribution \cD.
Comment: To appear in Symposium on Computational Geometry 2012 (17 pages, 2 figures).
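For reference, the non-adaptive baseline that the self-improving algorithm competes with: the coordinate-wise maxima (the "staircase") of a planar point set can be computed in O(n log n) by a right-to-left sweep. This is the standard textbook routine, included only to fix the problem being solved:

```python
def maxima(points):
    """Coordinate-wise maxima: points not dominated in both x and y by
    any other point. Classic O(n log n) sweep; the self-improving
    algorithm beats this on favorable distributions."""
    best_y = float("-inf")
    out = []
    # scan right-to-left: a point is maximal iff its y-coordinate exceeds
    # that of every point with a larger x-coordinate seen so far
    for p in sorted(points, reverse=True):
        if p[1] > best_y:
            out.append(p)
            best_y = p[1]
    out.reverse()
    return out
```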
A Generalization of Self-Improving Algorithms
Ailon et al. [SICOMP'11] proposed self-improving algorithms for sorting and
Delaunay triangulation (DT) when the input instances x_1, ..., x_n follow
some unknown \emph{product distribution}. That is, x_i comes from a fixed
unknown distribution \cD_i, and the x_i's are drawn independently.
After spending O(n^{1+\varepsilon}) time in a learning phase, the subsequent
expected running time is O((n + H)/\varepsilon), where H \in \{H_S, H_{DT}\},
and H_S and H_{DT} are the entropies of the distributions of the sorting and
DT output, respectively. In this paper, we allow dependence among the x_i's
under the \emph{group product distribution}. There is a hidden partition of
[1..n] into groups; the x_i's in the k-th group are fixed unknown functions
of the same hidden variable u_k; and the u_k's are drawn from an unknown
product distribution. We describe self-improving algorithms for sorting and
DT under this model when the functions that map u_k to the x_i's are
well-behaved. After an O(\mathrm{poly}(n))-time training phase, we achieve
O(n + H_S) and O(n\alpha(n) + H_{DT}) expected running times for sorting and
DT, respectively, where \alpha(n) is the inverse Ackermann function.
Learning to Prune: Speeding up Repeated Computations
It is common to encounter situations where one must solve a sequence of
similar computational problems. Running a standard algorithm with worst-case
runtime guarantees on each instance will fail to take advantage of valuable
structure shared across the problem instances. For example, when a commuter
drives from work to home, there are typically only a handful of routes that
will ever be the shortest path. A naive algorithm that does not exploit this
common structure may spend most of its time checking roads that will never be
in the shortest path. More generally, we can often ignore large swaths of the
search space that will likely never contain an optimal solution.
We present an algorithm that learns to maximally prune the search space on
repeated computations, thereby reducing runtime while provably outputting the
correct solution each period with high probability. Our algorithm employs a
simple explore-exploit technique resembling those used in online algorithms,
though our setting is quite different. We prove that, with respect to our model
of pruning search spaces, our approach is optimal up to constant factors.
Finally, we illustrate the applicability of our model and algorithm to three
classic problems: shortest-path routing, string search, and linear programming.
We present experiments confirming that our simple algorithm is effective at
significantly reducing the runtime of solving repeated computations.
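The explore-exploit idea can be sketched as follows: remember every element that has ever appeared in an optimal solution, usually search only within that remembered set, and occasionally run the full search, which both preserves correctness with high probability and grows the pruned set. The names here (`repeated_solve`, `explore_prob`) are illustrative, not from the paper, which additionally analyzes how to set the exploration rate optimally:

```python
import random

def repeated_solve(instances, solver, explore_prob=0.1, seed=0):
    """Sketch of learn-to-prune (illustrative, not the paper's algorithm).

    `solver(inst, restrict)` returns an optimal solution of `inst`,
    searching only among elements of `restrict` when it is not None.
    """
    rng = random.Random(seed)
    support = set()   # elements seen in some optimal solution so far
    answers = []
    for inst in instances:
        if not support or rng.random() < explore_prob:
            sol = solver(inst, restrict=None)       # explore: full search
        else:
            sol = solver(inst, restrict=support)    # exploit: pruned search
        support |= set(sol)
        answers.append(sol)
    return answers
```

In the commuting example, the elements would be road segments and `solver` a shortest-path routine; after a few explorations, `support` contains the handful of roads that ever lie on an optimal route.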
A Static Optimality Transformation with Applications to Planar Point Location
Over the last decade, there have been several data structures that, given a
planar subdivision and a probability distribution over the plane, provide a way
for answering point location queries that is fine-tuned for the distribution.
All these methods suffer from the requirement that the query distribution must
be known in advance.
We present a new data structure for point location queries in planar
triangulations. Our structure is asymptotically as fast as the optimal
structures, but it requires no prior information about the queries. This is a
2D analogue of the jump from Knuth's optimum binary search trees (discovered in
1971) to the splay trees of Sleator and Tarjan in 1985. While the former need
to know the query distribution, the latter are statically optimal. This means
that we can adapt to the query sequence and achieve the same asymptotic
performance as an optimum static structure, without needing any additional
information.
Comment: 13 pages, 1 figure, a preliminary version appeared at SoCG 201
Unions of Onions: Preprocessing Imprecise Points for Fast Onion Decomposition
Let D be a set of n pairwise disjoint unit disks in the plane.
We describe how to build a data structure for D so that for any
point set P containing exactly one point from each disk, we can quickly find
the onion decomposition (convex layers) of P.
Our data structure can be built in O(n \log n) time and has linear size.
Given P, we can find its onion decomposition in O(n \log k) time, where k
is the number of layers. We also provide a matching lower bound. Our solution
is based on a recursive space decomposition, combined with a fast algorithm to
compute the union of two disjoint onions.
Comment: 10 pages, 5 figures; a preliminary version appeared at WADS 201
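The object being preprocessed can be illustrated by the naive peeling routine: repeatedly remove the convex hull of the remaining points. This straightforward version takes O(n^2) time in the worst case (Chazelle computes all layers in O(n log n)); the point of the paper's data structure is to beat the per-instance cost by exploiting the disk structure. The code below is a plain baseline, not the paper's method:

```python
def onion_decomposition(points):
    """Convex layers by repeated hull peeling (naive O(n^2) baseline)."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def hull(pts):
        # Andrew's monotone chain; collinear points are left for inner layers
        pts = sorted(set(pts))
        if len(pts) <= 2:
            return pts
        def half(seq):
            h = []
            for p in seq:
                while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                    h.pop()
                h.append(p)
            return h
        lower, upper = half(pts), half(reversed(pts))
        return lower[:-1] + upper[:-1]

    layers, pts = [], list(points)
    while pts:
        h = hull(pts)
        layers.append(h)
        on_hull = set(h)
        pts = [p for p in pts if p not in on_hull]
    return layers
```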