58 research outputs found

    Self-Improving Algorithms

    We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an unknown input distribution D. We assume here that D is of product type. More precisely, suppose that we need to process a sequence I_1, I_2, ... of inputs I = (x_1, x_2, ..., x_n) of some fixed length n, where each x_i is drawn independently from some arbitrary, unknown distribution D_i. The goal is to design an algorithm for these inputs so that eventually the expected running time will be optimal for the input distribution D = D_1 * D_2 * ... * D_n. We give such self-improving algorithms for two problems: (i) sorting a sequence of numbers and (ii) computing the Delaunay triangulation of a planar point set. Both algorithms achieve optimal expected limiting complexity. The algorithms begin with a training phase during which they collect information about the input distribution, followed by a stationary regime in which the algorithms settle to their optimized incarnations. (Comment: 26 pages, 8 figures; preliminary versions appeared at SODA 2006 and SoCG 2008. Thorough revision to improve the presentation of the paper.)
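    To make the training-then-stationary structure concrete, here is a minimal Python sketch of a bucket-based self-improving sorter. It pools a few training inputs to learn bucket boundaries (a simplified stand-in for the paper's V-list and per-position search structures), then sorts later inputs by bucketing. The class name, training budget, and bucket construction are illustrative assumptions, not the paper's exact construction.

    ```python
    import bisect

    class SelfImprovingSorter:
        """Sketch: learn bucket boundaries during a training phase, then
        sort by bucketing in the stationary regime. Illustrative only."""

        def __init__(self, n, training_rounds=10):
            self.n = n
            self.training_rounds = training_rounds
            self.boundaries = None   # learned at the end of training
            self._pool = []

        def sort(self, xs):
            if self.boundaries is None:            # training phase
                self._pool.extend(xs)
                if len(self._pool) >= self.training_rounds * self.n:
                    merged = sorted(self._pool)
                    # keep ~n evenly spaced sample quantiles as boundaries
                    step = max(1, len(merged) // self.n)
                    self.boundaries = merged[::step]
                    self._pool = []
                return sorted(xs)                  # fall back for now
            # stationary regime: locate each x_i in its bucket, then
            # sort the (expected small) buckets cheaply and concatenate
            buckets = [[] for _ in range(len(self.boundaries) + 1)]
            for x in xs:
                buckets[bisect.bisect_left(self.boundaries, x)].append(x)
            out = []
            for b in buckets:
                out.extend(sorted(b))
            return out
    ```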

    Self-improving Algorithms for Coordinate-wise Maxima

    Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the \emph{self-improving setting}. We have $n$ (unknown) independent distributions $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$ of planar points. An input point set $(p_1, p_2, \ldots, p_n)$ is generated by taking an independent sample $p_i$ from each $\mathcal{D}_i$, so the input distribution $\mathcal{D}$ is the product $\prod_i \mathcal{D}_i$. A self-improving algorithm repeatedly gets input sets from the distribution $\mathcal{D}$ (which is \emph{a priori} unknown) and tries to optimize its running time for $\mathcal{D}$. Our algorithm uses the first few inputs to learn salient features of the distribution, and then becomes an optimal algorithm for distribution $\mathcal{D}$. Let $\mathrm{OPT}_{\mathcal{D}}$ denote the expected depth of an \emph{optimal} linear comparison tree computing the maxima for distribution $\mathcal{D}$. Our algorithm eventually has an expected running time of $O(\mathrm{OPT}_{\mathcal{D}} + n)$, even though it did not know $\mathcal{D}$ to begin with. Our result requires new tools to understand linear comparison trees for computing maxima. We show how to convert general linear comparison trees to very restricted versions, which can then be related to the running time of our algorithm. An interesting feature of our algorithm is an interleaved search, where the algorithm tries to determine the likeliest point to be maximal with minimal computation. This allows the running time to be truly optimal for the distribution $\mathcal{D}$. (Comment: To appear in Symposium on Computational Geometry 2012; 17 pages, 2 figures.)
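    As background, a point is maximal when no other input point has both a larger x- and a larger y-coordinate, and the standard non-self-improving baseline is a sort followed by a right-to-left sweep. The Python sketch below shows that baseline; the paper's algorithm improves on it by replacing the full sort with learned, distribution-aware searches.

    ```python
    def coordinatewise_maxima(points):
        """Baseline staircase computation: scan points by decreasing x and
        keep each point whose y beats everything seen so far; those are
        exactly the points not dominated in both coordinates."""
        best_y = float("-inf")
        maxima = []
        for x, y in sorted(points, reverse=True):
            if y > best_y:
                maxima.append((x, y))
                best_y = y
        return maxima

    # coordinatewise_maxima([(1, 3), (2, 2), (3, 1), (0, 0)])
    # -> [(3, 1), (2, 2), (1, 3)]
    ```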

    A Generalization of Self-Improving Algorithms

    Ailon et al. [SICOMP'11] proposed self-improving algorithms for sorting and Delaunay triangulation (DT) when the input instances $x_1, \cdots, x_n$ follow some unknown \emph{product distribution}. That is, $x_i$ comes from a fixed unknown distribution $\mathsf{D}_i$, and the $x_i$'s are drawn independently. After spending $O(n^{1+\varepsilon})$ time in a learning phase, the subsequent expected running time is $O((n + H)/\varepsilon)$, where $H \in \{H_\mathrm{S}, H_\mathrm{DT}\}$, and $H_\mathrm{S}$ and $H_\mathrm{DT}$ are the entropies of the distributions of the sorting and DT output, respectively. In this paper, we allow dependence among the $x_i$'s under the \emph{group product distribution}. There is a hidden partition of $[1, n]$ into groups; the $x_i$'s in the $k$-th group are fixed unknown functions of the same hidden variable $u_k$; and the $u_k$'s are drawn from an unknown product distribution. We describe self-improving algorithms for sorting and DT under this model when the functions that map $u_k$ to $x_i$'s are well-behaved. After an $O(\mathrm{poly}(n))$-time training phase, we achieve $O(n + H_\mathrm{S})$ and $O(n\alpha(n) + H_\mathrm{DT})$ expected running times for sorting and DT, respectively, where $\alpha(\cdot)$ is the inverse Ackermann function.
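    As a concrete reading of the model, the sketch below generates one input under a group product distribution: one hidden value per group is drawn independently, and each coordinate is a fixed function of its group's value. All names and the example mappings are hypothetical, for illustration only.

    ```python
    import random

    def sample_group_product(group_of, maps, sample_u):
        """group_of[i]: hidden group k of coordinate i; maps[i]: the fixed
        (unknown to the algorithm) function u_k -> x_i; sample_u: draws one
        u_k per group, independently across groups."""
        u = {k: sample_u() for k in set(group_of)}
        return [maps[i](u[group_of[i]]) for i in range(len(group_of))]

    # Coordinates 0 and 1 move together (group 0); coordinate 2 is independent.
    xs = sample_group_product(
        group_of=[0, 0, 1],
        maps=[lambda u: u, lambda u: 2 * u + 1, lambda u: -u],
        sample_u=random.random,
    )
    ```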

    Self-improving Algorithms for Convex Hulls


    Learning to Prune: Speeding up Repeated Computations

    It is common to encounter situations where one must solve a sequence of similar computational problems. Running a standard algorithm with worst-case runtime guarantees on each instance will fail to take advantage of valuable structure shared across the problem instances. For example, when a commuter drives from work to home, there are typically only a handful of routes that will ever be the shortest path. A naive algorithm that does not exploit this common structure may spend most of its time checking roads that will never be on the shortest path. More generally, we can often ignore large swaths of the search space that will likely never contain an optimal solution. We present an algorithm that learns to maximally prune the search space on repeated computations, thereby reducing runtime while provably outputting the correct solution each period with high probability. Our algorithm employs a simple explore-exploit technique resembling those used in online algorithms, though our setting is quite different. We prove that, with respect to our model of pruning search spaces, our approach is optimal up to constant factors. Finally, we illustrate the applicability of our model and algorithm to three classic problems: shortest-path routing, string search, and linear programming. We present experiments confirming that our simple algorithm is effective at significantly reducing the runtime of solving repeated computations.
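    A minimal sketch of the explore-exploit idea, under assumed interfaces: full_solve(inst) searches the whole space and returns the parts of an optimal solution (e.g., the edges of a shortest path), while restricted_solve(inst, kept) searches only the learned subset and returns None when that subset cannot certify an answer. The exploration rate and the interfaces are placeholders, not the paper's exact algorithm.

    ```python
    import random

    def learn_to_prune(instances, full_solve, restricted_solve, eps=0.1):
        kept = set()                                # learned slice of the search space
        for inst in instances:
            if random.random() < eps or not kept:
                sol = full_solve(inst)              # explore: slow but exact
                kept.update(sol)                    # remember what optima used
            else:
                sol = restricted_solve(inst, kept)  # exploit: search only `kept`
                if sol is None:                     # restricted search failed,
                    sol = full_solve(inst)          # so fall back to stay correct
                    kept.update(sol)
            yield sol
    ```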

    A Static Optimality Transformation with Applications to Planar Point Location

    Over the last decade, there have been several data structures that, given a planar subdivision and a probability distribution over the plane, provide a way of answering point location queries that is fine-tuned for the distribution. All these methods suffer from the requirement that the query distribution must be known in advance. We present a new data structure for point location queries in planar triangulations. Our structure is asymptotically as fast as the optimal structures, but it requires no prior information about the queries. This is a 2D analogue of the jump from Knuth's optimum binary search trees (discovered in 1971) to the splay trees of Sleator and Tarjan in 1985. While the former need to know the query distribution, the latter are statically optimal. This means that we can adapt to the query sequence and achieve the same asymptotic performance as an optimum static structure, without needing any additional information. (Comment: 13 pages, 1 figure; a preliminary version appeared at SoCG 2011.)
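    A one-dimensional caricature of static optimality, under an assumed rebuild-by-counting scheme (not the paper's actual structure): count how often each cell of a subdivision of the line is queried and periodically rebuild the search tree around weighted medians, so frequently queried cells end up near the root, as an optimal static tree for the observed distribution would place them.

    ```python
    class AdaptiveLocator1D:
        """Locate a query among cells split by sorted boundaries; cell j is
        [b[j-1], b[j]). The tree is rebuilt on a doubling schedule so its
        shape tracks the observed query frequencies. Illustrative only."""

        def __init__(self, boundaries):
            self.b = sorted(boundaries)
            self.hits = [1] * (len(self.b) + 1)   # smoothed per-cell counts
            self.n_queries = 0
            self.root = self._build(0, len(self.b) + 1)

        def _build(self, lo, hi):
            if hi - lo <= 1:
                return ("leaf", lo)
            total = sum(self.hits[lo:hi])
            acc, mid = 0, lo                       # split at weighted median
            for j in range(lo, hi - 1):
                acc += self.hits[j]
                mid = j + 1
                if 2 * acc >= total:
                    break
            return ("node", mid, self._build(lo, mid), self._build(mid, hi))

        def locate(self, q):
            node = self.root
            while node[0] == "node":
                _, mid, left, right = node
                node = left if q < self.b[mid - 1] else right
            cell = node[1]
            self.hits[cell] += 1
            self.n_queries += 1
            if self.n_queries & (self.n_queries - 1) == 0:  # powers of two
                self.root = self._build(0, len(self.b) + 1)
            return cell
    ```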

    Unions of Onions: Preprocessing Imprecise Points for Fast Onion Decomposition

    Let $\mathcal{D}$ be a set of $n$ pairwise disjoint unit disks in the plane. We describe how to build a data structure for $\mathcal{D}$ so that for any point set $P$ containing exactly one point from each disk, we can quickly find the onion decomposition (convex layers) of $P$. Our data structure can be built in $O(n \log n)$ time and has linear size. Given $P$, we can find its onion decomposition in $O(n \log k)$ time, where $k$ is the number of layers. We also provide a matching lower bound. Our solution is based on a recursive space decomposition, combined with a fast algorithm to compute the union of two disjoint onions. (Comment: 10 pages, 5 figures; a preliminary version appeared at WADS 2013.)
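    For reference, the onion decomposition itself can be computed naively by repeated hull peeling, as in the Python sketch below (Andrew's monotone chain for each layer). The data structure above exists precisely to avoid redoing this work from scratch for every instantiation of the points; the sketch is a baseline, not the paper's algorithm.

    ```python
    def cross(o, a, b):
        """Signed area of the triangle (o, a, b); > 0 means a left turn."""
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def convex_hull(pts):
        """Andrew's monotone chain; returns the hull in counterclockwise order."""
        pts = sorted(set(pts))
        if len(pts) <= 2:
            return pts
        def half_hull(seq):
            h = []
            for p in seq:
                while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                    h.pop()
                h.append(p)
            return h
        lower, upper = half_hull(pts), half_hull(pts[::-1])
        return lower[:-1] + upper[:-1]

    def onion_decomposition(points):
        """Peel convex layers until no points remain."""
        layers, rest = [], list(points)
        while rest:
            layer = convex_hull(rest)
            layers.append(layer)
            on_layer = set(layer)
            rest = [p for p in rest if p not in on_layer]
        return layers
    ```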