901 research outputs found

    Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

    Get PDF
    We present a technical survey on the state of the art approaches in data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview on lower bounding techniques

    \v{C}ech-Delaunay gradient flow and homology inference for self-maps

    Full text link
    We call a continuous self-map that reveals itself through a discrete set of point-value pairs a sampled dynamical system. Capturing the available information with chain maps on Delaunay complexes, we use persistent homology to quantify the evidence of recurrent behavior. We establish a sampling theorem to recover the eigenspace of the endomorphism on homology induced by the self-map. Using a combinatorial gradient flow arising from the discrete Morse theory for \v{C}ech and Delaunay complexes, we construct a chain map to transform the problem from the natural but expensive \v{C}ech complexes to the computationally efficient Delaunay triangulations. The fast chain map algorithm has applications beyond dynamical systems.Comment: 22 pages, 8 figure

    A Sub-Linear Time Framework for Geometric Optimization with Outliers in High Dimensions

    Get PDF
    Many real-world problems can be formulated as geometric optimization problems in high dimensions, especially in the fields of machine learning and data mining. Moreover, we often need to take into account of outliers when optimizing the objective functions. However, the presence of outliers could make the problems to be much more challenging than their vanilla versions. In this paper, we study the fundamental minimum enclosing ball (MEB) with outliers problem first; partly inspired by the core-set method from B?doiu and Clarkson, we propose a sub-linear time bi-criteria approximation algorithm based on two novel techniques, the Uniform-Adaptive Sampling method and Sandwich Lemma. To the best of our knowledge, our result is the first sub-linear time algorithm, which has the sample size (i.e., the number of sampled points) independent of both the number of input points n and dimensionality d, for MEB with outliers in high dimensions. Furthermore, we observe that these two techniques can be generalized to deal with a broader range of geometric optimization problems with outliers in high dimensions, including flat fitting, k-center clustering, and SVM with outliers, and therefore achieve the sub-linear time algorithms for these problems respectively
    • …
    corecore