    Self-improving Algorithms for Coordinate-wise Maxima

    Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the \emph{self-improving setting}. We have $n$ (unknown) independent distributions $\mathcal{D}_1, \mathcal{D}_2, \ldots, \mathcal{D}_n$ of planar points. An input point set $(p_1, p_2, \ldots, p_n)$ is generated by taking an independent sample $p_i$ from each $\mathcal{D}_i$, so the input distribution $\mathcal{D}$ is the product $\prod_i \mathcal{D}_i$. A self-improving algorithm repeatedly gets input sets from the distribution $\mathcal{D}$ (which is \emph{a priori} unknown) and tries to optimize its running time for $\mathcal{D}$. Our algorithm uses the first few inputs to learn salient features of the distribution, and then becomes an optimal algorithm for distribution $\mathcal{D}$. Let $\mathrm{OPT}_{\mathcal{D}}$ denote the expected depth of an \emph{optimal} linear comparison tree computing the maxima for distribution $\mathcal{D}$. Our algorithm eventually has an expected running time of $O(\mathrm{OPT}_{\mathcal{D}} + n)$, even though it did not know $\mathcal{D}$ to begin with. Our result requires new tools to understand linear comparison trees for computing maxima. We show how to convert general linear comparison trees to very restricted versions, which can then be related to the running time of our algorithm. An interesting feature of our algorithm is an interleaved search, where the algorithm tries to determine the likeliest point to be maximal with minimal computation. This allows the running time to be truly optimal for the distribution $\mathcal{D}$.
    Comment: To appear in Symposium on Computational Geometry 2012 (17 pages, 2 figures).
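    The problem being sped up is the classic "staircase" computation. For reference (and emphatically not the paper's self-improving algorithm), a minimal Python sketch of the standard $O(n \log n)$ sort-and-scan baseline that the learned algorithm improves upon:

```python
# Classic coordinate-wise maxima ("staircase") of a planar point set.
# A point is maximal iff no other point dominates it in both coordinates.
def coordinate_wise_maxima(points):
    maxima = []
    best_y = float("-inf")
    # Scan from largest x to smallest: a point is maximal iff its y
    # strictly exceeds the y of every point with a larger x.
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima

print(coordinate_wise_maxima([(1, 5), (2, 3), (3, 4), (4, 1)]))
# -> [(4, 1), (3, 4), (1, 5)]
```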

    Minimum Coresets for Maxima Representation of Multidimensional Data

    Coresets are succinct summaries of large datasets such that, for a given problem, the solution obtained from a coreset is provably competitive with the solution obtained from the full dataset. As such, coreset-based data summarization techniques have been successfully applied to various problems, e.g., geometric optimization, clustering, and approximate query processing, for scaling them up to massive data. In this paper, we study coresets for the maxima representation of multidimensional data: Given a set P of points in $\mathbb{R}^d$, where d is a small constant, and an error parameter ε ∈ (0, 1), a subset Q ⊆ P is an ε-coreset for the maxima representation of P iff the maximum of Q is an ε-approximation of the maximum of P for any vector u ∈ $\mathbb{R}^d$, where the maximum is taken over the inner products between the set of points (P or Q) and u. We define a novel minimum ε-coreset problem that asks for an ε-coreset of the smallest size for the maxima representation of a point set. For the two-dimensional case, we develop an optimal polynomial-time algorithm for the minimum ε-coreset problem by transforming it into the shortest-cycle problem in a directed graph. Then, we prove that this problem is NP-hard in three or higher dimensions and present polynomial-time approximation algorithms in an arbitrary fixed dimension. Finally, we provide extensive experimental results on both real and synthetic datasets to demonstrate the superior performance of our proposed algorithms.
    Peer reviewed.
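    A hedged sketch of the ε-coreset condition as stated in the abstract: the check below samples random unit directions u and compares the directional maxima of Q against those of P. It assumes a relative error guarantee and nonnegative maxima (the abstract pins down neither), and random sampling only gives evidence of the property rather than a certificate.

```python
import math
import random

# Empirically test whether Q looks like an eps-coreset of P for maxima
# representation: for each sampled unit direction u, the directional maximum
# over Q should be within a (1 - eps) factor of the maximum over P.
# Relative error and nonnegative maxima are assumptions, not facts from the paper.
def looks_like_maxima_coreset(P, Q, eps, d, trials=10_000):
    for _ in range(trials):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]  # random direction
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
        max_p = max(sum(pi * ui for pi, ui in zip(p, u)) for p in P)
        max_q = max(sum(qi * ui for qi, ui in zip(q, u)) for q in Q)
        if max_q < (1 - eps) * max_p:
            return False  # witnessed a direction where Q fails
    return True
```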

    A simple and efficient preprocessing step for convex hull problem

    The present paper is concerned with a recursive algorithm that serves as a preprocessing step for finding the convex hull of $n$ random points uniformly distributed in the plane. For such a set of points, it is shown that all but $O(\log n)$ of the points can be eliminated while preserving the convex hull of the input set. Finally, it is shown that the running time of the algorithm is $O(n
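    The abstract does not spell out the paper's recursive elimination step; the classic Akl-Toussaint "throw-away" heuristic, sketched below, illustrates the general idea of such preprocessing: points strictly inside the quadrilateral spanned by the four axis-extreme points cannot appear on the convex hull and can be discarded up front.

```python
import math

# Akl-Toussaint throw-away preprocessing (a classic heuristic, not the
# paper's algorithm): drop every point strictly inside the quadrilateral
# of the four extreme points, since such points cannot be hull vertices.
def throw_away_filter(points):
    quad = [min(points), max(points),                 # extremes in x
            min(points, key=lambda p: p[1]),          # extremes in y
            max(points, key=lambda p: p[1])]
    # Order the quadrilateral's vertices counter-clockwise around the centroid.
    cx = sum(p[0] for p in quad) / 4.0
    cy = sum(p[1] for p in quad) / 4.0
    quad.sort(key=lambda p: math.atan2(p[1] - cy, p[0] - cx))

    def strictly_inside(p):
        # Strictly inside iff strictly left of every CCW edge (cross product > 0).
        return all((bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax) > 0
                   for (ax, ay), (bx, by) in zip(quad, quad[1:] + quad[:1]))

    return [p for p in points if not strictly_inside(p)]
```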

    Online Multivariate Changepoint Detection: Leveraging Links With Computational Geometry

    The increasing volume of data streams poses significant computational challenges for detecting changepoints online. Likelihood-based methods are effective, but their straightforward implementation becomes impractical online. We develop two online algorithms that exactly calculate the likelihood-ratio test for a single changepoint in $p$-dimensional data streams by leveraging fascinating connections with computational geometry. Our first algorithm is straightforward and empirically quasi-linear. The second is more complex but provably quasi-linear: $\mathcal{O}(n \log(n)^{p+1})$ for $n$ data points. Through simulations, we illustrate that they are fast and allow us to process millions of points within a matter of minutes for dimensions up to $p = 5$.
    Comment: 31 pages, 15 figures.
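    For context, a hedged baseline sketch (not the paper's geometry-based method): the exact likelihood-ratio statistic for a single change in the mean of a $p$-dimensional Gaussian stream, assuming unit variance in each coordinate (an assumption not stated in the abstract). Rescanning all candidate changepoints with prefix sums costs $O(n)$ per new point, hence $O(n^2)$ overall, which is the cost the paper's quasi-linear algorithms avoid.

```python
# 2 * log-likelihood ratio for a mean change at time tau (1 <= tau < n),
# assuming i.i.d. Gaussian noise with unit variance in each coordinate.
def lr_statistic(prefix, n, tau, p):
    scale = tau * (n - tau) / n
    left = [prefix[tau][j] / tau for j in range(p)]
    right = [(prefix[n][j] - prefix[tau][j]) / (n - tau) for j in range(p)]
    return scale * sum((l - r) ** 2 for l, r in zip(left, right))

# Naive online scan: after each new point, maximize the statistic over all
# candidate changepoints. O(n) work per point -- the baseline the paper beats.
def online_scan(stream, p):
    prefix = [[0.0] * p]   # prefix[i][j] = sum of coordinate j over first i points
    stats = []
    for x in stream:
        prefix.append([s + v for s, v in zip(prefix[-1], x)])
        n = len(prefix) - 1
        best = max((lr_statistic(prefix, n, t, p) for t in range(1, n)),
                   default=0.0)
        stats.append(best)  # compare to a threshold to flag a change
    return stats
```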