    On Undecided LP, Clustering and Active Learning

    We study colored coverage and clustering problems. Here, we are given a colored point set, where the points are covered by k (unknown) clusters, which are monochromatic (i.e., all the points covered by the same cluster have the same color). The access to the colors of the points (or even the points themselves) is provided indirectly via various oracle queries (such as nearest neighbor, or separation queries). We show that one can correctly deduce the color of all the points (i.e., compute a monochromatic clustering of the points) using a polylogarithmic number of queries, if the number of clusters is a constant. We investigate several variants of this problem, including Undecided Linear Programming and covering of points by k monochromatic balls.

    Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data

    Robust estimators, like the median of a point set, are important for data analysis in the presence of outliers. We study robust estimators for locationally uncertain points with discrete distributions. That is, each point in a data set has a discrete probability distribution describing its location. The probabilistic nature of uncertain data makes it challenging to compute such estimators, since the true value of the estimator is now described by a distribution rather than a single point. We show how to construct and estimate the distribution of the median of a point set. Building the approximate support of the distribution takes near-linear time, and assigning probability to that support takes quadratic time. We also develop a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension. This includes the geometric median for high dimensions and the Siegel estimator for linear regression. Comment: Full version of a paper to appear at SoCG 201
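
    To make "the distribution of the median" concrete, here is a minimal brute-force sketch for one-dimensional uncertain points. It is not the near-linear construction described in the abstract: it enumerates every joint instantiation, so it is exponential in the number of points. The function name `median_distribution`, the input layout, and the assumption that the points are independent are illustrative choices, not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from statistics import median_low


def median_distribution(uncertain_points):
    """Return {value: probability} for the (lower) median.

    uncertain_points is a list of uncertain points; each one is a list of
    (location, probability) pairs whose probabilities sum to 1.
    Assumption: the uncertain points are mutually independent.
    """
    dist = defaultdict(Fraction)
    # Enumerate every joint instantiation: one location per uncertain point.
    for choice in product(*uncertain_points):
        locations = [loc for loc, _ in choice]
        prob = Fraction(1)
        for _, p in choice:
            prob *= p
        # Accumulate the probability mass on the median of this instantiation.
        dist[median_low(locations)] += prob
    return dict(dist)


half = Fraction(1, 2)
example_points = [
    [(0, half), (10, half)],
    [(1, half), (2, half)],
    [(3, half), (4, half)],
]
example_dist = median_distribution(example_points)
print(example_dist)
```

    In this toy instance the median is no longer a single number: its mass is spread over the locations 1, 2, 3 and 4.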

    Active Learning a Convex Body in Low Dimensions

    Consider a set $P \subseteq \Re^d$ of $n$ points, and a convex body $C$ provided via a separation oracle. The task at hand is to decide for each point of $P$ if it is in $C$ using as few oracle queries as possible. We show that one can solve this problem in two and three dimensions using $O(h(P) \log n)$ queries, where $h(P)$ is the size of the largest subset of points of $P$ in convex position. Furthermore, we show that in two dimensions one can solve this problem using $O(v(P, C) \log^2 n)$ oracle queries, where $v(P, C)$ is a lower bound on the minimum number of queries that any algorithm for this specific instance requires. Comment: Talk based on results in the paper is available here: https://youtu.be/5Epyh2lHrF
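
    The query model can be illustrated in one dimension, where the convex body is an interval and a separation oracle either reports "inside" or says on which side of the query the body lies. The sketch below is only a 1D analogue, not the two- or three-dimensional algorithm of the abstract. The names `classify_1d` and `make_interval_oracle` and the oracle's return convention (0, -1, +1) are assumptions made for illustration, and the points are assumed to be distinct.

```python
def make_interval_oracle(a, b):
    """Hypothetical separation oracle for the interval C = [a, b].

    Returns (oracle, calls), where calls[0] counts the queries made.
    """
    calls = [0]

    def oracle(q):
        calls[0] += 1
        if q < a:
            return 1    # C lies entirely to the right of q
        if q > b:
            return -1   # C lies entirely to the left of q
        return 0        # q is inside C

    return oracle, calls


def classify_1d(points, oracle):
    """Label each (distinct) point True/False using O(log n) oracle calls."""
    pts = sorted(points)
    inside = {p: False for p in pts}
    lo, hi = 0, len(pts) - 1
    hit = None
    # Phase 1: binary search for any input point that lies inside C.
    while lo <= hi:
        mid = (lo + hi) // 2
        side = oracle(pts[mid])
        if side == 0:
            hit = mid
            break
        if side > 0:
            lo = mid + 1   # everything up to pts[mid] is outside
        else:
            hi = mid - 1   # everything from pts[mid] on is outside
    if hit is None:
        return inside      # no input point lies in C
    # Phase 2: find the leftmost inside index within [lo, hit].
    left_lo, left_hi = lo, hit
    while left_lo < left_hi:
        mid = (left_lo + left_hi) // 2
        if oracle(pts[mid]) == 0:
            left_hi = mid
        else:
            left_lo = mid + 1
    # Phase 3: find the rightmost inside index within [hit, hi].
    right_lo, right_hi = hit, hi
    while right_lo < right_hi:
        mid = (right_lo + right_hi + 1) // 2
        if oracle(pts[mid]) == 0:
            right_lo = mid
        else:
            right_hi = mid - 1
    # By convexity, exactly the points between the two ends are inside.
    for i in range(left_lo, right_lo + 1):
        inside[pts[i]] = True
    return inside


example_oracle, example_calls = make_interval_oracle(2.5, 7)
example_labels = classify_1d(range(10), example_oracle)
print(sorted(p for p, flag in example_labels.items() if flag), example_calls[0])
```

    Convexity is what makes the binary searches valid: once one input point is known to be inside, the inside points form a contiguous run in sorted order, so only its two ends need to be located.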

    No-Dimensional Tverberg Theorems and Algorithms

    Tverberg’s theorem states that for any k ≥ 2 and any set P ⊂ R^d of at least (d+1)(k−1)+1 points in d dimensions, we can partition P into k subsets whose convex hulls have a non-empty intersection. The associated search problem of finding the partition lies in the complexity class CLS = PPAD ∩ PLS, but no hardness results are known. In the colorful Tverberg theorem, the points in P have colors, and under certain conditions, P can be partitioned into colorful sets, in which each color appears exactly once and whose convex hulls intersect. To date, the complexity of the associated search problem is unresolved. Recently, Adiprasito, Bárány, and Mustafa (SODA 2019) gave a no-dimensional Tverberg theorem, in which the convex hulls may intersect in an approximate fashion. This relaxes the requirement on the cardinality of P. The argument is constructive, but does not result in a polynomial-time algorithm. We present a deterministic algorithm that finds for any n-point set P ⊂ R^d and any k ∈ {2, …, n} in O(nd⌈log k⌉) time a k-partition of P such that there is a ball of radius O((k/√n) diam(P)) that intersects the convex hull of each set. Given that this problem is not known to be solvable exactly in polynomial time, our result provides a remarkably efficient and simple new notion of approximation. Our main contribution is to generalize Sarkaria’s method (Israel Journal Math., 1992) to reduce the Tverberg problem to the colorful Carathéodory problem (in the simplified tensor product interpretation of Bárány and Onn) and to apply it algorithmically. It turns out that this not only leads to an alternative algorithmic proof of a no-dimensional Tverberg theorem, but it also generalizes to other settings such as the colorful variant of the problem.
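
    For intuition about the statement of Tverberg's theorem, the one-dimensional case can be solved exactly by sorting: with d = 1 the bound (d+1)(k−1)+1 becomes 2k−1, and pairing the i-th smallest point with the i-th largest yields intervals that all contain the middle points. The sketch below covers only this textbook special case, not the no-dimensional algorithm of the abstract; the function names are illustrative.

```python
def tverberg_partition_1d(points, k):
    """Split real numbers into k groups whose convex hulls (intervals) intersect."""
    if k < 2 or len(points) < 2 * k - 1:
        raise ValueError("need k >= 2 and at least 2k - 1 points")
    pts = sorted(points)
    n = len(pts)
    # Pair the i-th smallest point with the i-th largest, for k - 1 groups.
    groups = [[pts[i], pts[n - 1 - i]] for i in range(k - 1)]
    # All remaining middle points form the last group.
    groups.append(pts[k - 1 : n - k + 1])
    return groups


def common_interval(groups):
    """Return the intersection of the groups' intervals, or None if empty."""
    lo = max(min(g) for g in groups)
    hi = min(max(g) for g in groups)
    return (lo, hi) if lo <= hi else None


example_groups = tverberg_partition_1d([5, 1, 9, 3, 7], k=3)
print(example_groups, common_interval(example_groups))
```

    Each pair spans the middle group, so the hull of the middle group is the common intersection.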

    Fast Algorithms for Geometric Consensuses

    Let P be a set of n points in ℝ^d in general position. A median hyperplane (roughly) splits the point set P in half. The yolk of P is the ball of smallest radius intersecting all median hyperplanes of P. The egg of P is the ball of smallest radius intersecting all hyperplanes which contain exactly d points of P. We present exact algorithms for computing the yolk and the egg of a point set, both running in expected time O(n^(d-1) log n). These running times improve on those of existing algorithms by a polynomial factor. We also present algorithms for several related problems, such as computing the Tukey and center balls of a point set, among others.

    Fast Fencing

    We consider very natural "fence enclosure" problems studied by Capoyleas, Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a set $S$ of $n$ points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two main variants. In the first variant, we pay a unit cost per curve in addition to the total length of the curves. An equivalent formulation of this version is that we have to enclose $n$ unit disks, paying only the total length of the enclosing curves. In the other variant, we are allowed to use at most $k$ closed curves and pay no cost per curve. For the variant with at most $k$ closed curves, we present an algorithm that is polynomial in both $n$ and $k$. For the variant with unit cost per curve, or unit disks, we present a near-linear time algorithm. Capoyleas, Rote, and Woeginger solved the problem with at most $k$ curves in $n^{O(k)}$ time. Arkin, Khuller, and Mitchell used this to solve the unit cost per curve version in exponential time. At the time, they conjectured that the problem with $k$ curves is NP-hard for general $k$. Our polynomial time algorithm refutes this unless P equals NP.
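
    The unit-cost variant can be made concrete with a brute-force sketch for tiny inputs. It rests on the assumption that an optimal solution fences each group of a partition by the boundary of its convex hull, and it tries every partition, so it is exponential and unrelated to the near-linear algorithm of the abstract. All function names and the `cost_per_curve` parameter are illustrative.

```python
from math import dist


def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_perimeter(points):
    """Length of the shortest closed curve enclosing the points."""
    pts = sorted(set(points))
    if len(pts) == 1:
        return 0.0

    def half_hull(seq):
        # One chain of Andrew's monotone chain algorithm.
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half_hull(pts) + half_hull(reversed(pts))
    return sum(dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull)))


def partitions(items):
    """Yield every partition of a list into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        yield [[first]] + part                      # first in its own group
        for i in range(len(part)):                  # or joined to a group
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def best_fencing(points, cost_per_curve=1.0):
    """Return (cost, groups) minimising total hull length plus per-curve cost."""
    best = None
    for part in partitions(list(points)):
        cost = sum(hull_perimeter(g) for g in part) + cost_per_curve * len(part)
        if best is None or cost < best[0]:
            best = (cost, part)
    return best


example_cost, example_fences = best_fencing([(0, 0), (0.1, 0), (10, 0), (10, 0.1)])
print(example_cost, example_fences)
```

    In this instance two far-apart pairs of nearby points are cheapest to enclose with two short curves: a single curve would be too long, and four separate curves would pay the unit cost four times.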
