322 research outputs found
On Undecided LP, Clustering and Active Learning
We study colored coverage and clustering problems. Here, we are given a colored point set, where the points are covered by k (unknown) clusters, which are monochromatic (i.e., all the points covered by the same cluster have the same color). The access to the colors of the points (or even the points themselves) is provided indirectly via various oracle queries (such as nearest neighbor, or separation queries). We show that one can correctly deduce the color of all the points (i.e., compute a monochromatic clustering of the points) using a polylogarithmic number of queries, if the number of clusters is a constant.
We investigate several variants of this problem, including Undecided Linear Programming and covering of points by k monochromatic balls.
Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data
Robust estimators, like the median of a point set, are important for data
analysis in the presence of outliers. We study robust estimators for
locationally uncertain points with discrete distributions. That is, each point
in a data set has a discrete probability distribution describing its location.
The probabilistic nature of uncertain data makes it challenging to compute such
estimators, since the true value of the estimator is now described by a
distribution rather than a single point. We show how to construct and estimate
the distribution of the median of a point set. Building the approximate support
of the distribution takes near-linear time, and assigning probability to that
support takes quadratic time. We also develop a general approximation technique
for distributions of robust estimators with respect to ranges with bounded VC
dimension. This includes the geometric median for high dimensions and the
Siegel estimator for linear regression. Comment: Full version of a paper to appear at SoCG 201
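To make the notion of "the distribution of the median" concrete, here is a minimal brute-force sketch for one-dimensional points. It enumerates every realization of the uncertain points, so it is exponential in the number of points and is not the near-linear construction the abstract describes. The function name `median_distribution` and the input layout (one list of `(location, probability)` pairs per uncertain point) are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from statistics import median_low


def median_distribution(uncertain_points):
    """Exact distribution of the median of 1D locationally uncertain points.

    uncertain_points: list with one entry per point; each entry is a list of
    (location, probability) pairs whose probabilities sum to 1.
    Brute force over all realizations (illustration only).
    """
    dist = defaultdict(Fraction)
    for realization in product(*uncertain_points):
        # Probability of this joint realization (points are independent).
        prob = Fraction(1)
        for _, p in realization:
            prob *= p
        locations = [loc for loc, _ in realization]
        # median_low always returns one of the input locations.
        dist[median_low(locations)] += prob
    return dict(dist)


half = Fraction(1, 2)
points = [
    [(0, half), (10, half)],   # first point is at 0 or at 10
    [(1, half), (2, half)],    # second point is at 1 or at 2
    [(3, Fraction(1))],        # third point is certain
]
example = median_distribution(points)
```

In this toy instance the median is no longer a single value: it takes the values 1, 2, and 3 with different probabilities, which is the object the abstract proposes to approximate efficiently.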
Active Learning a Convex Body in Low Dimensions
Consider a set P ⊆ ℝ^d of n points, and a convex body C
provided via a separation oracle. The task at hand is to decide for each point
of P whether it is in C using the fewest oracle queries. We show that
one can solve this problem in two and three dimensions using O(⬡_P log n)
queries, where ⬡_P is the size of the largest subset of points of P in convex
position. Furthermore, we show that in two dimensions one can solve this
problem using O(ϑ(P,C) log^2 n) oracle queries, where ϑ(P,C) is a lower
bound on the minimum number of queries that any algorithm for this specific
instance requires. Comment: Talk based on results in the paper is available here:
https://youtu.be/5Epyh2lHrF
No-Dimensional Tverberg Theorems and Algorithms
Tverberg's theorem states that for any k ≥ 2 and any set P ⊂ ℝ^d of at least (d+1)(k-1)+1 points in d dimensions, we can partition P into k subsets whose convex hulls have a non-empty intersection. The associated search problem of finding the partition lies in the complexity class CLS = PPAD ∩ PLS, but no hardness results are known. In the colorful Tverberg theorem, the points in P have colors, and under certain conditions, P can be partitioned into colorful sets, in which each color appears exactly once and whose convex hulls intersect. To date, the complexity of the associated search problem is unresolved. Recently, Adiprasito, Bárány, and Mustafa (SODA 2019) gave a no-dimensional Tverberg theorem, in which the convex hulls may intersect in an approximate fashion. This relaxes the requirement on the cardinality of P. The argument is constructive, but does not result in a polynomial-time algorithm. We present a deterministic algorithm that finds for any n-point set P ⊂ ℝ^d and any k ∈ {2, …, n} in O(nd⌈log k⌉) time a k-partition of P such that there is a ball of radius O((k/√n) diam(P)) that intersects the convex hull of each set. Given that this problem is not known to be solvable exactly in polynomial time, our result provides a remarkably efficient and simple new notion of approximation. Our main contribution is to generalize Sarkaria's method (Israel Journal Math., 1992) to reduce the Tverberg problem to the colorful Carathéodory problem (in the simplified tensor product interpretation of Bárány and Onn) and to apply it algorithmically. It turns out that this not only leads to an alternative algorithmic proof of a no-dimensional Tverberg theorem, but it also generalizes to other settings such as the colorful variant of the problem.
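As a small illustration of the exact Tverberg statement (not of the no-dimensional algorithm in the abstract), the one-dimensional case is easy to work out by hand: with d = 1 the bound (d+1)(k-1)+1 becomes 2k-1 points on a line, and pairing the i-th smallest point with the i-th largest yields k sets whose convex hulls (intervals) all contain the middle point. The sketch below assumes this pairing construction; the function name `tverberg_partition_1d` is hypothetical.

```python
def tverberg_partition_1d(points, k):
    """Partition points on a line into k sets whose intervals share a point.

    Requires at least (d+1)(k-1)+1 = 2k-1 points (Tverberg's bound for d = 1).
    Returns (parts, common), where `common` lies in the hull of every part.
    """
    needed = (1 + 1) * (k - 1) + 1
    if k < 2 or len(points) < needed:
        raise ValueError(f"need k >= 2 and at least {needed} points")
    xs = sorted(points)
    n = len(xs)
    # Pair the i-th smallest point with the i-th largest point.
    parts = [[xs[i], xs[n - 1 - i]] for i in range(k - 1)]
    # All remaining middle points form the last set (non-empty since n >= 2k-1).
    middle = xs[k - 1 : n - k + 1]
    parts.append(middle)
    # Every middle point lies inside each paired interval.
    common = middle[0]
    return parts, common


parts, common = tverberg_partition_1d([5, 1, 4, 2, 3], k=3)
```

For the five points above and k = 3, the pairs are {1, 5} and {2, 4}, the middle set is {3}, and the point 3 lies in all three intervals. In higher dimensions no such simple sorting argument is available, which is where the complexity questions discussed in the abstract arise.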
Fast Algorithms for Geometric Consensuses
Let P be a set of n points in ℝ^d in general position. A median hyperplane (roughly) splits the point set P in half. The yolk of P is the ball of smallest radius intersecting all median hyperplanes of P. The egg of P is the ball of smallest radius intersecting all hyperplanes which contain exactly d points of P.
We present exact algorithms for computing the yolk and the egg of a point set, both running in expected time O(n^(d-1) log n). The new algorithms improve on the running time of existing algorithms by a polynomial factor. We also present algorithms for several related problems, such as computing the Tukey and center balls of a point set, among others.
Fast Fencing
We consider very natural "fence enclosure" problems studied by Capoyleas,
Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a
set S of n points in the plane, we aim at finding a set of closed curves
such that (1) each point is enclosed by a curve and (2) the total length of the
curves is minimized. We consider two main variants. In the first variant, we
pay a unit cost per curve in addition to the total length of the curves. An
equivalent formulation of this version is that we have to enclose n unit
disks, paying only the total length of the enclosing curves. In the other
variant, we are allowed to use at most k closed curves and pay no cost per
curve.
For the variant with at most k closed curves, we present an algorithm that
is polynomial in both n and k. For the variant with unit cost per curve, or
unit disks, we present a near-linear time algorithm.
Capoyleas, Rote, and Woeginger solved the problem with at most k curves in
n^O(k) time. Arkin, Khuller, and Mitchell used this to solve the unit cost
per curve version in exponential time. At the time, they conjectured that the
problem with k curves is NP-hard for general k. Our polynomial time
algorithm refutes this unless P equals NP.