Hardness of Approximate Nearest Neighbor Search
We prove conditional near-quadratic running time lower bounds for approximate
Bichromatic Closest Pair with Euclidean, Manhattan, Hamming, or edit distance.
Specifically, unless the Strong Exponential Time Hypothesis (SETH) is false,
for every delta > 0 there exists a constant epsilon > 0 such that computing a
(1+epsilon)-approximation to the Bichromatic Closest Pair requires n^{2-delta}
time. In particular, this implies a near-linear lower bound on the query time for
Approximate Nearest Neighbor search with polynomial preprocessing time.
Our reduction uses the Distributed PCP framework of [ARW'17], but obtains
improved efficiency using Algebraic Geometry (AG) codes. Efficient PCPs from AG
codes have been constructed in other settings before [BKKMS'16, BCGRS'17], but
our construction is the first to yield new hardness results.
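To make the problem concrete, here is a minimal sketch of the exact quadratic-time baseline for Bichromatic Closest Pair, which the lower bound above shows cannot be substantially beaten, even approximately, under SETH. All function names are illustrative, not from the paper.

```python
# Brute-force Bichromatic Closest Pair: Theta(|A| * |B|) distance
# evaluations. The hardness result above says that, under SETH, no
# truly-subquadratic algorithm achieves even a (1+epsilon)-approximation.
from itertools import product

def bichromatic_closest_pair(A, B, dist):
    """Return (a, b, d) minimizing dist(a, b) over a in A, b in B."""
    best = None
    for a, b in product(A, B):
        d = dist(a, b)
        if best is None or d < best[2]:
            best = (a, b, d)
    return best

def hamming(u, v):
    """Hamming distance between two equal-length vectors."""
    return sum(x != y for x, y in zip(u, v))
```

For example, with `A = [(0,0,1), (1,1,1)]` and `B = [(0,0,0), (1,0,1)]` under Hamming distance, the closest bichromatic pair is at distance 1.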
Distributed PCP Theorems for Hardness of Approximation in P
We present a new distributed model of probabilistically checkable proofs
(PCP). A satisfying assignment x in {0,1}^n to a CNF formula phi is
shared between two parties, where Alice knows x_1, ..., x_{n/2}, Bob knows
x_{n/2+1}, ..., x_n, and both parties know phi. The goal is to have
Alice and Bob jointly write a PCP that x satisfies phi, while
exchanging little or no information. Unfortunately, this model as-is does not
allow for nontrivial query complexity. Instead, we focus on a non-deterministic
variant, where the players are helped by Merlin, a third party who knows all of
x.
Using our framework, we obtain, for the first time, PCP-like reductions from
the Strong Exponential Time Hypothesis (SETH) to approximation problems in P.
In particular, under SETH we show that there are no truly-subquadratic
approximation algorithms for Bichromatic Maximum Inner Product over
{0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate
Regular Expression Matching, and Diameter in Product Metric. All our
inapproximability factors are nearly-tight. In particular, for the first two
problems we obtain nearly-polynomial factors of 2^{(log n)^{1-o(1)}}; only
(1+o(1))-factor lower bounds (under SETH) were known before.
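As a concrete reference point, here is a minimal sketch of exact Bichromatic Maximum Inner Product over {0,1}-vectors, the first of the problems listed above; the result says that, under SETH, even large approximations to this quantity require near-quadratic time. The function name is illustrative.

```python
# Brute-force Bichromatic Maximum Inner Product over 0/1 vectors:
# Theta(n^2 * d) time for n vectors of dimension d. The hardness result
# above rules out truly-subquadratic algorithms even with
# 2^{(log n)^{1-o(1)}} approximation factors, under SETH.
def max_inner_product(A, B):
    """Return the maximum of <a, b> over a in A, b in B (0/1 vectors)."""
    return max(sum(x & y for x, y in zip(a, b)) for a in A for b in B)
```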
Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms
We present a technical survey on the state-of-the-art approaches to data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching, and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower-bounding techniques.
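As a toy illustration of the random-sampling idea the survey covers, a uniform sample reweighted by n/m gives an unbiased estimate of the clustering cost of any fixed center. This is only a sketch: proper coresets require more careful (e.g. sensitivity-based) sampling to hold uniformly over all centers, and all names here are illustrative.

```python
# Uniform-sampling data reduction: sample m of n points, weight each by
# n/m. For any FIXED center the weighted sample cost is an unbiased
# estimate of the full cost; true coresets need importance sampling.
import random

def cost(points, center, weights=None):
    """Weighted sum of squared distances from points to a single center."""
    if weights is None:
        weights = [1.0] * len(points)
    return sum(w * sum((p - c) ** 2 for p, c in zip(pt, center))
               for w, pt in zip(weights, points))

def uniform_coreset(points, m, seed=0):
    """Sample m points uniformly with replacement; weight each by n/m."""
    rng = random.Random(seed)
    sample = [rng.choice(points) for _ in range(m)]
    return sample, [len(points) / m] * m
```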
Connectivity and equilibrium in random games
We study how the structure of the interaction graph of a game affects the
existence of pure Nash equilibria. In particular, for a fixed interaction
graph, we are interested in whether there are pure Nash equilibria arising when
random utility tables are assigned to the players. We provide conditions for
the structure of the graph under which equilibria are likely to exist and
complementary conditions which make the existence of equilibria highly
unlikely. Our results have immediate implications for many deterministic graphs
and generalize known results for random games on the complete graph. In
particular, our results imply that the probability that bounded degree graphs
have pure Nash equilibria is exponentially small in the size of the graph and
yield a simple algorithm that finds small nonexistence certificates for a large
family of graphs. Then we show that in any strongly connected graph of n
vertices with expansion (1+Omega(1)) log_2(n) the distribution of the number
of equilibria approaches the Poisson distribution with parameter 1,
asymptotically as n tends to infinity.
Comment: Published at http://dx.doi.org/10.1214/10-AAP715 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
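The setting above can be sketched in a few lines: a graphical game in which each player's random payoff depends only on its own action and its neighbors' actions, with pure Nash equilibria counted by brute force. This is purely illustrative (exponential in the number of players) and all names are assumptions, not from the paper.

```python
# Random graphical game with binary actions: each player v has a random
# utility table indexed by (own action, neighbors' actions). We count
# pure Nash equilibria by enumerating all 2^n action profiles.
import itertools
import random

def random_game(adj, seed=0):
    """adj: dict node -> list of neighbors. Returns random utility tables."""
    rng = random.Random(seed)
    return {v: {key: rng.random()
                for key in itertools.product((0, 1), repeat=1 + len(nbrs))}
            for v, nbrs in adj.items()}

def pure_nash_count(adj, tables):
    """Number of pure Nash equilibria, by exhaustive enumeration."""
    players = sorted(adj)
    count = 0
    for profile in itertools.product((0, 1), repeat=len(players)):
        a = dict(zip(players, profile))
        def payoff(v, act):
            return tables[v][(act,) + tuple(a[u] for u in adj[v])]
        # profile is an equilibrium iff no player gains by flipping its action
        if all(payoff(v, a[v]) >= payoff(v, 1 - a[v]) for v in players):
            count += 1
    return count
```

On the triangle (the 3-player complete graph) there are 2^3 = 8 profiles, so the equilibrium count always lies between 0 and 8.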
On Generalization Bounds for Projective Clustering
Given a set of points, clustering consists of finding a partition of a point
set into k clusters such that the center to which a point is assigned is as
close as possible. Most commonly, centers are points themselves, which leads to
the famous k-median and k-means objectives. One may also choose centers to
be j-dimensional subspaces, which gives rise to subspace clustering. In this
paper, we consider learning bounds for these problems. That is, given a set P of
n samples drawn independently from some unknown, but fixed distribution
D, how quickly does a solution computed on P converge to the
optimal clustering of D? We give several near-optimal results. In
particular,
For center-based objectives, we show a convergence rate of
O~(sqrt(k/n)). This matches the known optimal bounds
of [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical Society 2016]
and [Bartlett, Linder, and Lugosi, IEEE Trans. Inf. Theory 1998] for k-means
and extends them to other important objectives such as k-median.
For subspace clustering with j-dimensional subspaces, we show a convergence
rate of O~(sqrt(k j^2 / n)). These are the first
provable bounds for most of these problems. For the specific case of projective
clustering, which generalizes k-means, we show a convergence rate of
Omega(sqrt(k j / n)) is necessary, thereby proving that the
bounds from [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical
Society 2016] are essentially optimal.
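The learning question above can be illustrated numerically: for fixed centers, the empirical clustering cost over n i.i.d. samples converges to the cost under the true distribution, and the bounds above control this deviation (uniformly over centers) at roughly sqrt(k/n). The distribution and centers below are illustrative assumptions, not from the paper.

```python
# Empirical vs. true 2-means cost. For the uniform distribution on [0,1]
# and centers {1/4, 3/4}, the true expected cost is 1/48 (two symmetric
# halves, each contributing the variance of Uniform[0, 1/2] about 1/4).
# The observed gap shrinks at roughly the 1/sqrt(n) rate discussed above.
import random

def kmeans_cost(points, centers):
    """Average squared distance from each point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points) / len(points)

def sample_gap(n, seed=0):
    """|empirical cost - true cost| for n uniform samples on [0, 1]."""
    rng = random.Random(seed)
    pts = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return abs(kmeans_cost(pts, (0.25, 0.75)) - 1.0 / 48)
```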
Light Spanners for High Dimensional Norms via Stochastic Decompositions
Spanners for low dimensional spaces (e.g. Euclidean space of constant dimension, or doubling metrics) are well understood. This lies in contrast to the situation in high dimensional spaces, where except for the work of Har-Peled, Indyk and Sidiropoulos (SODA 2013), who showed that any n-point Euclidean metric has an O(t)-spanner with O~(n^{1+1/t^2}) edges, little is known.
In this paper we study several aspects of spanners in high dimensional normed spaces. First, we build spanners for finite subsets of l_p with 1<p <=2. Second, our construction yields a spanner which is both sparse and also light, i.e., its total weight is not much larger than that of the minimum spanning tree. In particular, we show that any n-point subset of l_p for 1<p <=2 has an O(t)-spanner with n^{1+O~(1/t^p)} edges and lightness n^{O~(1/t^p)}.
In fact, our results are more general, and they apply to any metric space admitting a certain low diameter stochastic decomposition. It is known that arbitrary metric spaces have an O(t)-spanner with lightness O(n^{1/t}). We exhibit the following tradeoff: metrics with decomposability parameter nu=nu(t) admit an O(t)-spanner with lightness O~(nu^{1/t}). For example, n-point Euclidean metrics have nu <=n^{1/t}, metrics with doubling constant lambda have nu <=lambda, and graphs of genus g have nu <=g. While these families do admit a (1+epsilon)-spanner, its lightness depends exponentially on the dimension (resp. log g). Our construction alleviates this exponential dependency, at the cost of incurring larger stretch.
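For context, here is a minimal sketch of the classical greedy t-spanner construction (the standard baseline against which sparse-and-light constructions like those above are measured): scan point pairs by increasing distance and keep an edge only if the current spanner does not already provide a path within stretch t. This is a generic textbook procedure, not the paper's construction.

```python
# Greedy t-spanner for a finite metric: an edge (u, v) is added only if
# the spanner built so far has no u-v path of length <= t * d(u, v).
# The output is a t-spanner by construction.
import heapq
from itertools import combinations

def spanner_dist(adj, s, goal):
    """Dijkstra shortest-path distance in the partial spanner."""
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

def greedy_spanner(points, t, metric):
    """Return the edge list (u, v, w) of a t-spanner for the point set."""
    pairs = sorted(combinations(range(len(points)), 2),
                   key=lambda e: metric(points[e[0]], points[e[1]]))
    adj, edges = {}, []
    for u, v in pairs:
        w = metric(points[u], points[v])
        if spanner_dist(adj, u, v) > t * w:
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
            edges.append((u, v, w))
    return edges
```

On four collinear points {0, 1, 2, 3} with stretch t = 2, only the three unit-length edges survive: every longer pair already has a path within twice its distance.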