
    Probabilistic Polynomials and Hamming Nearest Neighbors

    We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let $c(n) : \mathbb{N} \rightarrow \mathbb{N}$. Suppose we are given a database $D$ of $n$ vectors in $\{0,1\}^{c(n) \log n}$ and a collection of $n$ query vectors $Q$ in the same dimension. For all $u \in Q$, we wish to compute a $v \in D$ with minimum Hamming distance from $u$. We solve this problem in $n^{2-1/O(c(n) \log^2 c(n))}$ randomized time. Hence, the problem is in "truly subquadratic" time for $O(\log n)$ dimensions, and in subquadratic time for $d = o((\log^2 n)/(\log \log n)^2)$. We apply the algorithm to computing pairs with maximum inner product, closest pair in $\ell_1$ for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients. Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
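To make the batch problem in this abstract concrete, here is a minimal sketch of the straightforward quadratic-time baseline: for each query, scan the whole database and keep a vector at minimum Hamming distance. This is not the paper's subquadratic algorithm, which relies on probabilistic polynomials. The function names `hamming` and `batch_hamming_nn` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Number of coordinates at which two equal-length 0/1 vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def batch_hamming_nn(
    database: Sequence[Sequence[int]],
    queries: Sequence[Sequence[int]],
) -> List[Tuple[int, int]]:
    """For each query, return (index of a nearest database vector, distance).

    Brute force: |queries| * |database| * dimension work in total.
    Ties are broken in favour of the lowest database index.
    """
    results = []
    for u in queries:
        best = min(range(len(database)), key=lambda i: hamming(u, database[i]))
        results.append((best, hamming(u, database[best])))
    return results


database = [[0, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
queries = [[1, 0, 0, 0], [1, 1, 1, 0]]
nearest = batch_hamming_nn(database, queries)
```

With $n$ database vectors and $n$ queries in dimension $d$, this baseline takes on the order of $n^2 d$ steps, which is the quadratic cost the abstract's result improves on.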

    Closest pair optimization on modern hardware

    Master's Project (M.S.), University of Alaska Fairbanks, 2019. In this project we examine the performance of several algorithms for finding the closest pair of points in a given set of points in the plane. We look at four algorithms (brute force, recursive, non-recursive, and a randomized expected-linear-time algorithm) on inputs ranging from one hundred to one billion points. We find that, on average, the non-recursive algorithm is the fastest, except in limited cases: brute force at 100 points, and the randomized expected-linear-time algorithm in 32-bit spaces.

    Dominance Product and High-Dimensional Closest Pair under $L_\infty$

    Given a set $S$ of $n$ points in $\mathbb{R}^d$, the Closest Pair problem is to find a pair of distinct points in $S$ at minimum distance. When $d$ is constant, there are efficient algorithms that solve this problem, and fast approximate solutions for general $d$. However, obtaining an exact solution in very high dimensions seems to be much less understood. We consider the high-dimensional $L_\infty$ Closest Pair problem, where $d = n^r$ for some $r > 0$, and the underlying metric is $L_\infty$. We improve and simplify previous results for $L_\infty$ Closest Pair, showing that it can be solved by a deterministic strongly-polynomial algorithm that runs in $O(DP(n,d)\log n)$ time, and by a randomized algorithm that runs in $O(DP(n,d))$ expected time, where $DP(n,d)$ is the time bound for computing the dominance product for $n$ points in $\mathbb{R}^d$. That is a matrix $D$, such that $D[i,j] = \bigl| \{k \mid p_i[k] \leq p_j[k]\} \bigr|$; this is the number of coordinates at which $p_j$ dominates $p_i$. For integer coordinates from some interval $[-M, M]$, we obtain an algorithm that runs in $\tilde{O}\left(\min\{Mn^{\omega(1,r,1)},\, DP(n,d)\}\right)$ time, where $\omega(1,r,1)$ is the exponent of multiplying an $n \times n^r$ matrix by an $n^r \times n$ matrix. We also give slightly better bounds for $DP(n,d)$, by using more recent rectangular matrix multiplication bounds. Computing the dominance product itself is an important task, since it is applied in many algorithms as a major black-box ingredient, such as algorithms for APBP (all pairs bottleneck paths), and variants of APSP (all pairs shortest paths).
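The definition of the dominance product in this abstract can be checked directly with a naive sketch that counts, for every ordered pair $(i, j)$, the coordinates $k$ with $p_i[k] \leq p_j[k]$. This takes on the order of $n^2 d$ steps and is only meant to illustrate the definition; it is not the matrix-multiplication-based method behind the $DP(n,d)$ bounds. The name `dominance_product` is a hypothetical helper.

```python
from typing import List, Sequence


def dominance_product(points: Sequence[Sequence[float]]) -> List[List[int]]:
    """Return D with D[i][j] = |{k : points[i][k] <= points[j][k]}|.

    Naive evaluation of the definition: one pass over all coordinates
    for every ordered pair of points.
    """
    n = len(points)
    return [
        [sum(1 for a, b in zip(points[i], points[j]) if a <= b) for j in range(n)]
        for i in range(n)
    ]


points = [(1, 1), (2, 3), (0, 9)]
D = dominance_product(points)
```

Every diagonal entry equals the dimension $d$, since each point dominates itself in every coordinate.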

    Distributed PCP Theorems for Hardness of Approximation in P

    We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1}, \dots, x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before.

    Geographic Gossip: Efficient Averaging for Sensor Networks

    Gossip algorithms for distributed computation are attractive due to their simplicity, distributed nature, and robustness in noisy and uncertain environments. However, using standard gossip algorithms can lead to a significant waste in energy by repeatedly recirculating redundant information. For realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is related to the slow mixing times of random walks on the communication graph. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing geographic routing combined with a simple resampling method, we demonstrate substantial gains over previously proposed gossip protocols. For regular graphs such as the ring or grid, our algorithm improves standard gossip by factors of $n$ and $\sqrt{n}$ respectively. For the more challenging case of random geometric graphs, our algorithm computes the true average to accuracy $\epsilon$ using $O\left(\frac{n^{1.5}}{\sqrt{\log n}} \log \epsilon^{-1}\right)$ radio transmissions, which yields a $\sqrt{\frac{n}{\log n}}$ factor improvement over standard gossip algorithms. We illustrate these theoretical results with experimental comparisons between our algorithm and standard methods as applied to various classes of random fields. Comment: To appear, IEEE Transactions on Signal Processing.
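For readers unfamiliar with the baseline this abstract compares against, the following is a minimal simulation of standard nearest-neighbor pairwise gossip on a ring: at each step a random node averages its value with a random ring neighbor. It does not implement the geographic routing or resampling that the paper proposes. The function name `ring_gossip`, the fixed seed, and the round count are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def ring_gossip(values: Sequence[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate standard pairwise gossip averaging on a ring of len(values) nodes.

    Each round, a uniformly random node i and one of its two ring
    neighbors j replace their values with the pair's average. The sum
    of all values is preserved, so every node drifts toward the mean.
    """
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    for _ in range(rounds):
        i = rng.randrange(n)
        j = (i + rng.choice((-1, 1))) % n
        avg = (x[i] + x[j]) / 2.0
        x[i] = x[j] = avg
    return x


initial = [float(k) for k in range(8)]
final = ring_gossip(initial, rounds=2000)
```

Because information only moves one hop per exchange, many rounds are needed on a ring, which is the slow-mixing inefficiency the abstract describes.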