
    An Empirical Evaluation of k-Means Coresets

    Coresets are among the most popular paradigms for summarizing data. In particular, many high-performance coresets for clustering problems such as k-means exist in both theory and practice. Curiously, there is no prior work comparing the quality of available k-means coresets. In this paper we perform such an evaluation. No algorithm is currently known for measuring the distortion of a candidate coreset, and we provide some evidence as to why this might be computationally difficult. To complement this, we propose a benchmark for which we argue that computing coresets is challenging and which also allows an easy (heuristic) evaluation of coresets. Using this benchmark and real-world data sets, we conduct an exhaustive evaluation of the most commonly used coreset algorithms from theory and practice.
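
    The distortion of a candidate coreset is the worst-case relative gap between the clustering cost on the full data and on the weighted summary, taken over all solutions; a heuristic evaluation replaces that intractable supremum with a finite set of candidate solutions. A minimal Python sketch of this idea (the function names and the choice of candidate centers are illustrative assumptions, not the paper's benchmark):

        import numpy as np

        def kmeans_cost(X, centers, weights=None):
            """Sum of squared distances from each (weighted) point to its nearest center."""
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
            nearest = d2.min(axis=1)
            return nearest.sum() if weights is None else (weights * nearest).sum()

        def heuristic_distortion(X, coreset, coreset_weights, candidate_center_sets):
            """Largest relative cost deviation of the coreset over the given candidates."""
            worst = 0.0
            for centers in candidate_center_sets:
                full = kmeans_cost(X, centers)
                summary = kmeans_cost(coreset, centers, coreset_weights)
                worst = max(worst, abs(summary - full) / full)
            return worst

    In practice one would populate candidate_center_sets by running k-means from many seeds and for several values of k; the reported distortion is then only a lower bound on the true worst case.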

    A Permutation-Equivariant Neural Network Architecture For Auction Design

    Designing an incentive compatible auction that maximizes expected revenue is a central problem in auction design. Theoretical approaches to the problem have hit limits in the past decades, and analytical solutions are known for only a few simple settings. Computational approaches through the use of LPs have their own set of limitations. Building on the success of deep learning, a new approach was recently proposed by Duetting et al. (2019), in which the auction is modeled by a feed-forward neural network and the design problem is framed as a learning problem. The neural architectures used in that work are general purpose and do not take advantage of any symmetries the problem may present, such as permutation equivariance. In this work, we consider auction design problems that have permutation-equivariant symmetry and construct a neural architecture that is capable of perfectly recovering the permutation-equivariant optimal mechanism, which we show is not possible with the previous architecture. We demonstrate that permutation-equivariant architectures are not only capable of recovering previous results; they also have better generalization properties.
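
    As background, a permutation-equivariant linear layer over a bidders × items bid matrix can be built from the entry itself plus its row, column, and global averages, so that permuting bidders or items permutes the output in the same way. The sketch below is a generic exchangeable-layer construction in Python, not the paper's architecture; all names and shapes are assumptions:

        import numpy as np

        def equivariant_layer(B, w):
            """Permutation-equivariant linear map on a bids matrix B (n_bidders, n_items).

            Output[i, j] mixes the entry, its row mean, its column mean, and the global
            mean, so permuting rows (bidders) or columns (items) permutes the output
            identically. w holds the five scalar parameters (including a bias).
            """
            row = B.mean(axis=1, keepdims=True)   # (n, 1): per-bidder summary
            col = B.mean(axis=0, keepdims=True)   # (1, m): per-item summary
            tot = B.mean()                        # scalar: global summary
            return w[0] * B + w[1] * row + w[2] * col + w[3] * tot + w[4]

        # Equivariance check: permuting the input permutes the output the same way.
        rng = np.random.default_rng(0)
        B, w = rng.random((4, 3)), rng.random(5)
        p, q = rng.permutation(4), rng.permutation(3)
        assert np.allclose(equivariant_layer(B[p][:, q], w),
                           equivariant_layer(B, w)[p][:, q])

    Stacking such layers with pointwise nonlinearities keeps the whole network permutation-equivariant.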

    On Generalization Bounds for Projective Clustering

    Given a set of points, clustering consists of partitioning the points into k clusters such that each point is as close as possible to the center to which it is assigned. Most commonly, centers are points themselves, which leads to the famous k-median and k-means objectives. One may also choose centers to be j-dimensional subspaces, which gives rise to subspace clustering. In this paper, we consider learning bounds for these problems. That is, given a set of n samples P drawn independently from some unknown but fixed distribution D, how quickly does a solution computed on P converge to the optimal clustering of D? We give several near-optimal results. In particular, for center-based objectives we show a convergence rate of Õ(√(k/n)). This matches the known optimal bounds of [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical Society 2016] and [Bartlett, Linder, and Lugosi, IEEE Trans. Inf. Theory 1998] for k-means and extends them to other important objectives such as k-median. For subspace clustering with j-dimensional subspaces, we show a convergence rate of Õ(√(kj²/n)). These are the first provable bounds for most of these problems. For the specific case of projective clustering, which generalizes k-means, we show that a convergence rate of Ω(√(kj/n)) is necessary, thereby proving that the bounds of [Fefferman, Mitter, and Narayanan, 2016] are essentially optimal.
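
    One way to formalize the quantity being bounded, for the center-based (k-means-style) case with bounded support; the notation below is a standard empirical-risk formulation, not necessarily the paper's:

        \[
          \mathrm{cost}_{\mathcal{D}}(C) = \mathbb{E}_{x \sim \mathcal{D}}\Big[\min_{c \in C} \lVert x - c \rVert^2\Big],
          \qquad
          \mathrm{cost}_{P}(C) = \frac{1}{n} \sum_{x \in P} \min_{c \in C} \lVert x - c \rVert^2,
        \]
        \[
          \sup_{\lvert C \rvert = k} \bigl|\, \mathrm{cost}_{P}(C) - \mathrm{cost}_{\mathcal{D}}(C) \,\bigr|
          = \tilde{O}\!\left(\sqrt{k/n}\right) \quad \text{with high probability.}
        \]

    A uniform bound of this form immediately implies that the empirically optimal centers are near-optimal for the distribution itself.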

    Learning Reserve Prices in Second-Price Auctions

    This paper proves the tight sample complexity of Second-Price Auction with Anonymous Reserve, up to a logarithmic factor, for all value distribution families that have been considered in the literature. Compared to Myerson Auction, whose sample complexity was settled very recently by Guo, Huang and Zhang (STOC 2019), Anonymous Reserve requires far fewer samples for learning. We follow a framework similar to that of the Guo-Huang-Zhang work, but replace their information-theoretic argument with a direct proof.
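
    To illustrate the learning problem (not the paper's proof technique), a minimal empirical-risk sketch in Python: given sampled bid profiles, pick the anonymous reserve that maximizes average revenue on the samples. All names are illustrative:

        import numpy as np

        def spa_revenue(bids, r):
            """Revenue of a second-price auction with anonymous reserve r on one profile."""
            b = np.sort(bids)[::-1]
            if b[0] < r:
                return 0.0                      # no sale: highest bid below the reserve
            return max(b[1], r) if len(b) > 1 else r

        def best_empirical_reserve(sampled_profiles):
            """ERM: the reserve maximizing average revenue over the samples.

            Only sample values need to be tried as candidate reserves, since the
            empirical revenue curve has its breakpoints at those values.
            """
            candidates = np.unique(np.concatenate(sampled_profiles))
            avg = lambda r: np.mean([spa_revenue(p, r) for p in sampled_profiles])
            return max(candidates, key=avg)

        rng = np.random.default_rng(1)
        profiles = [rng.uniform(0, 1, size=3) for _ in range(200)]  # 3 i.i.d. bidders
        print(best_empirical_reserve(profiles))

    The sample-complexity question the paper answers is how many such profiles are needed before the empirically best reserve is within precision ε of the optimal one.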

    Good approximate quantum LDPC codes from spacetime circuit Hamiltonians

    We study approximate quantum low-density parity-check (QLDPC) codes, which are approximate quantum error-correcting codes specified as the ground space of a frustration-free local Hamiltonian whose terms do not necessarily commute. Such codes generalize stabilizer QLDPC codes, which are exact quantum error-correcting codes with sparse, low-weight stabilizer generators (i.e. each stabilizer generator acts on a few qubits, and each qubit participates in a few stabilizer generators). Our investigation is motivated by an important question in Hamiltonian complexity and quantum coding theory: do stabilizer QLDPC codes with constant rate, linear distance, and constant-weight stabilizers exist? We show that obtaining such optimal scaling of parameters (modulo polylogarithmic corrections) is possible if we go beyond stabilizer codes: we prove the existence of a family of [[N,k,d,Δ]] approximate QLDPC codes that encode k = Ω(N) logical qubits into N physical qubits with distance d = Ω(N) and approximation infidelity Δ = 1/polylog(N). The code space is stabilized by a set of 10-local noncommuting projectors, with each physical qubit participating in only polylog(N) projectors. We prove the existence of an efficient encoding map and show that the spectral gap of the code Hamiltonian scales as Ω(N^(−3.09)). We also show that arbitrary Pauli errors can be locally detected by circuits of polylogarithmic depth. Our family of approximate QLDPC codes is based on applying a recent connection between circuit Hamiltonians and approximate quantum codes (Nirkhe et al., ICALP 2018) to a result showing that random Clifford circuits of polylogarithmic depth yield asymptotically good quantum codes (Brown and Fawzi, ISIT 2013). Then, in order to obtain a code with sparse checks and strong detection of local errors, we use a spacetime circuit-to-Hamiltonian construction to take advantage of the parallelism of the Brown-Fawzi circuits. Because of this, we call our codes spacetime codes. The analysis of the spectral gap of the code Hamiltonian is the main technical contribution of this work. We show that for any depth-D quantum circuit on n qubits there is an associated spacetime circuit-to-Hamiltonian construction with spectral gap Ω(n^(−3.09) D^(−2) log^(−6)(n)). To lower bound this gap we use a Markov chain decomposition method to divide the state space of partially completed circuit configurations into overlapping subsets corresponding to uniform circuit segments of depth log n, which are based on bitonic sorting circuits. We use the combinatorial properties of these circuit configurations to show rapid mixing between the subsets, and within the subsets we develop a novel isomorphism between the local-update Markov chain on bitonic circuit configurations and the edge-flip Markov chain on equal-area dyadic tilings, whose mixing time was recently shown to be polynomial (Cannon, Levin, and Stauffer, RANDOM 2017). Previous lower bounds on the spectral gap of spacetime circuit Hamiltonians have all been based on a connection to exactly solvable quantum spin chains and applied only to 1+1 dimensional nearest-neighbor quantum circuits with at least linear depth.
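
    For background on one combinatorial ingredient: bitonic sorting circuits are a textbook comparator-network construction whose merge stages have depth log n and whose full network has O(log² n) layers. A minimal Python sketch of the standard construction (not the paper's Hamiltonian machinery):

        import random

        def bitonic_layers(n):
            """Comparator layers of a bitonic sorting network on n = 2^t wires.

            Each layer is a list of (i, j, ascending) comparators on disjoint pairs;
            the full network has t(t+1)/2 = O(log^2 n) layers.
            """
            layers, k = [], 2
            while k <= n:                      # merge bitonic runs of length k
                j = k // 2
                while j >= 1:                  # compare at shrinking strides per stage
                    layers.append([(i, i ^ j, (i & k) == 0)
                                   for i in range(n) if (i ^ j) > i])
                    j //= 2
                k *= 2
            return layers

        def apply_network(values, layers):
            """Run the comparator network; each layer could execute in parallel."""
            v = list(values)
            for layer in layers:
                for i, j, asc in layer:
                    if (v[i] > v[j]) == asc:
                        v[i], v[j] = v[j], v[i]
            return v

        vals = [random.random() for _ in range(16)]
        net = bitonic_layers(16)
        assert apply_network(vals, net) == sorted(vals)
        assert len(net) == 10                  # t(t+1)/2 with t = log2(16) = 4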

    Learning Reserve Prices in Second-Price Auctions

    This paper proves the tight sample complexity of Second-Price Auction with Anonymous Reserve, up to a logarithmic factor, for each of the value distribution families studied in the literature: [0,1]-bounded, [1,H]-bounded, regular, and monotone hazard rate (MHR). Remarkably, the setting-specific tight sample complexity poly(ε⁻¹) depends on the precision ε ∈ (0, 1), but not on the number of bidders n ≥ 1. Further, in the two bounded-support settings, our learning algorithm allows correlated value distributions. In contrast, the tight sample complexity Θ̃(n) · poly(ε⁻¹) of Myerson Auction proved by Guo, Huang and Zhang (STOC 2019) has a nearly-linear dependence on n ≥ 1, and holds only for independent value distributions in every setting. We follow a framework similar to that of the Guo-Huang-Zhang work, but replace their information-theoretic arguments with a direct proof.

    Submodular Maximization subject to a Knapsack Constraint: Combinatorial Algorithms with Near-optimal Adaptive Complexity

    The growing need to deal with massive instances motivates the design of algorithms balancing the quality of the solution with applicability. For the latter, an important measure is the adaptive complexity, capturing the number of sequential rounds of parallel computation needed. In this work we obtain the first constant-factor approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with near-optimal O(log n) adaptive complexity. Low adaptivity by itself, however, is not enough: one needs to account for the total number of function evaluations (or value queries) as well. Our algorithm asks Õ(n²) value queries, but can be modified to run with only Õ(n) instead, while retaining a low adaptive complexity of O(log² n). Besides the above improvement in adaptivity, this is also the first combinatorial approach with sublinear adaptive complexity for the problem, and it yields algorithms comparable to the state-of-the-art even for the special cases of cardinality constraints or monotone objectives. Finally, we showcase our algorithms' applicability on real-world datasets.
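
    For contrast with the paper's low-adaptivity algorithm (not reproduced here), the classical fully sequential baseline is density greedy: repeatedly add the affordable element with the best marginal gain per unit cost, then return the better of the greedy set and the best single element. A minimal Python sketch, assuming a monotone objective (the non-monotone case the paper handles needs more machinery):

        def density_greedy(ground, f, cost, budget):
            """Density greedy for monotone submodular f under a knapsack constraint.

            Adds the affordable element with the best marginal-gain-per-cost ratio;
            returning the better of the greedy set and the best single element
            recovers a constant-factor guarantee for monotone objectives.
            """
            S, spent = set(), 0.0
            while True:
                base = f(S)
                best, best_ratio = None, 0.0
                for e in ground - S:
                    if spent + cost[e] <= budget:
                        ratio = (f(S | {e}) - base) / cost[e]
                        if ratio > best_ratio:
                            best, best_ratio = e, ratio
                if best is None:
                    break
                S.add(best); spent += cost[best]
            singles = [{e} for e in ground if cost[e] <= budget]
            return max([S] + singles, key=f)

        # Toy usage: weighted coverage, a monotone submodular objective.
        points = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
        f = lambda S: len(set().union(*(points[e] for e in S))) if S else 0
        print(density_greedy(set(points), f, {1: 1.0, 2: 1.0, 3: 2.0}, budget=2.0))

    Note that every greedy step here is a full sequential round, so the adaptive complexity is Θ(n) in the worst case; cutting it to O(log n) rounds is precisely the paper's contribution.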

    A strongly polynomial algorithm for linear exchange markets

    We present a strongly polynomial algorithm for computing an equilibrium in Arrow-Debreu exchange markets with linear utilities. Our algorithm is based on a variant of the weakly polynomial Duan–Mehlhorn (DM) algorithm. We use the DM algorithm as a subroutine to identify revealed edges, i.e., pairs of agents and goods that must correspond to best bang-per-buck transactions in every equilibrium solution. Every time a new revealed edge is found, we use another subroutine that decides whether there is an optimal solution using the current set of revealed edges or, if none exists, finds the solution that approximately minimizes the violation of the demand and supply constraints. This task can be reduced to solving a linear program (LP). Even though we are unable to solve this LP in strongly polynomial time, we show that it can be approximated by a simpler LP with two variables per inequality that is solvable in strongly polynomial time.
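
    As background on the final step: the best-known special case of LPs with two variables per inequality is a system of difference constraints x_u − x_v ≤ c, whose feasibility Bellman–Ford decides in strongly polynomial time (general two-variable-per-inequality LPs need more machinery, e.g. Megiddo's algorithm). A minimal Python sketch of that standard reduction, not the paper's LP:

        def difference_constraints_feasible(n, constraints):
            """Feasibility of {x_u - x_v <= c} via Bellman-Ford on the constraint graph.

            Each constraint becomes an edge v -> u of weight c; the system is feasible
            iff the graph has no negative cycle, and distances from a virtual source
            (all initialized to 0) then give a feasible assignment.
            """
            dist = [0.0] * n
            for _ in range(n):                   # n relaxation rounds suffice
                changed = False
                for u, v, c in constraints:      # x_u - x_v <= c  ~  edge v -> u, weight c
                    if dist[v] + c < dist[u]:
                        dist[u] = dist[v] + c
                        changed = True
                if not changed:
                    return dist                  # a feasible assignment
            return None                          # negative cycle: infeasible

        # x0 - x1 <= 1, x1 - x2 <= -2, x2 - x0 <= 2  (feasible)
        print(difference_constraints_feasible(3, [(0, 1, 1), (1, 2, -2), (2, 0, 2)]))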