
    Strongly Refuting Random CSPs Below the Spectral Threshold

    Random constraint satisfaction problems (CSPs) are known to exhibit threshold phenomena: given a uniformly random instance of a CSP with $n$ variables and $m$ clauses, there is a value of $m = \Omega(n)$ beyond which the CSP will be unsatisfiable with high probability. Strong refutation is the problem of certifying that no variable assignment satisfies more than a constant fraction of clauses; this is the natural algorithmic problem in the unsatisfiable regime (when $m/n = \omega(1)$). Intuitively, strong refutation should become easier as the clause density $m/n$ grows, because the contradictions introduced by the random clauses become more locally apparent. For CSPs such as $k$-SAT and $k$-XOR, there is a long-standing gap between the clause density at which efficient strong refutation algorithms are known, $m/n \ge \widetilde{O}(n^{k/2-1})$, and the clause density at which instances become unsatisfiable with high probability, $m/n = \omega(1)$. In this paper, we give spectral and sum-of-squares algorithms for strongly refuting random $k$-XOR instances with clause density $m/n \ge \widetilde{O}(n^{(k/2-1)(1-\delta)})$ in time $\exp(\widetilde{O}(n^{\delta}))$ or in $\widetilde{O}(n^{\delta})$ rounds of the sum-of-squares hierarchy, for any $\delta \in [0,1)$ and any integer $k \ge 3$. Our algorithms provide a smooth transition between the clause density at which polynomial-time algorithms are known, at $\delta = 0$, and brute-force refutation at the satisfiability threshold, when $\delta = 1$. We also leverage our $k$-XOR results to obtain strong refutation algorithms for SAT (or any other Boolean CSP) at similar clause densities. Our algorithms match the known sum-of-squares lower bounds due to Grigoriev and Schoenebeck, up to logarithmic factors. Additionally, we extend our techniques to give new results for certifying upper bounds on the injective tensor norm of random tensors.
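
    As a concrete illustration of the spectral approach, here is a minimal numpy sketch of the classical flattening certificate for 4-XOR, i.e., the even-$k$, polynomial-time ($\delta = 0$) regime; it is our illustration, not the paper's subexponential algorithm, and all names are ours.

        import numpy as np

        def refute_4xor(clauses, signs, n):
            # Spectral strong-refutation sketch for random 4-XOR (illustrative only).
            # clauses: (m, 4) array of variable indices; signs: m values in {-1, +1}.
            # Flatten the clause tensor into an n^2 x n^2 matrix M. For any
            # x in {-1, +1}^n and z = kron(x, x), we have
            #   sum_c b_c x_i x_j x_k x_l = z^T M z <= ||M||_2 * n^2,
            # so every assignment satisfies at most the fraction returned below.
            m = len(signs)
            M = np.zeros((n * n, n * n))
            for (i, j, k, l), b in zip(clauses, signs):
                M[i * n + j, k * n + l] += b
            return 0.5 + np.linalg.norm(M, 2) * n * n / (2.0 * m)

        # Toy usage: a dense random instance is certified to have value near 1/2.
        rng = np.random.default_rng(0)
        n, m = 20, 40000
        print(refute_4xor(rng.integers(0, n, size=(m, 4)), rng.choice([-1, 1], size=m), n))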

    Recovery Guarantees for Quadratic Tensors with Limited Observations

    We consider the tensor completion problem of predicting the missing entries of a tensor. The commonly used CP model has a triple product form, but an alternative family of quadratic models, which are sums of pairwise products instead of a triple product, has emerged from applications such as recommendation systems. Non-convex methods are the methods of choice for learning quadratic models, and this work examines their sample complexity and error guarantees. Our main result is that, with a number of samples only linear in the dimension, all local minima of the mean squared error objective are global minima and recover the original tensor accurately. The techniques lead to simple proofs showing that convex relaxation can also recover quadratic tensors given a linear number of samples. We substantiate our theoretical results with experiments on synthetic and real-world data, showing that quadratic models outperform CP models in scenarios where only a limited number of observations are available.
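
    To make the quadratic model concrete, here is a minimal sketch of the kind of non-convex method the result analyzes: stochastic gradient descent on the squared error of the pairwise model $T_{ijk} \approx \langle u_i, v_j \rangle + \langle u_i, w_k \rangle + \langle v_j, w_k \rangle$ over observed entries. All names and hyperparameters are illustrative, not the authors' code.

        import numpy as np

        def fit_quadratic_tensor(obs, dims, r=4, lr=0.01, epochs=300, seed=0):
            # obs: list of ((i, j, k), value) observed entries.
            # Model: T[i, j, k] ~ U[i]·V[j] + U[i]·W[k] + V[j]·W[k].
            n1, n2, n3 = dims
            rng = np.random.default_rng(seed)
            U = rng.normal(scale=0.1, size=(n1, r))
            V = rng.normal(scale=0.1, size=(n2, r))
            W = rng.normal(scale=0.1, size=(n3, r))
            for _ in range(epochs):
                for (i, j, k), t in obs:
                    pred = U[i] @ V[j] + U[i] @ W[k] + V[j] @ W[k]
                    g = 2.0 * (pred - t)  # d(squared error)/d(pred)
                    gU = g * (V[j] + W[k])
                    gV = g * (U[i] + W[k])
                    gW = g * (U[i] + V[j])
                    U[i] -= lr * gU
                    V[j] -= lr * gV
                    W[k] -= lr * gW
            return U, V, W

    In the regime of the abstract, a number of observed entries linear in the dimension already suffices for all local minima of this objective to be global and to recover the tensor.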

    Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints

    We study online prediction where the regret of the algorithm is measured against a benchmark defined via evolving constraints. This framework captures online prediction on graphs, as well as other prediction problems with combinatorial structure. A key aspect here is that finding the optimal benchmark predictor (even in hindsight, given all the data) might be computationally hard due to the combinatorial nature of the constraints. Despite this, we provide polynomial-time \emph{prediction} algorithms that achieve low regret against combinatorial benchmark sets. We do so by building improper learning algorithms based on two ideas that work together. The first is to alleviate part of the computational burden through random playout, and the second is to employ Lasserre semidefinite hierarchies to approximate the resulting integer program. Interestingly, for our prediction algorithms, we only need to compute the values of the semidefinite programs, not the rounded solutions. However, the integrality gap of the Lasserre hierarchy \emph{does} enter the generic regret bound in terms of the Rademacher complexity of the benchmark set. This establishes a trade-off between the computation time and the regret bound of the algorithm.
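
    The following toy sketch shows the overall shape of such a predictor: random playout over future rounds combined with a relaxation oracle. The oracle below is a crude stand-in for the value of the Lasserre semidefinite program the paper uses (only the SDP value is needed, never a rounded solution); all names are ours.

        import numpy as np

        def predict(past, horizon, relaxation_value, n_playouts=64, seed=None):
            # Score each candidate outcome of the current round by the relaxed
            # benchmark value on (past, candidate, random future signs), then
            # average over playouts to get a prediction in [-1, 1].
            rng = np.random.default_rng(seed)
            diffs = []
            for _ in range(n_playouts):
                eps = rng.choice([-1.0, 1.0], size=horizon)
                v_plus = relaxation_value(np.concatenate([past, [+1.0], eps]))
                v_minus = relaxation_value(np.concatenate([past, [-1.0], eps]))
                diffs.append((v_minus - v_plus) / 2.0)
            return float(np.clip(np.mean(diffs), -1.0, 1.0))

        # Crude stand-in oracle: best correlation with a tiny "benchmark set".
        patterns = [np.ones(8), np.array([1.0, -1.0] * 4)]
        oracle = lambda s: max(float(p[:len(s)] @ s) for p in patterns)
        print(predict(np.array([1.0, -1.0, 1.0]), horizon=4, relaxation_value=oracle))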

    CrypTen: Secure Multi-Party Computation Meets Machine Learning

    Secure multi-party computation (MPC) allows parties to perform computations on data while keeping that data private. This capability has great potential for machine-learning applications: it facilitates training of machine-learning models on private data sets owned by different parties, evaluation of one party's private model using another party's private data, etc. Although a range of studies implement machine-learning models via secure MPC, such implementations are not yet mainstream. Adoption of secure MPC is hampered by the absence of flexible software frameworks that "speak the language" of machine-learning researchers and engineers. To foster adoption of secure MPC in machine learning, we present CrypTen: a software framework that exposes popular secure MPC primitives via abstractions that are common in modern machine-learning frameworks, such as tensor computations, automatic differentiation, and modular neural networks. This paper describes the design of CrypTen and measures its performance on state-of-the-art models for text classification, speech recognition, and image classification. Our benchmarks show that CrypTen's GPU support and high-performance communication between (an arbitrary number of) parties allow it to perform efficient private evaluation of modern machine-learning models under a semi-honest threat model. For example, two parties using CrypTen can securely predict phonemes in speech recordings using Wav2Letter faster than real-time. We hope that CrypTen will spur adoption of secure MPC in the machine-learning community.
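
    For flavor, here is a minimal single-process sketch using CrypTen's documented tensor API (in a real deployment, each party would run its own process and hold only shares of the data):

        import torch
        import crypten

        crypten.init()                      # set up the communicator

        x = torch.tensor([1.0, 2.0, 3.0])
        x_enc = crypten.cryptensor(x)       # secret-share the tensor
        y_enc = x_enc * x_enc + 2.0         # arithmetic directly on the shares
        print(y_enc.get_plain_text())       # decrypt: approximately [3., 6., 11.]

    The values are approximate because CrypTen encodes floating-point tensors in fixed point before secret sharing.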

    A Hypercontractive Inequality for Matrix-Valued Functions with Applications to Quantum Computing and LDCs

    The Bonami-Beckner hypercontractive inequality is a powerful tool in Fourier analysis of real-valued functions on the Boolean cube. In this paper we present a version of this inequality for matrix-valued functions on the Boolean cube. Its proof is based on a powerful inequality by Ball, Carlen, and Lieb. We also present a number of applications. First, we analyze maps that encode $n$ classical bits into $m$ qubits, in such a way that each set of $k$ bits can be recovered with some probability by an appropriate measurement on the quantum encoding; we show that if $m < 0.7n$, then the success probability is exponentially small in $k$. This result may be viewed as a direct product version of Nayak's quantum random access code bound. It in turn implies strong direct product theorems for the one-way quantum communication complexity of Disjointness and other problems. Second, we prove that error-correcting codes that are locally decodable with 2 queries require length exponential in the length of the encoded string. This gives what is arguably the first ``non-quantum'' proof of a result originally derived by Kerenidis and de Wolf using quantum information theory, and answers a question by Trevisan. Comment: This is the full version of a paper that will appear in the proceedings of the IEEE FOCS 2008 conference.
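
    For reference, the matrix-valued inequality at the heart of the paper has the following shape (stated from memory, up to normalization conventions): for every $f:\{-1,1\}^n \to \mathbb{C}^{d \times d}$ with Fourier expansion $f = \sum_S \hat{f}(S)\chi_S$, every $1 \le p \le 2$, and $\|\cdot\|_p$ the Schatten $p$-norm,

        \left( \sum_{S \subseteq [n]} (p-1)^{|S|} \bigl\| \hat{f}(S) \bigr\|_p^p \right)^{1/p}
        \le \left( \frac{1}{2^n} \sum_{x \in \{-1,1\}^n} \bigl\| f(x) \bigr\|_p^p \right)^{1/p}.

    For $d = 1$ this reduces to a standard scalar form of the Bonami-Beckner inequality.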

    Decomposing Overcomplete 3rd Order Tensors using Sum-of-Squares Algorithms

    Tensor rank and low-rank tensor decompositions have many applications in learning and complexity theory. Most known algorithms use unfoldings of tensors and can only handle rank up to $n^{\lfloor p/2 \rfloor}$ for a $p$-th order tensor in $\mathbb{R}^{n^p}$. Previously, no efficient algorithm could decompose 3rd order tensors when the rank is super-linear in the dimension. Using ideas from the sum-of-squares hierarchy, we give the first quasi-polynomial time algorithm that can decompose a random 3rd order tensor when the rank is as large as $n^{3/2}/\mathrm{polylog}\, n$. We also give a polynomial time algorithm for certifying the injective norm of random low rank tensors. Our tensor decomposition algorithm exploits the relationship between the injective norm and the tensor components. The proof relies on interesting tools for decoupling random variables to prove better matrix concentration bounds, which can be useful in other settings.
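
    As background for the certification problem, the injective norm of a 3rd order tensor is always upper-bounded by the spectral norm of any unfolding; the sketch below (ours, not the paper's algorithm) shows this basic flattening bound, which the paper's sum-of-squares certificates improve on for random low rank tensors.

        import numpy as np

        def injective_norm_upper_bound(T):
            # For T in R^{n x n x n}, ||T||_inj = max <T, x⊗y⊗z> over unit x, y, z.
            # Since <T, x⊗y⊗z> = x^T T_(1) (y⊗z) and ||y⊗z|| = 1, the spectral
            # norm of the n x n^2 mode-1 unfolding T_(1) certifies an upper bound.
            n = T.shape[0]
            return np.linalg.norm(T.reshape(n, n * n), 2)

        # For a Gaussian tensor this certifies ~n although the true injective
        # norm is ~sqrt(n): the kind of gap tighter certificates narrow.
        T = np.random.default_rng(0).normal(size=(50, 50, 50))
        print(injective_norm_upper_bound(T))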