64 research outputs found
Strongly Refuting Random CSPs Below the Spectral Threshold
Random constraint satisfaction problems (CSPs) are known to exhibit threshold
phenomena: given a uniformly random instance of a CSP with variables and
clauses, there is a value of beyond which the CSP will be
unsatisfiable with high probability. Strong refutation is the problem of
certifying that no variable assignment satisfies more than a constant fraction
of clauses; this is the natural algorithmic problem in the unsatisfiable regime
(when ).
Intuitively, strong refutation should become easier as the clause density
grows, because the contradictions introduced by the random clauses become
more locally apparent. For CSPs such as k-SAT and k-XOR, there is a
long-standing gap between the clause density at which efficient strong
refutation algorithms are known, , and the
clause density at which instances become unsatisfiable with high probability,
.
In this paper, we give spectral and sum-of-squares algorithms for strongly
refuting random k-XOR instances with clause density in time or in
rounds of the sum-of-squares hierarchy, for any
and any integer . Our algorithms provide a smooth
transition between the clause density at which polynomial-time algorithms are
known at , and brute-force refutation at the satisfiability
threshold when . We also leverage our k-XOR results to obtain
strong refutation algorithms for SAT (or any other Boolean CSP) at similar
clause densities. Our algorithms match the known sum-of-squares lower bounds
due to Grigoriev and Schonebeck, up to logarithmic factors.
Additionally, we extend our techniques to give new results for certifying
upper bounds on the injective tensor norm of random tensors
Recovery Guarantees for Quadratic Tensors with Limited Observations
We consider the tensor completion problem of predicting the missing entries
of a tensor. The commonly used CP model has a triple product form, but an
alternate family of quadratic models, which are sums of pairwise products
instead of a triple product, has emerged from applications such as
recommendation systems. Non-convex methods are the method of choice for
learning quadratic models, and this work examines their sample complexity and
error guarantee. Our main result is that with the number of samples being only
linear in the dimension, all local minima of the mean squared error objective
are global minima and recover the original tensor accurately. The techniques
lead to simple proofs showing that convex relaxation can recover quadratic
tensors given a linear number of samples. We substantiate our theoretical
results with experiments on synthetic and real-world data, showing that
quadratic models have better performance than CP models in scenarios where
only a limited number of observations is available
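As a rough illustration of the contrast drawn in this abstract, the sketch below computes one predicted tensor entry under each model. It assumes each index along each of the three modes has a factor vector, and that the quadratic model sums the three pairwise inner products; the function names and the exact form of the quadratic model are assumptions for illustration and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cp_entry(x, y, z):
    """CP model (triple product form): sum over r of x[r] * y[r] * z[r]."""
    return sum(a * b * c for a, b, c in zip(x, y, z))


def quadratic_entry(x, y, z):
    """Quadratic model (assumed form): sum of the three pairwise inner products."""
    return dot(x, y) + dot(y, z) + dot(x, z)


# Hypothetical factor vectors for one entry (i, j, k) of the tensor.
x_i, y_j, z_k = [1, 2], [3, 4], [5, 6]
cp_value = cp_entry(x_i, y_j, z_k)
quadratic_value = quadratic_entry(x_i, y_j, z_k)
```

With these hypothetical vectors the CP entry is 1*3*5 + 2*4*6 and the quadratic entry is the sum of the three inner products, so the two models generally predict different values from the same factors.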
Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints
We study online prediction where regret of the algorithm is measured against
a benchmark defined via evolving constraints. This framework captures online
prediction on graphs, as well as other prediction problems with combinatorial
structure. A key aspect here is that finding the optimal benchmark predictor
(even in hindsight, given all the data) might be computationally hard due to
the combinatorial nature of the constraints. Despite this, we provide
polynomial-time prediction algorithms that achieve low regret against
combinatorial benchmark sets. We do so by building improper learning algorithms
based on two ideas that work together. The first is to alleviate part of the
computational burden through random playout, and the second is to employ
Lasserre semidefinite hierarchies to approximate the resulting integer program.
Interestingly, for our prediction algorithms, we only need to compute the
values of the semidefinite programs and not the rounded solutions. However, the
integrality gap for the Lasserre hierarchy does enter the generic regret
bound in terms of Rademacher complexity of the benchmark set. This establishes
a trade-off between the computation time and the regret bound of the algorithm
CrypTen: Secure Multi-Party Computation Meets Machine Learning
Secure multi-party computation (MPC) allows parties to perform computations
on data while keeping that data private. This capability has great potential
for machine-learning applications: it facilitates training of machine-learning
models on private data sets owned by different parties, evaluation of one
party's private model using another party's private data, etc. Although a range
of studies implement machine-learning models via secure MPC, such
implementations are not yet mainstream. Adoption of secure MPC is hampered by
the absence of flexible software frameworks that "speak the language" of
machine-learning researchers and engineers. To foster adoption of secure MPC in
machine learning, we present CrypTen: a software framework that exposes popular
secure MPC primitives via abstractions that are common in modern
machine-learning frameworks, such as tensor computations, automatic
differentiation, and modular neural networks. This paper describes the design
of CrypTen and measures its performance on state-of-the-art models for text
classification, speech recognition, and image classification. Our benchmarks
show that CrypTen's GPU support and high-performance communication between (an
arbitrary number of) parties allow it to perform efficient private evaluation
of modern machine-learning models under a semi-honest threat model. For
example, two parties using CrypTen can securely predict phonemes in speech
recordings using Wav2Letter faster than real-time. We hope that CrypTen will
spur adoption of secure MPC in the machine-learning community
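To make the idea of computing on private data more concrete, here is a minimal sketch of additive secret sharing, one standard secure MPC building block. The abstract does not say which primitives CrypTen uses, and this is not CrypTen's API: the modulus and the names `share`, `reconstruct`, and `add_shared` are assumptions chosen for illustration.

```python
import secrets

MODULUS = 2 ** 64  # assumed ring size, for illustration only


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares mod MODULUS."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


# Two private inputs, each shared among three parties.
a_shares = share(10, 3)
b_shares = share(32, 3)
total = reconstruct(add_shared(a_shares, b_shares))
```

Any proper subset of the shares is uniformly random and reveals nothing about the secret on its own; only the sum of all shares does. Addition works share by share, while multiplication needs extra protocol machinery that this sketch leaves out.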
A Hypercontractive Inequality for Matrix-Valued Functions with Applications to Quantum Computing and LDCs
The Bonami-Beckner hypercontractive inequality is a powerful tool in Fourier
analysis of real-valued functions on the Boolean cube. In this paper we present
a version of this inequality for matrix-valued functions on the Boolean cube.
Its proof is based on a powerful inequality by Ball, Carlen, and Lieb. We also
present a number of applications. First, we analyze maps that encode
classical bits into qubits, in such a way that each set of bits can be
recovered with some probability by an appropriate measurement on the quantum
encoding; we show that if , then the success probability is
exponentially small in . This result may be viewed as a direct product
version of Nayak's quantum random access code bound. It in turn implies strong
direct product theorems for the one-way quantum communication complexity of
Disjointness and other problems. Second, we prove that error-correcting codes
that are locally decodable with 2 queries require length exponential in the
length of the encoded string. This gives what is arguably the first
"non-quantum" proof of a result originally derived by Kerenidis and de Wolf
using quantum information theory, and answers a question by Trevisan.
Comment: This is the full version of a paper that will appear in the
proceedings of the IEEE FOCS 08 conference
Decomposing Overcomplete 3rd Order Tensors using Sum-of-Squares Algorithms
Tensor rank and low-rank tensor decompositions have many applications in
learning and complexity theory. Most known algorithms use unfoldings of tensors
and can only handle rank up to for a -th order
tensor in . Previously no efficient algorithm could decompose
3rd order tensors when the rank is super-linear in the dimension. Using ideas
from sum-of-squares hierarchy, we give the first quasi-polynomial time
algorithm that can decompose a random 3rd order tensor when the
rank is as large as .
We also give a polynomial time algorithm for certifying the injective norm of
random low rank tensors. Our tensor decomposition algorithm exploits the
relationship between injective norm and the tensor components. The proof relies
on interesting tools for decoupling random variables to prove better matrix
concentration bounds, which can be useful in other settings