    A Nearly Tight Sum-of-Squares Lower Bound for the Planted Clique Problem

    We prove that with high probability over the choice of a random graph $G$ from the Erd\H{o}s-R\'enyi distribution $G(n,1/2)$, the $n^{O(d)}$-time degree $d$ Sum-of-Squares semidefinite programming relaxation for the clique problem will give a value of at least $n^{1/2-c(d/\log n)^{1/2}}$ for some constant $c>0$. This yields a nearly tight $n^{1/2 - o(1)}$ bound on the value of this program for any degree $d = o(\log n)$. Moreover, we introduce a new framework that we call \emph{pseudo-calibration} to construct Sum-of-Squares lower bounds. This framework is inspired by taking a computational analog of Bayesian probability theory. It yields a general recipe for constructing good pseudo-distributions (i.e., dual certificates for the Sum-of-Squares semidefinite program), and sheds further light on the ways in which this hierarchy differs from others. Comment: 55 pages
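
The step from the general bound to the $n^{1/2 - o(1)}$ statement can be spelled out as a short calculation. This is only a sketch of the arithmetic implied by the abstract; the constant $c$ is the unspecified one from the bound above.

```latex
\[
  d = o(\log n)
  \;\Longrightarrow\;
  \frac{d}{\log n} \to 0
  \;\Longrightarrow\;
  c\left(\frac{d}{\log n}\right)^{1/2} = o(1)
  \;\Longrightarrow\;
  n^{1/2 - c(d/\log n)^{1/2}} = n^{1/2 - o(1)}.
\]
% Equivalently, the multiplicative loss relative to n^{1/2} is
\[
  n^{-c(d/\log n)^{1/2}}
  = \exp\!\left(-c\,(d/\log n)^{1/2}\,\log n\right)
  = \exp\!\left(-c\sqrt{d \log n}\right).
\]
```

Written in the second form, the loss is subpolynomial in $n$ exactly when $d = o(\log n)$, which matches the degree range stated in the abstract.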

    Fast, Sample-Efficient, Affine-Invariant Private Mean and Covariance Estimation for Subgaussian Distributions

    We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation with nearly optimal sample complexity. Only exponential-time estimators were previously known to achieve this guarantee. Given $n$ samples from a (sub-)Gaussian distribution with unknown mean $\mu$ and covariance $\Sigma$, our $(\varepsilon,\delta)$-differentially private estimator produces $\tilde{\mu}$ such that $\|\mu - \tilde{\mu}\|_{\Sigma} \leq \alpha$ as long as $n \gtrsim \tfrac{d}{\alpha^2} + \tfrac{d \sqrt{\log 1/\delta}}{\alpha \varepsilon} + \tfrac{d \log 1/\delta}{\varepsilon}$. The Mahalanobis error metric $\|\mu - \hat{\mu}\|_{\Sigma}$ measures the distance between $\hat{\mu}$ and $\mu$ relative to $\Sigma$; it characterizes the error of the sample mean. Our algorithm runs in time $\tilde{O}(nd^{\omega - 1} + nd/\varepsilon)$, where $\omega < 2.38$ is the matrix multiplication exponent. We adapt an exponential-time approach of Brown, Gaboardi, Smith, Ullman, and Zakynthinou (2021), giving efficient variants of stable mean and covariance estimation subroutines that also improve the sample complexity to the nearly optimal bound above. Our stable covariance estimator can be turned into a private covariance estimator for unrestricted subgaussian distributions. With $n \gtrsim d^{3/2}$ samples, our estimate is accurate in spectral norm. This is the first such algorithm using $n = o(d^2)$ samples, answering an open question posed by Alabi et al. (2022). With $n \gtrsim d^2$ samples, our estimate is accurate in Frobenius norm. This leads to a fast, nearly optimal algorithm for private learning of unrestricted Gaussian distributions in TV distance. Duchi, Haque, and Kuditipudi (2023) obtained similar results independently and concurrently. Comment: 44 pages. New version fixes typos and includes additional exposition and discussion of related work
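
To make the error metric concrete, the following is a minimal sketch of the Mahalanobis error and of the right-hand side of the sample-complexity bound. It is not the paper's private estimator: it uses the plain, non-private sample mean. The helper names `mahalanobis_error` and `sample_bound`, and all numeric parameters, are assumptions chosen for illustration. The hidden constants in $\gtrsim$ are dropped.

```python
import numpy as np


def mahalanobis_error(mu, mu_hat, sigma):
    """Return ||mu - mu_hat||_Sigma = sqrt((mu - mu_hat)^T Sigma^{-1} (mu - mu_hat))."""
    diff = np.asarray(mu, dtype=float) - np.asarray(mu_hat, dtype=float)
    # Solve Sigma x = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))


def sample_bound(d, alpha, eps, delta):
    """Right-hand side of the sample bound, with hidden constants dropped (illustrative only)."""
    log_term = np.log(1.0 / delta)
    return d / alpha**2 + d * np.sqrt(log_term) / (alpha * eps) + d * log_term / eps


rng = np.random.default_rng(0)
d, n = 5, 2000

# Hypothetical ground truth: a random mean and a positive definite covariance.
mu = rng.normal(size=d)
B = rng.normal(size=(d, d))
sigma = B @ B.T + np.eye(d)

samples = rng.multivariate_normal(mu, sigma, size=n)
mu_hat = samples.mean(axis=0)  # non-private sample mean
err = mahalanobis_error(mu, mu_hat, sigma)

# Affine invariance: mapping x -> A x + b leaves the Mahalanobis error unchanged.
A = rng.normal(size=(d, d)) + 3.0 * np.eye(d)
b = rng.normal(size=d)
err_transformed = mahalanobis_error(A @ mu + b, A @ mu_hat + b, A @ sigma @ A.T)

print(f"Mahalanobis error of sample mean: {err:.4f}")
print(f"Same error after affine map:      {err_transformed:.4f}")
print(f"Bound RHS (constants dropped):    {sample_bound(d, 0.1, 1.0, 1e-6):.1f}")
```

The two printed errors agree up to floating-point error, which is the sense in which the metric, and any estimator judged by it, is affine-invariant.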

    The power of sum-of-squares for detecting hidden structures

    We study planted problems (finding hidden structures in random noisy inputs) through the lens of the sum-of-squares semidefinite programming hierarchy (SoS). This family of powerful semidefinite programs has recently yielded many new algorithms for planted problems, often achieving the best known polynomial-time guarantees in terms of accuracy of recovered solutions and robustness to noise. One theme in recent work is the design of spectral algorithms which match the guarantees of SoS algorithms for planted problems. Classical spectral algorithms are often unable to accomplish this: the twist in these new spectral algorithms is the use of spectral structure of matrices whose entries are low-degree polynomials of the input variables. We prove that for a wide class of planted problems, including refuting random constraint satisfaction problems, tensor and sparse PCA, densest-k-subgraph, community detection in stochastic block models, planted clique, and others, eigenvalues of degree-d matrix polynomials are as powerful as SoS semidefinite programs of roughly degree d. For such problems it is therefore always possible to match the guarantees of SoS without solving a large semidefinite program. Using related ideas on SoS algorithms and low-degree matrix polynomials (and inspired by recent work on SoS and the planted clique problem by Barak et al.), we prove new nearly-tight SoS lower bounds for the tensor and sparse principal component analysis problems. Our lower bounds for sparse principal component analysis are the first to suggest that going beyond existing algorithms for this problem may require sub-exponential time.
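
As a rough illustration of what "eigenvalues of matrix polynomials" means in the simplest case, the sketch below runs the classical degree-1 spectral test for planted clique. It is not the construction from the paper, and the abstract itself notes that such classical tests often fall short of SoS guarantees. The helper names and the sizes `n = 400`, `k = 100` are assumptions chosen so that the clique is far above the $\sqrt{n}$ scale.

```python
import numpy as np


def signed_adjacency(n, rng, clique=None):
    """Symmetric +/-1 adjacency matrix of G(n, 1/2) with zero diagonal.

    If `clique` (a list of vertex indices) is given, all pairs inside it
    are set to +1, i.e. a clique is planted on those vertices.
    """
    upper = np.triu(rng.choice([-1.0, 1.0], size=(n, n)), k=1)
    a = upper + upper.T
    if clique is not None:
        idx = np.asarray(clique)
        a[np.ix_(idx, idx)] = 1.0  # connect every pair inside the clique
        a[idx, idx] = 0.0          # restore the zero diagonal
    return a


def top_eigenvalue(a):
    """Largest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return float(np.linalg.eigvalsh(a)[-1])


rng = np.random.default_rng(1)
n, k = 400, 100  # hypothetical sizes for the demonstration

lam_null = top_eigenvalue(signed_adjacency(n, rng))
lam_planted = top_eigenvalue(signed_adjacency(n, rng, clique=list(range(k))))

print(f"top eigenvalue, no clique:      {lam_null:.1f}")
print(f"top eigenvalue, planted clique: {lam_planted:.1f}")
```

Each matrix entry here is a degree-1 polynomial of the edge indicators. Without a clique the top eigenvalue concentrates near $2\sqrt{n}$, while the indicator vector of a planted $k$-clique forces it to be at least $k - 1$, so the two cases separate when $k$ is well above $2\sqrt{n}$. Higher-degree matrix polynomials, as studied in the abstract, generalize this kind of test.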