    A new upper bound for 3-SAT

    We show that a randomly chosen 3-CNF formula over n variables with a clauses-to-variables ratio of at least 4.4898 is asymptotically almost surely unsatisfiable as n grows large. The previous best such bound, due to Dubois in 1999, was 4.506. The first such bound, discovered independently by several groups of researchers since 1983, was 5.19. Several decreasing values between 5.19 and 4.506 were published in the intervening years. We believe the probabilistic techniques used in the proof are of independent interest. Comment: 20 pages
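
The 5.19 figure quoted above agrees with the standard first-moment calculation: a fixed assignment satisfies a uniformly random 3-clause with probability 7/8, so a formula with rn clauses has 2^n (7/8)^(rn) satisfying assignments in expectation, which tends to zero once r > ln 2 / ln(8/7). The following is a minimal sketch of that arithmetic only. The function names are illustrative and do not come from the paper, and the sketch does not reproduce the refined argument behind 4.4898.

```python
import math


def first_moment_threshold(k: int = 3) -> float:
    """Ratio r above which 2^n * (1 - 2^-k)^(r*n) tends to zero."""
    return math.log(2) / -math.log(1 - 2.0 ** -k)


def log2_expected_solutions(n: int, ratio: float, k: int = 3) -> float:
    """log2 of the expected number of satisfying assignments of a random
    k-CNF formula with n variables and ratio * n clauses."""
    return n * (1 + ratio * math.log2(1 - 2.0 ** -k))
```

At ratio 4.4898 the expected number of satisfying assignments is still exponentially large, so the plain first-moment argument cannot reach the new bound.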

    A Quantum Lovasz Local Lemma

    The Lovasz Local Lemma (LLL) is a powerful tool in probability theory for showing the existence of combinatorial objects meeting a prescribed collection of "weakly dependent" criteria. We show that the LLL extends to a much more general geometric setting, where events are replaced with subspaces and probability is replaced with relative dimension, which allows us to lower bound the dimension of the intersection of vector spaces under certain independence conditions. Our result immediately applies to the k-QSAT problem: for instance, we show that any collection of rank 1 projectors with the property that each qubit appears in at most $2^k/(e \cdot k)$ of them has a joint satisfiable state. We then apply our results to the recently studied model of random k-QSAT. Recent works have shown that the satisfiable region extends up to a density of 1 in the large k limit, where the density is the ratio of projectors to qubits. Using a hybrid approach building on work by Laumann et al., we greatly extend the known satisfiable region for random k-QSAT to a density of $\Omega(2^k/k^2)$. Since our tool allows us to show the existence of joint satisfying states without the need to construct them, we are able to penetrate into regions where the satisfying states are conjectured to be entangled; the need for explicit constructions has limited previous approaches to product states. Comment: 19 pages
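
For intuition about the $2^k/(e \cdot k)$ bound, one can check the classical symmetric Local Lemma condition $e \cdot p \cdot (d+1) \le 1$ with $p$ read as a relative dimension. The sketch below assumes that a rank 1 projector on k qubits excludes a subspace of relative dimension $2^{-k}$, and that a projector shares a qubit with at most $k(D-1)$ others when each qubit appears in at most $D$ projectors. The function names are hypothetical, and the paper's quantum statement and proof are not reproduced here.

```python
import math


def max_occurrences(k: int) -> int:
    """Largest integer D with D <= 2^k / (e * k)."""
    return math.floor(2 ** k / (math.e * k))


def symmetric_lll_holds(k: int, occurrences: int) -> bool:
    """Check e * p * (d + 1) <= 1 under the assumptions stated above."""
    p = 2.0 ** -k                      # assumed relative dimension of one constraint
    d = k * max(occurrences - 1, 0)    # assumed bound on overlapping projectors
    return math.e * p * (d + 1) <= 1.0
```

With these assumptions, `symmetric_lll_holds(k, max_occurrences(k))` is true for every k, while doubling the occurrence bound at k = 10 violates the condition.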

    Sum of squares lower bounds for refuting any CSP

    Let $P:\{0,1\}^k \to \{0,1\}$ be a nontrivial $k$-ary predicate. Consider a random instance of the constraint satisfaction problem $\mathrm{CSP}(P)$ on $n$ variables with $\Delta n$ constraints, each being $P$ applied to $k$ randomly chosen literals. Provided the constraint density satisfies $\Delta \gg 1$, such an instance is unsatisfiable with high probability. The \emph{refutation} problem is to efficiently find a proof of unsatisfiability. We show that whenever the predicate $P$ supports a $t$-\emph{wise uniform} probability distribution on its satisfying assignments, the sum of squares (SOS) algorithm of degree $d = \Theta\left(\frac{n}{\Delta^{2/(t-1)} \log \Delta}\right)$ (which runs in time $n^{O(d)}$) \emph{cannot} refute a random instance of $\mathrm{CSP}(P)$. In particular, the polynomial-time SOS algorithm requires $\widetilde{\Omega}(n^{(t+1)/2})$ constraints to refute random instances of $\mathrm{CSP}(P)$ when $P$ supports a $t$-wise uniform distribution on its satisfying assignments. Together with recent work of Lee et al. [LRS15], our result also implies that \emph{any} polynomial-size semidefinite programming relaxation for refutation requires at least $\widetilde{\Omega}(n^{(t+1)/2})$ constraints. Our results (which also extend with no change to CSPs over larger alphabets) subsume all previously known lower bounds for semialgebraic refutation of random CSPs. For every constraint predicate $P$, they give a three-way hardness tradeoff between the density of constraints, the SOS degree (hence running time), and the strength of the refutation. By recent algorithmic results of Allen et al. [AOW15] and Raghavendra et al. [RRS16], this full three-way tradeoff is \emph{tight}, up to lower-order factors. Comment: 39 pages, 1 figure
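
The $t$-wise uniformity condition can be illustrated on small predicates. The sketch below only tests whether the uniform distribution over a given set of assignments has uniform marginals on every set of $t$ coordinates. Deciding whether a predicate supports any such distribution is a linear feasibility problem that this sketch does not attempt. All names (`satisfying_assignments`, `uniform_is_t_wise_uniform`, `xor3`, `or3`) are illustrative and not taken from the paper.

```python
from collections import Counter
from itertools import combinations, product


def satisfying_assignments(predicate, k):
    """All x in {0,1}^k with predicate(x) true."""
    return [x for x in product((0, 1), repeat=k) if predicate(x)]


def uniform_is_t_wise_uniform(support, t):
    """True if the uniform distribution over `support` is uniform on
    every choice of t coordinates."""
    k = len(support[0])
    for coords in combinations(range(k), t):
        counts = Counter(tuple(x[i] for i in coords) for x in support)
        # All 2^t patterns must occur, and equally often.
        if len(counts) != 2 ** t or len(set(counts.values())) != 1:
            return False
    return True


def xor3(x):
    """3-XOR with odd parity."""
    return sum(x) % 2 == 1


def or3(x):
    """3-SAT clause on positive literals."""
    return any(x)
```

The odd-parity assignments of `xor3` are pairwise uniform and all of them satisfy `or3`, so the 3-SAT predicate supports a 2-wise uniform distribution; with $t = 2$ the bound in the abstract reads $\widetilde{\Omega}(n^{3/2})$ constraints.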

    The Satisfiability Threshold for a Seemingly Intractable Random Constraint Satisfaction Problem

    We determine the exact threshold of satisfiability for random instances of a particular NP-complete constraint satisfaction problem (CSP). This is the first random CSP model for which a precise linear satisfiability threshold has been determined and for which random instances with density near that threshold appear to be computationally difficult. More formally, it is the first random CSP model for which the satisfiability threshold is known and which shares the following characteristics with random k-SAT for k >= 3: the problem is NP-complete, the satisfiability threshold occurs when there is a linear number of clauses, and a uniformly random instance with a linear number of clauses asymptotically almost surely has exponential resolution complexity. Comment: This is the long version of a paper that will be published in the SIAM Journal on Discrete Mathematics. This long version includes an appendix and a computer program. The contents of the paper are unchanged in the latest version. The format of the arXiv submission was changed so that the computer program will appear as an ancillary file. Some comments in the computer program were updated

    Algorithms and algorithmic obstacles for probabilistic combinatorial structures

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 209-214). We study efficient average-case (approximation) algorithms for combinatorial optimization problems, and explore the algorithmic obstacles for a variety of discrete optimization problems arising in the theory of random graphs, statistics and machine learning. In particular, we consider the average-case optimization for three NP-hard combinatorial optimization problems: Large Submatrix Selection, Maximum Cut (Max-Cut) of a graph and Matrix Completion. The Large Submatrix Selection problem is to find a k x k submatrix of an n x n matrix with i.i.d. standard Gaussian entries which has the largest average entry. It was shown in [13] using non-constructive methods that the largest average value of a k x k submatrix is $2(1 + o(1))\sqrt{\log n/k}$ with high probability (w.h.p.) when $k = O(\log n/\log\log n)$. We show that a natural greedy algorithm called Largest Average Submatrix (LAS) produces a submatrix with average value $(1 + o(1))\sqrt{2\log n/k}$ w.h.p. when k is constant and n grows, that is, smaller by a factor of approximately $\sqrt{2}$. Then, by drawing an analogy with the problem of finding cliques in random graphs, we propose a simple greedy algorithm which produces a k x k matrix with asymptotically the same average value $(1 + o(1))\sqrt{2\log n/k}$ w.h.p., for $k = o(\log n)$. Since the maximum clique problem is a special case of the largest submatrix problem and the greedy algorithm is the best known algorithm for finding cliques in random graphs, it is tempting to believe that beating the factor $\sqrt{2}$ performance gap suffered by both algorithms might be very challenging. 
Surprisingly, we show the existence of a very simple algorithm which produces a k x k matrix with average value $(1 + o_k(1) + o(1))(4/3)\sqrt{2\log n/k}$ for $k = o((\log n)^{1.5})$, that is, with asymptotic factor 4/3 when k grows. To get an insight into the algorithmic hardness of this problem, and motivated by methods originating in the theory of spin glasses, we conduct the so-called expected overlap analysis of matrices with average value asymptotically $(1 + o(1))\alpha\sqrt{2\log n/k}$ for a fixed value $\alpha \in [1, \sqrt{2}]$. The overlap corresponds to the number of common rows and common columns for pairs of matrices achieving this value. We discover numerically an intriguing phase transition at $\alpha^* \triangleq 5\sqrt{2}/(3\sqrt{3}) \approx 1.3608\ldots \in [4/3, \sqrt{2}]$: when $\alpha < \alpha^*$ the space of overlaps is a continuous subset of $[0,1]^2$, whereas $\alpha = \alpha^*$ marks the onset of discontinuity, and as a result the model exhibits the Overlap Gap Property (OGP) when $\alpha > \alpha^*$, appropriately defined. We conjecture that the OGP observed for $\alpha > \alpha^*$ also marks the onset of algorithmic hardness: no polynomial-time algorithm exists for finding matrices with average value at least $(1 + o(1))\alpha\sqrt{2\log n/k}$ when $\alpha > \alpha^*$ and k is a growing function of n. Finding a maximum cut of a graph is a well-known canonical NP-hard problem. We consider the problem of estimating the size of a maximum cut in a random Erdős-Rényi graph on n nodes and [cn] edges. We establish that the size of the maximum cut normalized by the number of nodes belongs to the interval $[c/2 + 0.47523\sqrt{c}, c/2 + 0.55909\sqrt{c}]$ w.h.p. as n increases, for all sufficiently large c. We observe that every maximum size cut satisfies a certain local optimality property, and we compute the expected number of cuts with a given value satisfying this local optimality property. Estimating this expectation amounts to solving a rather involved multi-dimensional large deviations problem. 
We solve this underlying large deviations problem asymptotically as c increases and use it to obtain an improved upper bound on the Max-Cut value. The lower bound is obtained by application of the second moment method, coupled with the same local optimality constraint, and is shown to work up to the stated lower bound value $c/2 + 0.47523\sqrt{c}$. We also obtain an improved lower bound of 1.36000n on the Max-Cut for the random cubic graph or any cubic graph with large girth, improving the previous best bound of 1.33773n. Matrix Completion is the problem of reconstructing a rank-k n x n matrix M from a sampling of its entries. We propose a new matrix completion algorithm using a novel sampling scheme based on a union of independent sparse random regular bipartite graphs. We show that under a certain incoherence assumption on M, and for the case when both the rank and the condition number of M are bounded, w.h.p. our algorithm recovers an $\epsilon$-approximation of M in terms of the Frobenius norm using $O(n\log^2(1/\epsilon))$ samples and in linear time $O(n\log^2(1/\epsilon))$. This provides the best known bounds both on the sample complexity and computational cost for reconstructing (approximately) an unknown low-rank matrix. The novelty of our algorithm is two new steps of thresholding singular values and rescaling singular vectors in the application of the "vanilla" alternating minimization algorithm. The structure of sparse random regular graphs is used heavily for controlling the impact of these regularization steps. by Quan Li. Ph. D.
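
The Largest Average Submatrix (LAS) procedure mentioned in the thesis abstract can be sketched as an alternating row and column selection. The abstract does not spell out the steps, so the version below is an assumption based on the usual description of LAS: start from k random columns, pick the k rows with the largest sums over those columns, then the k columns with the largest sums over those rows, and repeat until nothing changes. The names `las` and `average` and the `max_iters` cap are illustrative choices, not the author's.

```python
import random


def las(matrix, k, rng, max_iters=100):
    """Alternate between the best k rows and the best k columns
    until the selection stops changing (or max_iters is reached)."""
    n = len(matrix)
    cols = sorted(rng.sample(range(n), k))
    rows = []
    for _ in range(max_iters):
        # Best k rows for the current columns.
        by_row_sum = sorted(range(n), key=lambda i: -sum(matrix[i][j] for j in cols))
        new_rows = sorted(by_row_sum[:k])
        # Best k columns for those rows.
        by_col_sum = sorted(range(n), key=lambda j: -sum(matrix[i][j] for i in new_rows))
        new_cols = sorted(by_col_sum[:k])
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols


def average(matrix, rows, cols):
    """Average entry of the submatrix indexed by rows x cols."""
    return sum(matrix[i][j] for i in rows for j in cols) / (len(rows) * len(cols))
```

A run on a matrix of i.i.d. standard Gaussian entries (for example built with `rng.gauss(0.0, 1.0)`) returns a k x k selection that is locally optimal in the sense that neither the rows nor the columns can be improved with the other side held fixed.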

