    Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration With Provable Guarantees

    Satisfiability Modulo Counting (SMC) encompasses problems that require both symbolic decision-making and statistical reasoning. Its general formulation captures many real-world problems at the intersection of symbolic and statistical Artificial Intelligence. SMC searches for policy interventions to control probabilistic outcomes. Solving SMC is challenging because of its highly intractable nature ($\text{NP}^{\text{PP}}$-complete), which combines statistical inference with symbolic reasoning. Previous research on SMC solving lacks provable guarantees and/or suffers from sub-optimal empirical performance, especially when combinatorial constraints are present. We propose XOR-SMC, a polynomial-time algorithm with access to NP-oracles, to solve highly intractable SMC problems with constant approximation guarantees. XOR-SMC transforms the highly intractable SMC into satisfiability problems by replacing the model counting in SMC with SAT formulae subject to randomized XOR constraints. Experiments on solving important SMC problems in AI for social good demonstrate that XOR-SMC finds solutions close to the true optimum, outperforming several baselines which struggle to find good approximations for the intractable model counting in SMC.
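The idea of replacing a counting query with satisfiability checks under random XOR constraints can be illustrated with a minimal sketch. This is not the XOR-SMC algorithm itself: the function names, the majority vote over `trials`, and the brute-force enumeration standing in for an NP-oracle (a SAT solver) are all assumptions made for illustration. Each random XOR constraint keeps any fixed assignment with probability 1/2, so if a formula still has a solution after `k` such constraints in most trials, its model count is likely on the order of 2^k or more.

```python
import itertools
import random


def random_xor(n, rng):
    """Draw a random XOR constraint: a subset of the n variables plus a parity bit."""
    coeffs = [rng.random() < 0.5 for _ in range(n)]
    parity = rng.random() < 0.5
    return coeffs, parity


def satisfies_xor(assignment, xor):
    """True if the XOR of the selected variables equals the parity bit."""
    coeffs, parity = xor
    ones = sum(a for a, c in zip(assignment, coeffs) if c)
    return (ones % 2 == 1) == parity


def survives(formula, n, k, rng):
    """Check satisfiability of `formula` conjoined with k random XOR constraints.

    Brute-force enumeration is a stand-in (assumption) for an NP-oracle call.
    """
    xors = [random_xor(n, rng) for _ in range(k)]
    for assignment in itertools.product([False, True], repeat=n):
        if formula(assignment) and all(satisfies_xor(assignment, x) for x in xors):
            return True
    return False


def count_at_least_pow2(formula, n, k, trials=30, seed=0):
    """Heuristic test of 'model count >= 2^k': majority vote over random trials."""
    rng = random.Random(seed)
    hits = sum(survives(formula, n, k, rng) for _ in range(trials))
    return 2 * hits > trials
```

With `k = 0` the test reduces to plain satisfiability, and an unsatisfiable formula never passes for any `k`. For intermediate `k` the answer is probabilistic, which is why a vote over several independent trials is used here.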

    XOR-Sampling for Network Design with Correlated Stochastic Events

    Many network optimization problems can be formulated as stochastic network design problems in which edges are present or absent stochastically. Furthermore, protective actions can guarantee that edges will remain present. We consider the problem of finding the optimal protection strategy under a budget limit in order to maximize a connectivity measure of the network. Previous approaches rely on the assumption that edges are independent. In this paper, we consider a more realistic setting in which natural disasters or regional events make the states of multiple edges stochastically correlated. We use Markov Random Fields to model the correlation and define a new stochastic network design framework. We provide a novel algorithm based on Sample Average Approximation (SAA) coupled with a Gibbs or XOR sampler. The experimental results on real road network data show that the policies produced by SAA with the XOR sampler have higher quality and lower variance than those produced by SAA with the Gibbs sampler. Comment: In Proceedings of the Twenty-sixth International Joint Conference on Artificial Intelligence (IJCAI-17). The first two authors contribute equall
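A minimal sketch of SAA paired with a Gibbs sampler, under stated assumptions: the 4-node graph, the coupled edge pairs, the parameters `theta` and `w`, and the exhaustive search over protection sets are hypothetical and chosen only to keep the example small. The connectivity measure used here is whether node 0 can reach node 3. The XOR sampler from the abstract is not shown.

```python
import itertools
import math
import random

EDGES = [(0, 1), (1, 3), (0, 2), (2, 3)]  # hypothetical road network
COUPLED = [(0, 1), (2, 3)]                # edge-index pairs hit by the same regional event


def gibbs_samples(n_samples, theta=0.0, w=1.0, burn_in=50, seed=0):
    """Sample edge states (1 = present) from a small pairwise MRF by Gibbs sweeps.

    One sample is recorded per sweep after burn-in, so successive samples are correlated.
    """
    rng = random.Random(seed)
    nbrs = {i: [] for i in range(len(EDGES))}
    for i, j in COUPLED:
        nbrs[i].append(j)
        nbrs[j].append(i)
    x = [rng.randint(0, 1) for _ in EDGES]
    out = []
    for sweep in range(burn_in + n_samples):
        for i in range(len(EDGES)):
            # Conditional log-odds of edge i being present given its coupled neighbours.
            field = theta + w * sum(2 * x[j] - 1 for j in nbrs[i])
            x[i] = 1 if rng.random() < 1 / (1 + math.exp(-field)) else 0
        if sweep >= burn_in:
            out.append(tuple(x))
    return out


def connected(state, s=0, t=3):
    """Depth-first search over the edges that are present in `state`."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for (a, b), up in zip(EDGES, state):
            if up:
                for v, nxt in ((a, b), (b, a)):
                    if v == u and nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return t in seen


def saa_best_protection(samples, budget):
    """Pick the protection set (at most `budget` edges) maximizing average connectivity."""
    best, best_val = (), -1.0
    for r in range(budget + 1):
        for protect in itertools.combinations(range(len(EDGES)), r):
            val = sum(
                connected([1 if i in protect else s[i] for i in range(len(EDGES))])
                for s in samples
            ) / len(samples)
            if val > best_val:
                best, best_val = protect, val
    return best, best_val
```

The samples are drawn once and reused for every candidate protection set, which is the core of SAA: the stochastic objective is replaced by a fixed empirical average that can then be optimized deterministically. Exhaustive enumeration is only feasible at this toy scale.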

    On the Complexity of Random Satisfiability Problems with Planted Solutions

    The problem of identifying a planted assignment given a random $k$-SAT formula consistent with the assignment exhibits a large algorithmic gap: while the planted solution becomes unique and can be identified given a formula with $O(n\log n)$ clauses, there are distributions over clauses for which the best known efficient algorithms require $n^{k/2}$ clauses. We propose and study a unified model for planted $k$-SAT, which captures well-known special cases. An instance is described by a planted assignment $\sigma$ and a distribution on clauses with $k$ literals. We define its distribution complexity as the largest $r$ for which the distribution is not $r$-wise independent ($1 \le r \le k$ for any distribution with a planted assignment). Our main result is an unconditional lower bound, tight up to logarithmic factors, for statistical (query) algorithms [Kearns 1998, Feldman et al. 2012], matching known upper bounds, which, as we show, can be implemented using a statistical algorithm. Since known approaches for problems over distributions have statistical analogues (spectral, MCMC, gradient-based, convex optimization etc.), this lower bound provides a rigorous explanation of the observed algorithmic gap. The proof introduces a new general technique for the analysis of statistical query algorithms. It also points to a geometric paring phenomenon in the space of all planted assignments. We describe consequences of our lower bounds for Feige's refutation hypothesis [Feige 2002] and for lower bounds on general convex programs that solve planted $k$-SAT. Our bounds also extend to other planted $k$-CSP models, and, in particular, provide concrete evidence for the security of Goldreich's one-way function and the associated pseudorandom generator when used with a sufficiently hard predicate [Goldreich 2000]. Comment: Extended abstract appeared in STOC 201