264 research outputs found

    Subsampling Mathematical Relaxations and Average-case Complexity

    Full text link
    We initiate a study of when the value of mathematical relaxations such as linear and semidefinite programs for constraint satisfaction problems (CSPs) is approximately preserved when restricting the instance to a sub-instance induced by a small random subsample of the variables. Let CC be a family of CSPs such as 3SAT, Max-Cut, etc., and let Π\Pi be a relaxation for CC, in the sense that for every instance P∈CP\in C, Π(P)\Pi(P) is an upper bound the maximum fraction of satisfiable constraints of PP. Loosely speaking, we say that subsampling holds for CC and Π\Pi if for every sufficiently dense instance P∈CP \in C and every Ï”>0\epsilon>0, if we let Pâ€ČP' be the instance obtained by restricting PP to a sufficiently large constant number of variables, then Π(Pâ€Č)∈(1±ϔ)Π(P)\Pi(P') \in (1\pm \epsilon)\Pi(P). We say that weak subsampling holds if the above guarantee is replaced with Π(Pâ€Č)=1−Θ(Îł)\Pi(P')=1-\Theta(\gamma) whenever Π(P)=1−γ\Pi(P)=1-\gamma. We show: 1. Subsampling holds for the BasicLP and BasicSDP programs. BasicSDP is a variant of the relaxation considered by Raghavendra (2008), who showed it gives an optimal approximation factor for every CSP under the unique games conjecture. BasicLP is the linear programming analog of BasicSDP. 2. For tighter versions of BasicSDP obtained by adding additional constraints from the Lasserre hierarchy, weak subsampling holds for CSPs of unique games type. 3. There are non-unique CSPs for which even weak subsampling fails for the above tighter semidefinite programs. Also there are unique CSPs for which subsampling fails for the Sherali-Adams linear programming hierarchy. As a corollary of our weak subsampling for strong semidefinite programs, we obtain a polynomial-time algorithm to certify that random geometric graphs (of the type considered by Feige and Schechtman, 2002) of max-cut value 1−γ1-\gamma have a cut value at most 1−γ/101-\gamma/10.Comment: Includes several more general results that subsume the previous version of the paper

    Subsampling Algorithms for Semidefinite Programming

    Full text link
    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning.Comment: Final version, to appear in Stochastic System

    Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints

    Get PDF
    We study online prediction where regret of the algorithm is measured against a benchmark defined via evolving constraints. This framework captures online prediction on graphs, as well as other prediction problems with combinatorial structure. A key aspect here is that finding the optimal benchmark predictor (even in hindsight, given all the data) might be computationally hard due to the combinatorial nature of the constraints. Despite this, we provide polynomial-time \emph{prediction} algorithms that achieve low regret against combinatorial benchmark sets. We do so by building improper learning algorithms based on two ideas that work together. The first is to alleviate part of the computational burden through random playout, and the second is to employ Lasserre semidefinite hierarchies to approximate the resulting integer program. Interestingly, for our prediction algorithms, we only need to compute the values of the semidefinite programs and not the rounded solutions. However, the integrality gap for Lasserre hierarchy \emph{does} enter the generic regret bound in terms of Rademacher complexity of the benchmark set. This establishes a trade-off between the computation time and the regret bound of the algorithm

    Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

    Full text link
    Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. For a long time, it has been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly brings into play the primal and the dual problems is however a more recent idea which has generated many important new contributions in the last years. These novel developments are grounded on recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness

    VerdictDB: Universalizing Approximate Query Processing

    Full text link
    Despite 25 years of research in academia, approximate query processing (AQP) has had little industrial adoption. One of the major causes of this slow adoption is the reluctance of traditional vendors to make radical changes to their legacy codebases, and the preoccupation of newer vendors (e.g., SQL-on-Hadoop products) with implementing standard features. Additionally, the few AQP engines that are available are each tied to a specific platform and require users to completely abandon their existing databases---an unrealistic expectation given the infancy of the AQP technology. Therefore, we argue that a universal solution is needed: a database-agnostic approximation engine that will widen the reach of this emerging technology across various platforms. Our proposal, called VerdictDB, uses a middleware architecture that requires no changes to the backend database, and thus, can work with all off-the-shelf engines. Operating at the driver-level, VerdictDB intercepts analytical queries issued to the database and rewrites them into another query that, if executed by any standard relational engine, will yield sufficient information for computing an approximate answer. VerdictDB uses the returned result set to compute an approximate answer and error estimates, which are then passed on to the user or application. However, lack of access to the query execution layer introduces significant challenges in terms of generality, correctness, and efficiency. This paper shows how VerdictDB overcomes these challenges and delivers up to 171×\times speedup (18.45×\times on average) for a variety of existing engines, such as Impala, Spark SQL, and Amazon Redshift, while incurring less than 2.6% relative error. VerdictDB is open-sourced under Apache License.Comment: Extended technical report of the paper that appeared in Proceedings of the 2018 International Conference on Management of Data, pp. 1461-1476. ACM, 201
    • 

    corecore