
    Parallel Successive Convex Approximation for Nonsmooth Nonconvex Optimization

    Consider the problem of minimizing the sum of a smooth (possibly non-convex) function and a convex (possibly nonsmooth) function involving a large number of variables. A popular approach to solving this problem is the block coordinate descent (BCD) method, whereby at each iteration only one variable block is updated while the remaining variables are held fixed. With recent advances in multi-core parallel processing technology, it is desirable to parallelize the BCD method by allowing multiple blocks to be updated simultaneously at each iteration of the algorithm. In this work, we propose an inexact parallel BCD approach where, at each iteration, a subset of the variables is updated in parallel by minimizing convex approximations of the original objective function. We investigate the convergence of this parallel BCD method for both randomized and cyclic variable selection rules. We analyze the asymptotic and non-asymptotic convergence behavior of the algorithm for both convex and non-convex objective functions. The numerical experiments suggest that, for the special case of the Lasso minimization problem, the cyclic block selection rule can outperform the randomized rule.
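
    To make the kind of update described above concrete, here is a minimal sketch of an inexact parallel BCD iteration for the Lasso problem with both cyclic and randomized coordinate selection. The single-coordinate blocks, the fixed damping step, and all function and parameter names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def parallel_bcd_lasso(A, b, lam, n_blocks_per_iter=8, rule="cyclic",
                       step=0.5, n_iters=200, seed=0):
    """Sketch of inexact parallel BCD for min 0.5*||Ax - b||^2 + lam*||x||_1.

    At each iteration a subset of coordinates (blocks of size 1 here) is
    updated simultaneously by minimizing a separable quadratic surrogate,
    then damped by `step` so the simultaneous updates do not overshoot.
    Assumes n_blocks_per_iter <= number of columns of A.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    col_norm2 = (A ** 2).sum(axis=0) + 1e-12      # per-coordinate curvature constants
    residual = A @ x - b
    cursor = 0
    for _ in range(n_iters):
        if rule == "cyclic":
            idx = (cursor + np.arange(n_blocks_per_iter)) % n
            cursor = (cursor + n_blocks_per_iter) % n
        else:                                      # randomized selection rule
            idx = rng.choice(n, size=n_blocks_per_iter, replace=False)
        grad = A[:, idx].T @ residual              # partial gradient on the chosen blocks
        # best response of the convex (prox-linear) surrogate, per coordinate
        x_hat = soft_threshold(x[idx] - grad / col_norm2[idx], lam / col_norm2[idx])
        delta = x_hat - x[idx]
        x[idx] += step * delta                     # damped parallel update
        residual += A[:, idx] @ (step * delta)     # keep the residual in sync
    return x
```

    The damping factor is the simplest way to keep many simultaneous surrogate minimizations stable; switching `rule` between "cyclic" and "randomized" reproduces the two selection rules compared in the abstract, on this toy level only.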

    Structure-Aware Dynamic Scheduler for Parallel Machine Learning

    Training large machine learning (ML) models with many variables or parameters can take a long time if one employs sequential procedures, even with stochastic updates. A natural solution is to turn to distributed computing on a cluster; however, naive, unstructured parallelization of ML algorithms does not usually lead to a proportional speedup and can even result in divergence, because dependencies between model elements can attenuate the computational gains from parallelization and compromise the correctness of inference. Recent efforts to address this issue have benefited from exploiting the static, a priori block structures residing in ML algorithms. In this paper, we take this path further by exploring the dynamic block structures and workloads present during ML program execution, which offer new opportunities for improving convergence, correctness, and load balancing in distributed ML. We propose and showcase a general-purpose scheduler, STRADS, for coordinating distributed updates in ML algorithms, which harnesses the aforementioned opportunities in a systematic way. We provide theoretical guarantees for our scheduler, and we demonstrate its efficacy versus static block structures on Lasso and Matrix Factorization.
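
    The scheduling idea can be illustrated with a toy dependency-aware selector. The sketch below assumes that pairwise parameter correlation measures coupling (a natural choice for Lasso) and that the magnitude of each parameter's most recent update serves as its dynamic priority; both choices, and all names, are illustrative assumptions rather than the STRADS implementation.

```python
import numpy as np

def schedule_blocks(corr, priority, batch_size, dep_threshold=0.1, rng=None):
    """Toy dependency-aware dynamic scheduling step.

    `corr` is a precomputed matrix of absolute correlations between model
    parameters; `priority` holds a per-parameter score, e.g. the magnitude
    of the most recent update. Candidates are drawn with probability
    proportional to priority, and a candidate is admitted only if it is
    weakly coupled (correlation below `dep_threshold`) with every parameter
    already scheduled in this round, so the batch is safe to update in parallel.
    """
    rng = rng or np.random.default_rng()
    n = len(priority)
    probs = np.asarray(priority, dtype=float) + 1e-8
    probs = probs / probs.sum()
    # oversample high-priority candidates, then filter by dependency
    candidates = rng.choice(n, size=min(n, 4 * batch_size), replace=False, p=probs)
    scheduled = []
    for j in candidates:
        if all(corr[j, k] < dep_threshold for k in scheduled):
            scheduled.append(j)
        if len(scheduled) == batch_size:
            break
    return scheduled
```

    Prioritizing recently active parameters reflects the "dynamic workload" observation in the abstract: parameters that have nearly converged are scheduled less often, freeing workers for the parts of the model that still change.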

    Hybrid Random/Deterministic Parallel Algorithms for Nonconvex Big Data Optimization

    We propose a decomposition framework for the parallel optimization of the sum of a differentiable (possibly nonconvex) function and a nonsmooth (possibly nonseparable), convex one. The latter term is usually employed to enforce structure in the solution, typically sparsity. The main contribution of this work is a novel parallel, hybrid random/deterministic decomposition scheme wherein, at each iteration, a subset of (block) variables is updated at the same time by minimizing local convex approximations of the original nonconvex function. To tackle huge-scale problems, the (block) variables to be updated are chosen according to a mixed random and deterministic procedure, which captures the advantages of both pure deterministic and pure random update-based schemes. Almost sure convergence of the proposed scheme is established. Numerical results show that on huge-scale problems the proposed hybrid random/deterministic algorithm outperforms both random and deterministic schemes. Comment: the order of the authors is alphabetical.
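
    A minimal sketch of such a mixed selection rule is given below, assuming the per-block score is something like the magnitude of a prox-gradient step (a cheap proxy for the expected objective decrease). The pool-then-greedy structure and all names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def hybrid_block_selection(scores, n_random, n_keep, rng=None):
    """Mixed random/deterministic block selection sketch.

    First draws a random pool of `n_random` blocks (cheap, unbiased coverage
    of all blocks), then deterministically keeps the `n_keep` blocks in the
    pool with the largest `scores`, so the most promising blocks in the pool
    are the ones actually updated in parallel this iteration.
    """
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    pool = rng.choice(n, size=min(n, n_random), replace=False)
    # greedy (deterministic) refinement inside the random pool
    order = np.argsort(scores[pool])[::-1]
    return pool[order[:n_keep]]
```

    Restricting the greedy step to a random pool is what keeps the per-iteration cost low on huge-scale problems: scores only need to be compared within the pool, while the randomness preserves the coverage guarantees of purely random schemes.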