17,970 research outputs found

    Sampling Geometric Inhomogeneous Random Graphs in Linear Time

    Get PDF
    Real-world networks, like social networks or the internet infrastructure, have structural properties such as large clustering coefficients that can best be described in terms of an underlying geometry. This is why the focus of the literature on theoretical models for real-world networks shifted from classic models without geometry, such as Chung-Lu random graphs, to modern geometry-based models, such as hyperbolic random graphs. With this paper we contribute to the theoretical analysis of these modern, more realistic random graph models. Instead of studying directly hyperbolic random graphs, we use a generalization that we call geometric inhomogeneous random graphs (GIRGs). Since we ignore constant factors in the edge probabilities, GIRGs are technically simpler (specifically, we avoid hyperbolic cosines), while preserving the qualitative behaviour of hyperbolic random graphs, and we suggest to replace hyperbolic random graphs by this new model in future theoretical studies. We prove the following fundamental structural and algorithmic results on GIRGs. (1) As our main contribution we provide a sampling algorithm that generates a random graph from our model in expected linear time, improving the best-known sampling algorithm for hyperbolic random graphs by a substantial factor O(n^0.5). (2) We establish that GIRGs have clustering coefficients in {\Omega}(1), (3) we prove that GIRGs have small separators, i.e., it suffices to delete a sublinear number of edges to break the giant component into two large pieces, and (4) we show how to compress GIRGs using an expected linear number of bits.Comment: 25 page

    Distributed Machine Learning via Sufficient Factor Broadcasting

    Full text link
    Matrix-parametrized models, including multiclass logistic regression and sparse coding, are used in machine learning (ML) applications ranging from computer vision to computational biology. When these models are applied to large-scale ML problems starting at millions of samples and tens of thousands of classes, their parameter matrix can grow at an unexpected rate, resulting in high parameter synchronization costs that greatly slow down distributed learning. To address this issue, we propose a Sufficient Factor Broadcasting (SFB) computation model for efficient distributed learning of a large family of matrix-parameterized models, which share the following property: the parameter update computed on each data sample is a rank-1 matrix, i.e., the outer product of two "sufficient factors" (SFs). By broadcasting the SFs among worker machines and reconstructing the update matrices locally at each worker, SFB improves communication efficiency --- communication costs are linear in the parameter matrix's dimensions, rather than quadratic --- without affecting computational correctness. We present a theoretical convergence analysis of SFB, and empirically corroborate its efficiency on four different matrix-parametrized ML models

    A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints

    Get PDF
    We propose a new algorithm to solve optimization problems of the form minf(X)\min f(X) for a smooth function ff under the constraints that XX is positive semidefinite and the diagonal blocks of XX are small identity matrices. Such problems often arise as the result of relaxing a rank constraint (lifting). In particular, many estimation tasks involving phases, rotations, orthonormal bases or permutations fit in this framework, and so do certain relaxations of combinatorial problems such as Max-Cut. The proposed algorithm exploits the facts that (1) such formulations admit low-rank solutions, and (2) their rank-restricted versions are smooth optimization problems on a Riemannian manifold. Combining insights from both the Riemannian and the convex geometries of the problem, we characterize when second-order critical points of the smooth problem reveal KKT points of the semidefinite problem. We compare against state of the art, mature software and find that, on certain interesting problem instances, what we call the staircase method is orders of magnitude faster, is more accurate and scales better. Code is available.Comment: 37 pages, 3 figure

    Fast Distributed Algorithms for LP-Type Problems of Bounded Dimension

    Full text link
    In this paper we present various distributed algorithms for LP-type problems in the well-known gossip model. LP-type problems include many important classes of problems such as (integer) linear programming, geometric problems like smallest enclosing ball and polytope distance, and set problems like hitting set and set cover. In the gossip model, a node can only push information to or pull information from nodes chosen uniformly at random. Protocols for the gossip model are usually very practical due to their fast convergence, their simplicity, and their stability under stress and disruptions. Our algorithms are very efficient (logarithmic rounds or better with just polylogarithmic communication work per node per round) whenever the combinatorial dimension of the given LP-type problem is constant, even if the size of the given LP-type problem is polynomially large in the number of nodes

    Hybrid Behaviour of Markov Population Models

    Full text link
    We investigate the behaviour of population models written in Stochastic Concurrent Constraint Programming (sCCP), a stochastic extension of Concurrent Constraint Programming. In particular, we focus on models from which we can define a semantics of sCCP both in terms of Continuous Time Markov Chains (CTMC) and in terms of Stochastic Hybrid Systems, in which some populations are approximated continuously, while others are kept discrete. We will prove the correctness of the hybrid semantics from the point of view of the limiting behaviour of a sequence of models for increasing population size. More specifically, we prove that, under suitable regularity conditions, the sequence of CTMC constructed from sCCP programs for increasing population size converges to the hybrid system constructed by means of the hybrid semantics. We investigate in particular what happens for sCCP models in which some transitions are guarded by boolean predicates or in the presence of instantaneous transitions

    Randomized Solutions to Convex Programs with Multiple Chance Constraints

    Full text link
    The scenario-based optimization approach (`scenario approach') provides an intuitive way of approximating the solution to chance-constrained optimization programs, based on finding the optimal solution under a finite number of sampled outcomes of the uncertainty (`scenarios'). A key merit of this approach is that it neither assumes knowledge of the uncertainty set, as it is common in robust optimization, nor of its probability distribution, as it is usually required in stochastic optimization. Moreover, the scenario approach is computationally efficient as its solution is based on a deterministic optimization program that is canonically convex, even when the original chance-constrained problem is not. Recently, researchers have obtained theoretical foundations for the scenario approach, providing a direct link between the number of scenarios and bounds on the constraint violation probability. These bounds are tight in the general case of an uncertain optimization problem with a single chance constraint. However, this paper shows that these bounds can be improved in situations where the constraints have a limited `support rank', a new concept that is introduced for the first time. This property is typically found in a large number of practical applications---most importantly, if the problem originally contains multiple chance constraints (e.g. multi-stage uncertain decision problems), or if a chance constraint belongs to a special class of constraints (e.g. linear or quadratic constraints). In these cases the quality of the scenario solution is improved while the same bound on the constraint violation probability is maintained, and also the computational complexity is reduced.Comment: This manuscript is the preprint of a paper submitted to the SIAM Journal on Optimization and it is subject to SIAM copyright. SIAM maintains the sole rights of distribution or publication of the work in all forms and media. If accepted, the copy of record will be available at http://www.siam.or
    corecore