233,182 research outputs found

    Automated characterization of spatial and dynamical heterogeneity in supercooled liquids via implementation of Machine Learning

    Full text link
    A computational approach by an implementation of the Principle Component Analysis (PCA) with K-means and Gaussian Mixture (GM) clustering methods from Machine Learning (ML) algorithms to identify structural and dynamical heterogeneities of supercooled liquids is developed. In this method, a collection of the average weighted coordination numbers (WCNs‾\overline{WCNs}) of particles calculated from particles' positions are used as an order parameter to build a low-dimensional representation of feature (structural) space for K-means clustering to sort the particles in the system into few meso-states using PCA. Nano-domains or aggregated clusters are also formed in configurational (real) space from a direct mapping using associated meso-states' particle identities with some misclassified interfacial particles. These classification uncertainties can be improved by a co-learning strategy which utilizes the probabilistic GM clustering and the information transfer between the structural space and configurational space iteratively until convergence. A final classification of meso-states in structural space and domains in configurational space are stable over long times and measured to have dynamical heterogeneities. Armed with such a classification protocol, various studies over the thermodynamic and dynamical properties of these domains indicate that the observed heterogeneity is the result of liquid-liquid phase separation after quenching to a supercooled state

    Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms

    Full text link
    The implementation of a vast majority of machine learning (ML) algorithms boils down to solving a numerical optimization problem. In this context, Stochastic Gradient Descent (SGD) methods have long proven to provide good results, both in terms of convergence and accuracy. Recently, several parallelization approaches have been proposed in order to scale SGD to solve very large ML problems. At their core, most of these approaches are following a map-reduce scheme. This paper presents a novel parallel updating algorithm for SGD, which utilizes the asynchronous single-sided communication paradigm. Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent (ASGD) provides faster (or at least equal) convergence, close to linear scaling and stable accuracy

    Balancing the Communication Load of Asynchronously Parallelized Machine Learning Algorithms

    Full text link
    Stochastic Gradient Descent (SGD) is the standard numerical method used to solve the core optimization problem for the vast majority of machine learning (ML) algorithms. In the context of large scale learning, as utilized by many Big Data applications, efficient parallelization of SGD is in the focus of active research. Recently, we were able to show that the asynchronous communication paradigm can be applied to achieve a fast and scalable parallelization of SGD. Asynchronous Stochastic Gradient Descent (ASGD) outperforms other, mostly MapReduce based, parallel algorithms solving large scale machine learning problems. In this paper, we investigate the impact of asynchronous communication frequency and message size on the performance of ASGD applied to large scale ML on HTC cluster and cloud environments. We introduce a novel algorithm for the automatic balancing of the asynchronous communication load, which allows to adapt ASGD to changing network bandwidths and latencies.Comment: arXiv admin note: substantial text overlap with arXiv:1505.0495

    Co-ClusterD: A Distributed Framework for Data Co-Clustering with Sequential Updates

    Get PDF
    Abstract-Co-clustering has emerged to be a powerful data mining tool for two-dimensional co-occurrence and dyadic data. However, co-clustering algorithms often require significant computational resources and have been dismissed as impractical for large data sets. Existing studies have provided strong empirical evidence that expectation-maximization (EM) algorithms (e.g., k-means algorithm) with sequential updates can significantly reduce the computational cost without degrading the resulting solution. Motivated by this observation, we introduce sequential updates for alternate minimization co-clustering (AMCC) algorithms which are variants of EM algorithms, and also show that AMCC algorithms with sequential updates converge. We then propose two approaches to parallelize AMCC algorithms with sequential updates in a distributed environment. Both approaches are proved to maintain the convergence properties of AMCC algorithms. Based on these two approaches, we present a new distributed framework, Co-ClusterD, which supports efficient implementations of AMCC algorithms with sequential updates. We design and implement Co-ClusterD, and show its efficiency through two AMCC algorithms: fast nonnegative matrix tri-factorization (FNMTF) and information theoretic co-clustering (ITCC). We evaluate our framework on both a local cluster of machines and the Amazon EC2 cloud. Empirical results show that AMCC algorithms implemented in Co-ClusterD can achieve a much faster convergence and often obtain better results than their traditional concurrent counterparts

    Message-Passing Algorithms for Quadratic Minimization

    Full text link
    Gaussian belief propagation (GaBP) is an iterative algorithm for computing the mean of a multivariate Gaussian distribution, or equivalently, the minimum of a multivariate positive definite quadratic function. Sufficient conditions, such as walk-summability, that guarantee the convergence and correctness of GaBP are known, but GaBP may fail to converge to the correct solution given an arbitrary positive definite quadratic function. As was observed in previous work, the GaBP algorithm fails to converge if the computation trees produced by the algorithm are not positive definite. In this work, we will show that the failure modes of the GaBP algorithm can be understood via graph covers, and we prove that a parameterized generalization of the min-sum algorithm can be used to ensure that the computation trees remain positive definite whenever the input matrix is positive definite. We demonstrate that the resulting algorithm is closely related to other iterative schemes for quadratic minimization such as the Gauss-Seidel and Jacobi algorithms. Finally, we observe, empirically, that there always exists a choice of parameters such that the above generalization of the GaBP algorithm converges

    Newton-Raphson Consensus for Distributed Convex Optimization

    Full text link
    We address the problem of distributed uncon- strained convex optimization under separability assumptions, i.e., the framework where each agent of a network is endowed with a local private multidimensional convex cost, is subject to communication constraints, and wants to collaborate to compute the minimizer of the sum of the local costs. We propose a design methodology that combines average consensus algorithms and separation of time-scales ideas. This strategy is proved, under suitable hypotheses, to be globally convergent to the true minimizer. Intuitively, the procedure lets the agents distributedly compute and sequentially update an approximated Newton- Raphson direction by means of suitable average consensus ratios. We show with numerical simulations that the speed of convergence of this strategy is comparable with alternative optimization strategies such as the Alternating Direction Method of Multipliers. Finally, we propose some alternative strategies which trade-off communication and computational requirements with convergence speed.Comment: 18 pages, preprint with proof

    Stochastic optimization methods for the simultaneous control of parameter-dependent systems

    Full text link
    We address the application of stochastic optimization methods for the simultaneous control of parameter-dependent systems. In particular, we focus on the classical Stochastic Gradient Descent (SGD) approach of Robbins and Monro, and on the recently developed Continuous Stochastic Gradient (CSG) algorithm. We consider the problem of computing simultaneous controls through the minimization of a cost functional defined as the superposition of individual costs for each realization of the system. We compare the performances of these stochastic approaches, in terms of their computational complexity, with those of the more classical Gradient Descent (GD) and Conjugate Gradient (CG) algorithms, and we discuss the advantages and disadvantages of each methodology. In agreement with well-established results in the machine learning context, we show how the SGD and CSG algorithms can significantly reduce the computational burden when treating control problems depending on a large amount of parameters. This is corroborated by numerical experiments
    • …
    corecore