Automated characterization of spatial and dynamical heterogeneity in supercooled liquids via implementation of Machine Learning
A computational approach is developed that combines Principal Component
Analysis (PCA) with K-means and Gaussian Mixture (GM) clustering methods from
Machine Learning (ML) to identify structural and dynamical heterogeneities of
supercooled liquids. In this method, a collection of the average weighted
coordination numbers of particles, calculated from the particles' positions,
is used as an order parameter to build, via PCA, a low-dimensional
representation of feature (structural) space, in which K-means clustering
sorts the particles in the system into a few meso-states. Nano-domains or
aggregated clusters are also formed in
configurational (real) space from a direct mapping using associated
meso-states' particle identities with some misclassified interfacial particles.
These classification uncertainties can be reduced by a co-learning strategy
which utilizes the probabilistic GM clustering and the information transfer
between the structural space and configurational space iteratively until
convergence. The final classification of meso-states in structural space and
domains in configurational space is stable over long times and is measured to
exhibit dynamical heterogeneities. Armed with such a classification protocol,
various studies of the thermodynamic and dynamical properties of these
domains indicate that the observed heterogeneity is the result of liquid-liquid
phase separation after quenching to a supercooled state.
Co-ClusterD: A Distributed Framework for Data Co-Clustering with Sequential Updates
Co-clustering has emerged as a powerful data mining tool for two-dimensional co-occurrence and dyadic data. However, co-clustering algorithms often require significant computational resources and have been dismissed as impractical for large data sets. Existing studies have provided strong empirical evidence that expectation-maximization (EM) algorithms (e.g., k-means algorithm) with sequential updates can significantly reduce the computational cost without degrading the resulting solution. Motivated by this observation, we introduce sequential updates for alternate minimization co-clustering (AMCC) algorithms, which are variants of EM algorithms, and also show that AMCC algorithms with sequential updates converge. We then propose two approaches to parallelize AMCC algorithms with sequential updates in a distributed environment. Both approaches are proved to maintain the convergence properties of AMCC algorithms. Based on these two approaches, we present a new distributed framework, Co-ClusterD, which supports efficient implementations of AMCC algorithms with sequential updates. We design and implement Co-ClusterD, and show its efficiency through two AMCC algorithms: fast nonnegative matrix tri-factorization (FNMTF) and information theoretic co-clustering (ITCC). We evaluate our framework on both a local cluster of machines and the Amazon EC2 cloud. Empirical results show that AMCC algorithms implemented in Co-ClusterD can achieve much faster convergence and often obtain better results than their traditional concurrent counterparts.
Asynchronous Parallel Stochastic Gradient Descent - A Numeric Core for Scalable Distributed Machine Learning Algorithms
The implementation of a vast majority of machine learning (ML) algorithms
boils down to solving a numerical optimization problem. In this context,
Stochastic Gradient Descent (SGD) methods have long proven to provide good
results, both in terms of convergence and accuracy. Recently, several
parallelization approaches have been proposed in order to scale SGD to solve
very large ML problems. At their core, most of these approaches follow a
map-reduce scheme. This paper presents a novel parallel updating algorithm for
SGD, which utilizes the asynchronous single-sided communication paradigm.
Compared to existing methods, Asynchronous Parallel Stochastic Gradient Descent
(ASGD) provides faster (or at least equal) convergence, close to linear scaling
and stable accuracy.
Balancing the Communication Load of Asynchronously Parallelized Machine Learning Algorithms
Stochastic Gradient Descent (SGD) is the standard numerical method used to
solve the core optimization problem for the vast majority of machine learning
(ML) algorithms. In the context of large scale learning, as utilized by many
Big Data applications, efficient parallelization of SGD is in the focus of
active research. Recently, we were able to show that the asynchronous
communication paradigm can be applied to achieve a fast and scalable
parallelization of SGD. Asynchronous Stochastic Gradient Descent (ASGD)
outperforms other, mostly MapReduce based, parallel algorithms solving large
scale machine learning problems. In this paper, we investigate the impact of
asynchronous communication frequency and message size on the performance of
ASGD applied to large scale ML on HTC cluster and cloud environments. We
introduce a novel algorithm for the automatic balancing of the asynchronous
communication load, which allows ASGD to adapt to changing network bandwidths
and latencies.
Comment: arXiv admin note: substantial text overlap with arXiv:1505.0495
Message-Passing Algorithms for Quadratic Minimization
Gaussian belief propagation (GaBP) is an iterative algorithm for computing
the mean of a multivariate Gaussian distribution, or equivalently, the minimum
of a multivariate positive definite quadratic function. Sufficient conditions,
such as walk-summability, that guarantee the convergence and correctness of
GaBP are known, but GaBP may fail to converge to the correct solution given an
arbitrary positive definite quadratic function. As was observed in previous
work, the GaBP algorithm fails to converge if the computation trees produced by
the algorithm are not positive definite. In this work, we will show that the
failure modes of the GaBP algorithm can be understood via graph covers, and we
prove that a parameterized generalization of the min-sum algorithm can be used
to ensure that the computation trees remain positive definite whenever the
input matrix is positive definite. We demonstrate that the resulting algorithm
is closely related to other iterative schemes for quadratic minimization such
as the Gauss-Seidel and Jacobi algorithms. Finally, we observe, empirically,
that there always exists a choice of parameters such that the above
generalization of the GaBP algorithm converges.
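
The abstract above names the Jacobi algorithm as a related iterative scheme for quadratic minimization. As a rough illustration only (this is plain Jacobi, not GaBP or the paper's parameterized min-sum variant), the sketch below minimizes f(x) = 0.5 x^T A x - b^T x by iterating on the equivalent linear system A x = b. The function name `jacobi_minimize`, the matrix `A`, and the vector `b` are hypothetical choices made for this example; the matrix is symmetric and strictly diagonally dominant so that Jacobi is guaranteed to converge.

```python
def jacobi_minimize(A, b, iterations=200):
    """Minimize 0.5 * x^T A x - b^T x by Jacobi iteration on A x = b."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # Every coordinate is computed from the previous iterate only (Jacobi).
        # Gauss-Seidel would instead reuse coordinates updated in this sweep.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


# Hypothetical test problem: symmetric, strictly diagonally dominant matrix.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

x_star = jacobi_minimize(A, b)

# The minimizer of the quadratic satisfies A x = b, so check the residual.
residual = max(
    abs(sum(A[i][j] * x_star[j] for j in range(3)) - b[i]) for i in range(3)
)
```

For a general positive definite matrix that is not diagonally dominant, plain Jacobi may diverge, which is the kind of failure the abstract discusses for GaBP.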
Newton-Raphson Consensus for Distributed Convex Optimization
We address the problem of distributed unconstrained convex optimization
under separability assumptions, i.e., the framework where each agent of a
network is endowed with a local private multidimensional convex cost, is
subject to communication constraints, and wants to collaborate to compute the
minimizer of the sum of the local costs. We propose a design methodology that
combines average consensus algorithms and separation of time-scales ideas. This
strategy is proved, under suitable hypotheses, to be globally convergent to the
true minimizer. Intuitively, the procedure lets the agents distributedly
compute and sequentially update an approximated Newton-Raphson direction by
means of suitable average consensus ratios. We show with numerical simulations
that the speed of convergence of this strategy is comparable with alternative
optimization strategies such as the Alternating Direction Method of
Multipliers. Finally, we propose some alternative strategies which trade off
communication and computational requirements with convergence speed.
Comment: 18 pages, preprint with proof
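
The idea of a ratio of average-consensus quantities can be sketched in the simplest possible setting. The example below assumes scalar quadratic local costs f_i(x) = 0.5 a_i (x - c_i)^2 on a ring of five agents; in that special case the ratio of the two averaged quantities equals the exact global minimizer, so no time-scale separation is needed. The local quantities y_i and z_i, the ring topology, the mixing weight, and all numerical values are assumptions made for illustration, not the paper's general algorithm.

```python
def consensus_step(values, weight=1.0 / 3.0):
    """One average-consensus step on a ring: mix each value with both neighbours."""
    n = len(values)
    return [
        weight * values[(i - 1) % n]
        + (1.0 - 2.0 * weight) * values[i]
        + weight * values[(i + 1) % n]
        for i in range(n)
    ]


# Hypothetical local costs f_i(x) = 0.5 * a[i] * (x - c[i])**2, one per agent.
a = [1.0, 2.0, 4.0, 0.5, 3.0]
c = [0.0, 1.0, -2.0, 5.0, 2.0]

# Purely local starting quantities (assumed form):
#   y_i = f_i''(x) * x - f_i'(x) = a_i * c_i   and   z_i = f_i''(x) = a_i.
y = [ai * ci for ai, ci in zip(a, c)]
z = list(a)

# Agents only exchange values with their ring neighbours.
for _ in range(200):
    y = consensus_step(y)
    z = consensus_step(z)

# Each agent's estimate is the ratio of its two consensus variables.
estimates = [yi / zi for yi, zi in zip(y, z)]

# Centralized reference: minimizer of the sum of the local quadratic costs.
exact = sum(ai * ci for ai, ci in zip(a, c)) / sum(a)
```

The mixing matrix implied by `consensus_step` is doubly stochastic, so both `y` and `z` converge to their network averages and every agent's ratio approaches the same value.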
Stochastic optimization methods for the simultaneous control of parameter-dependent systems
We address the application of stochastic optimization methods for the
simultaneous control of parameter-dependent systems. In particular, we focus on
the classical Stochastic Gradient Descent (SGD) approach of Robbins and Monro,
and on the recently developed Continuous Stochastic Gradient (CSG) algorithm.
We consider the problem of computing simultaneous controls through the
minimization of a cost functional defined as the superposition of individual
costs for each realization of the system. We compare the performances of these
stochastic approaches, in terms of their computational complexity, with those
of the more classical Gradient Descent (GD) and Conjugate Gradient (CG)
algorithms, and we discuss the advantages and disadvantages of each
methodology. In agreement with well-established results in the machine learning
context, we show how the SGD and CSG algorithms can significantly reduce the
computational burden when treating control problems depending on a large number
of parameters. This is corroborated by numerical experiments.
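
The difference in per-step cost between GD and SGD on a cost that averages over many realizations can be sketched with a toy problem. The example assumes a scalar "control" `u` and a hypothetical family of individual costs j_k(u) = 0.5 (a_k u - 1)^2; the model, the values of `a`, the step sizes, and the function names are all illustrative assumptions rather than the systems studied in the paper. The SGD variant uses a Robbins-Monro step size of the form c / (n + 1).

```python
import random

# Hypothetical scalar model: realization k has cost j_k(u) = 0.5 * (a_k * u - 1)**2.
K = 50
a = [1.0 + k / K for k in range(K)]


def grad_k(u, k):
    """Gradient of the individual cost for realization k."""
    return a[k] * (a[k] * u - 1.0)


def gradient_descent(steps=50, lr=0.3):
    u = 0.0
    for _ in range(steps):
        # Full gradient: every step evaluates all K realizations.
        u -= lr * sum(grad_k(u, k) for k in range(K)) / K
    return u


def sgd(steps=2000, c=0.5, seed=0):
    rng = random.Random(seed)
    u = 0.0
    for n in range(steps):
        k = rng.randrange(K)               # sample a single realization
        u -= c / (n + 1) * grad_k(u, k)    # Robbins-Monro step size c / (n + 1)
    return u


# Closed-form minimizer of the averaged cost, used only as a reference.
u_exact = sum(a) / sum(x * x for x in a)
u_gd = gradient_descent()
u_sgd = sgd()
```

Each GD step here costs K gradient evaluations while each SGD step costs one, at the price of a noisy iterate that only approaches the minimizer as the step size decays.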