Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation
Across a variety of scientific disciplines, sparse inverse covariance
estimation is a popular tool for capturing the underlying dependency
relationships in multivariate data. Unfortunately, most estimators are not
scalable enough to handle the sizes of modern high-dimensional data sets (often
on the order of terabytes), and assume Gaussian samples. To address these
deficiencies, we introduce HP-CONCORD, a highly scalable optimization method
for estimating a sparse inverse covariance matrix based on a regularized
pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal
gradient method uses a novel communication-avoiding linear algebra algorithm
and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving
parallel scalability on problems with up to ~819 billion parameters (1.28
million dimensions); even on a single node, HP-CONCORD demonstrates
scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to
estimate the underlying dependency structure of the brain from fMRI data, and
use the result to identify functional regions automatically. The results show
good agreement with a clustering from the neuroscience literature.
Comment: Main paper: 15 pages, appendix: 24 pages
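As a rough illustration of the proximal gradient idea mentioned in the abstract above, the following minimal sketch alternates a gradient step on a smooth term with soft-thresholding, the proximal operator of an L1 penalty. It is not HP-CONCORD: the smooth term here is a toy quadratic rather than the pseudolikelihood, nothing is distributed, and the names `soft_threshold` and `proximal_gradient` are hypothetical.

```python
import math


def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def proximal_gradient(grad, x0, step, lam, iters):
    """Minimise f(x) + lam * ||x||_1 for a smooth f with gradient `grad`.

    Each iteration takes a gradient step on f, then applies
    soft-thresholding with threshold step * lam to every coordinate.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [soft_threshold(xi - step * gi, step * lam) for xi, gi in zip(x, g)]
    return x


# Toy smooth term (an assumption for illustration): f(x) = 0.5 * ||x - b||^2
b = [3.0, -0.2, 1.0]


def toy_grad(x):
    return [xi - bi for xi, bi in zip(x, b)]


estimate = proximal_gradient(toy_grad, [0.0, 0.0, 0.0], step=0.5, lam=0.5, iters=200)
```

For this toy problem the iterates approach the soft-thresholded vector `[2.5, 0.0, 0.5]`, so the small middle entry is set exactly to zero. That is the mechanism by which an L1 penalty yields a sparse estimate.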
Clustering Via Nonparametric Density Estimation: the R Package pdfCluster
The R package pdfCluster performs cluster analysis based on a nonparametric
estimate of the density of the observed variables. After summarizing the main
aspects of the methodology, we describe the features and the usage of the
package, and finally illustrate how it works using two datasets.
Evolving structures of star-forming clusters
Understanding the formation and evolution of young star clusters requires
quantitative statistical measures of their structure. We investigate the
structures of observed and modelled star-forming clusters. By considering the
different evolutionary classes in the observations and the temporal evolution
in models of gravoturbulent fragmentation, we study the temporal evolution of
the cluster structures. We apply different statistical methods, in particular
the normalised mean correlation length and the minimum spanning tree technique.
We refine the normalisation of the clustering parameters by defining the area
using the normalised convex hull of the objects and investigate the effect of
two-dimensional projection of three-dimensional clusters. We introduce a new
measure for the elongation of a cluster. It is defined as the ratio of
the cluster radius determined by an enclosing circle to the cluster radius
derived from the normalised convex hull. The mean separation of young stars
increases with the evolutionary class, reflecting the expansion of the cluster.
The clustering parameters of the model clusters agree well in many cases
with those of the observed ones, especially when the values are similar. No
correlation of the clustering parameters with the turbulent environment of the
molecular cloud is found, indicating that possible influences of the
environment on the clustering behaviour are quickly smoothed out by the stellar
velocity dispersion. The temporal evolution of the clustering parameters shows
that the star cluster builds up from several subclusters and evolves to a more
centrally concentrated cluster, while the cluster expands more slowly than new
stars are formed.
Comment: 11 pages, 10 figures, accepted by A&A; slightly modified according to
the referee report
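The two structural quantities described in the abstract above can be sketched for a set of two-dimensional positions as follows. This is a minimal sketch under stated assumptions, not the authors' code: the enclosing circle is taken to be centred on the mean position, the hull radius is the radius of a circle with the same area as the convex hull, and the normalisation of the hull mentioned in the abstract is omitted. All function names are hypothetical.

```python
import math


def mst_mean_edge_length(points):
    """Mean edge length of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total / (n - 1)


def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def elongation(points):
    """Ratio of the enclosing-circle radius to the convex-hull radius."""
    pts = [tuple(p) for p in points]
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    r_circle = max(math.dist((cx, cy), p) for p in pts)  # assumption: centred on mean
    area = polygon_area(convex_hull(pts))
    if area == 0.0:
        raise ValueError("points are collinear; hull area is zero")
    r_hull = math.sqrt(area / math.pi)  # circle with the same area as the hull
    return r_circle / r_hull


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
strip = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (0.0, 1.0)]
```

With these definitions, `elongation(strip)` is larger than `elongation(square)`, as expected for a more stretched configuration, and `mst_mean_edge_length(square)` is 1 because the tree uses three unit-length sides.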