
    Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation

    Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on problems with up to ~819 billion parameters (1.28 million dimensions); even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data, and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature. Comment: Main paper: 15 pages, appendix: 24 pages

    Clustering Via Nonparametric Density Estimation: the R Package pdfCluster

    The R package pdfCluster performs cluster analysis based on a nonparametric estimate of the density of the observed variables. After summarizing the main aspects of the methodology, we describe the features and the usage of the package, and finally illustrate its working with the aid of two datasets.

    Evolving structures of star-forming clusters

    Understanding the formation and evolution of young star clusters requires quantitative statistical measures of their structure. We investigate the structures of observed and modelled star-forming clusters. By considering the different evolutionary classes in the observations and the temporal evolution in models of gravoturbulent fragmentation, we study the temporal evolution of the cluster structures. We apply different statistical methods, in particular the normalised mean correlation length and the minimum spanning tree technique. We refine the normalisation of the clustering parameters by defining the area using the normalised convex hull of the objects and investigate the effect of two-dimensional projection of three-dimensional clusters. We introduce a new measure ξ for the elongation of a cluster. It is defined as the ratio of the cluster radius determined by an enclosing circle to the cluster radius derived from the normalised convex hull. The mean separation of young stars increases with the evolutionary class, reflecting the expansion of the cluster. The clustering parameters of the model clusters correspond in many cases well to those from observed ones, especially when the ξ values are similar. No correlation of the clustering parameters with the turbulent environment of the molecular cloud is found, indicating that possible influences of the environment on the clustering behaviour are quickly smoothed out by the stellar velocity dispersion. The temporal evolution of the clustering parameters shows that the star cluster builds up from several subclusters and evolves to a more centrally concentrated cluster, while the cluster expands more slowly than new stars are formed. Comment: 11 pages, 10 figures, accepted by A&A; slightly modified according to the referee report