Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation

Author: Ali Alnur
Azad Ariful
Buluc Aydin
Koanantakool Penporn
Morozov Dmitriy
Oh Sang-Yun
Oliker Leonid
Yelick Katherine
Publication date: 01/01/2018

Across a variety of scientific disciplines, sparse inverse covariance estimation is a popular tool for capturing the underlying dependency relationships in multivariate data. Unfortunately, most estimators are not scalable enough to handle the sizes of modern high-dimensional data sets (often on the order of terabytes), and assume Gaussian samples. To address these deficiencies, we introduce HP-CONCORD, a highly scalable optimization method for estimating a sparse inverse covariance matrix based on a regularized pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal gradient method uses a novel communication-avoiding linear algebra algorithm and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving parallel scalability on problems with up to ~819 billion parameters (1.28 million dimensions); even on a single node, HP-CONCORD demonstrates scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to estimate the underlying dependency structure of the brain from fMRI data, and use the result to identify functional regions automatically. The results show good agreement with a clustering from the neuroscience literature.

Comment: Main paper: 15 pages; appendix: 24 pages

arXiv.org e-Print Archive

eScholarship - University of California
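
The proximal gradient idea named in the HP-CONCORD abstract above can be illustrated on a much smaller problem. The sketch below is not HP-CONCORD: it assumes an l1 penalty as the sparsity-inducing regulariser and swaps the pseudolikelihood objective for a toy lasso problem, 0.5 * ||A x - b||^2 + lam * ||x||_1, solved serially in plain Python. The function names `soft_threshold` and `prox_gradient_lasso` are hypothetical and chosen for this illustration only.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_gradient_lasso(A, b, lam, step, iters=500):
    """Minimise 0.5 * ||A x - b||^2 + lam * ||x||_1 by proximal gradient.

    A is a list of rows, b a list of floats. `step` should not exceed
    1 / (largest eigenvalue of A^T A) for the iteration to converge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual of the smooth least-squares term: r = A x - b
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth term: g = A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step, then the l1 proximal step (soft-thresholding)
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # With A = I the minimiser is soft_threshold(b, lam) per coordinate.
    print(prox_gradient_lasso(identity, [3.0, 0.5], lam=1.0, step=1.0))
```

The soft-thresholding step is what sets small coefficients exactly to zero, which is how an l1-regularised method produces a sparse estimate. In a distributed setting the matrix products dominate the cost, which is presumably where a communication-avoiding linear algebra kernel would matter.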

Representation and generation of plans using graph spectra

Author: Hanna S.
Publication venue: Istanbul Technical University
Publication date: 01/01/2007

Numerical comparison of spaces with one another is often achieved with set scalar measures such as global and local integration, connectivity, etc., which capture a particular quality of the space but therefore lose much of the detail of its overall structure. More detailed methods such as graph edit distance are difficult to calculate, particularly for large plans. This paper proposes the use of the graph spectrum, or the ordered eigenvalues of a graph adjacency matrix, as a means to characterise the space as a whole. The result is a vector of high dimensionality that can be easily measured against others for detailed comparison. Several graph types are investigated, including boundary and axial representations, as are several methods for deriving the spectral vector. The effectiveness of these is evaluated using a genetic algorithm optimisation to generate plans to match a given spectrum, and evolution is seen to produce plans similar to the initial targets, even in very large search spaces. Results indicate that boundary graphs alone can capture the gross topological qualities of a space, but axial graphs are needed to indicate local relationships. Methods of scaling the spectra are investigated in relation to both global and local changes to plan arrangement. For all graph types, the spectra were seen to capture local patterns of spatial arrangement even as global size is varied.
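
The graph spectrum described in the abstract above (the ordered eigenvalues of an adjacency matrix) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's method: it assumes undirected graphs given as symmetric 0/1 adjacency matrices of equal size, compares spectra by plain Euclidean distance, and computes eigenvalues with a textbook cyclic Jacobi iteration so that only the standard library is needed. The names `graph_spectrum` and `spectral_distance` are hypothetical.

```python
import math


def graph_spectrum(adj, sweeps=50, tol=1e-20):
    """Ascending eigenvalues of a symmetric adjacency matrix (cyclic Jacobi)."""
    n = len(adj)
    A = [[float(v) for v in row] for row in adj]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break  # matrix is numerically diagonal
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes A[p][q]
                diff = A[q][q] - A[p][p]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # apply rotation to columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # apply rotation to rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))


def spectral_distance(adj_a, adj_b):
    """Euclidean distance between the spectra of two equal-sized graphs."""
    return math.dist(graph_spectrum(adj_a), graph_spectrum(adj_b))


if __name__ == "__main__":
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]      # three rooms in a row
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # three mutually connected rooms
    print(graph_spectrum(path))
    print(spectral_distance(path, triangle))
```

Because the spectrum does not depend on how nodes are labelled, two plans with the same connectivity give the same vector regardless of room ordering. Comparing graphs of different sizes would need some padding or scaling scheme, which this sketch leaves out.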

Accelerating Monte Carlo simulations with an NVIDIA® graphics processor

Author: Blaschke Johannes
Jordan Robert
Künnemeyer Rainer
Martinsen Paul
Publication venue: Elsevier BV
Publication date: 01/01/2009

Modern graphics cards, commonly used in desktop computers, have evolved beyond a simple interface between processor and display to incorporate sophisticated calculation engines that can be applied to general purpose computing. The Monte Carlo algorithm for modelling photon transport in turbid media has been implemented on an NVIDIA® 8800 GT graphics card using the CUDA toolkit. The Monte Carlo method relies on following the trajectory of millions of photons through the sample, often taking hours or days to complete. The graphics-processor implementation, processing roughly 110 million scattering events per second, was found to run more than 70 times faster than a similar, single-threaded implementation on a 2.67 GHz desktop computer.
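
To make the photon-following idea in the abstract above concrete, here is a minimal serial sketch in plain Python rather than CUDA. It is not the paper's implementation. It assumes a one-dimensional slab geometry, isotropic scattering, no refractive-index mismatch at the surfaces, and an analog scheme in which each photon is either absorbed or scattered at every interaction. The function name `simulate_slab` and the parameters `mu_a`, `mu_s`, and `thickness` are hypothetical.

```python
import math
import random


def simulate_slab(n_photons, mu_a, mu_s, thickness, seed=0):
    """Follow photons through a slab; return the fraction meeting each fate.

    mu_a, mu_s: absorption and scattering coefficients (per unit length).
    thickness:  slab thickness in the same length unit.
    """
    rng = random.Random(seed)
    mu_t = mu_a + mu_s  # total interaction coefficient, must be > 0
    counts = {"reflected": 0, "absorbed": 0, "transmitted": 0}
    for _ in range(n_photons):
        z, cos_theta = 0.0, 1.0  # launch at the surface, heading inward
        while True:
            # Sample the free path length from an exponential distribution
            step = -math.log(1.0 - rng.random()) / mu_t
            z += step * cos_theta
            if z < 0.0:
                counts["reflected"] += 1
                break
            if z > thickness:
                counts["transmitted"] += 1
                break
            # Interaction inside the slab: absorb or scatter
            if rng.random() < mu_a / mu_t:
                counts["absorbed"] += 1
                break
            # Isotropic scattering: new direction cosine uniform in [-1, 1]
            cos_theta = 2.0 * rng.random() - 1.0
    return {fate: c / n_photons for fate, c in counts.items()}


if __name__ == "__main__":
    print(simulate_slab(20000, mu_a=0.5, mu_s=5.0, thickness=1.0))
```

Each photon history is independent of the others, which is what makes this kind of simulation a natural fit for a graphics processor: in a GPU version, each thread would typically run the inner `while` loop for its own photons with its own random-number stream.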