84,645 research outputs found
A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems
Among the algorithms that are likely to play a major role in future exascale
computing, the fast multipole method (FMM) appears as a rising star. Our
previous recent work showed scaling of an FMM on GPU clusters, with problem
sizes in the order of billions of unknowns. That work led to an extremely
parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This
paper reports on a a campaign of performance tuning and scalability studies
using multi-core CPUs, on the Kraken supercomputer. All kernels in the FMM were
parallelized using OpenMP, and a test using 10^7 particles randomly distributed
in a cube showed 78% efficiency on 8 threads. Tuning of the
particle-to-particle kernel using SIMD instructions resulted in 4x speed-up of
the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel
scalability was studied in both strong and weak scaling. The strong scaling
test used 10^8 particles and resulted in 93% parallel efficiency on 2048
processes for the non-SIMD code and 54% for the SIMD-optimized code (which was
still 2x faster). The weak scaling test used 10^6 particles per process, and
resulted in 72% efficiency on 32,768 processes, with the largest calculation
taking about 40 seconds to evaluate more than 32 billion unknowns. This work
builds up evidence for our view that FMM is poised to play a leading role in
exascale computing, and we end the paper with a discussion of the features that
make it a particularly favorable algorithm for the emerging heterogeneous and
massively parallel architectural landscape
A Bayesian alternative to mutual information for the hierarchical clustering of dependent random variables
The use of mutual information as a similarity measure in agglomerative
hierarchical clustering (AHC) raises an important issue: some correction needs
to be applied for the dimensionality of variables. In this work, we formulate
the decision of merging dependent multivariate normal variables in an AHC
procedure as a Bayesian model comparison. We found that the Bayesian
formulation naturally shrinks the empirical covariance matrix towards a matrix
set a priori (e.g., the identity), provides an automated stopping rule, and
corrects for dimensionality using a term that scales up the measure as a
function of the dimensionality of the variables. Also, the resulting log Bayes
factor is asymptotically proportional to the plug-in estimate of mutual
information, with an additive correction for dimensionality in agreement with
the Bayesian information criterion. We investigated the behavior of these
Bayesian alternatives (in exact and asymptotic forms) to mutual information on
simulated and real data. An encouraging result was first derived on
simulations: the hierarchical clustering based on the log Bayes factor
outperformed off-the-shelf clustering techniques as well as raw and normalized
mutual information in terms of classification accuracy. On a toy example, we
found that the Bayesian approaches led to results that were similar to those of
mutual information clustering techniques, with the advantage of an automated
thresholding. On real functional magnetic resonance imaging (fMRI) datasets
measuring brain activity, it identified clusters consistent with the
established outcome of standard procedures. On this application, normalized
mutual information had a highly atypical behavior, in the sense that it
systematically favored very large clusters. These initial experiments suggest
that the proposed Bayesian alternatives to mutual information are a useful new
tool for hierarchical clustering
Personalizing gesture recognition using hierarchical bayesian neural networks
Building robust classifiers trained on data susceptible to group or subject-specific variations is a challenging pattern recognition problem. We develop hierarchical Bayesian neural networks to capture subject-specific variations and share statistical strength across subjects. Leveraging recent work on learning Bayesian neural networks, we build fast, scalable algorithms for inferring the posterior distribution over all network weights in the hierarchy. We also develop methods for adapting our model to new subjects when a small number of subject-specific personalization data is available. Finally, we investigate active learning algorithms for interactively labeling personalization data in resource-constrained scenarios. Focusing on the problem of gesture recognition where inter-subject variations are commonplace, we demonstrate the effectiveness of our proposed techniques. We test our framework on three widely used gesture recognition datasets, achieving personalization performance competitive with the state-of-the-art.http://openaccess.thecvf.com/content_cvpr_2017/html/Joshi_Personalizing_Gesture_Recognition_CVPR_2017_paper.htmlhttp://openaccess.thecvf.com/content_cvpr_2017/html/Joshi_Personalizing_Gesture_Recognition_CVPR_2017_paper.htmlhttp://openaccess.thecvf.com/content_cvpr_2017/html/Joshi_Personalizing_Gesture_Recognition_CVPR_2017_paper.htmlPublished versio
Direct N-body Simulations
Special high-accuracy direct force summation N-body algorithms and their
relevance for the simulation of the dynamical evolution of star clusters and
other gravitating N-body systems in astrophysics are presented, explained and
compared with other methods. Other methods means here approximate physical
models based on the Fokker-Planck equation as well as other, approximate
algorithms to compute the gravitational potential in N-body systems. Questions
regarding the parallel implementation of direct ``brute force'' N-body codes
are discussed. The astrophysical application of the models to the theory of
relaxing rotating and non-rotating collisional star clusters is presented,
briefly mentioning the questions of the validity of the Fokker-Planck
approximation, the existence of gravothermal oscillations and of rotation and
primordial binaries.Comment: 32 pages, 13 figures, in press in Riffert, H., Werner K. (eds),
Computational Astrophysics, The Journal of Computational and Applied
Mathematics (JCAM), Elsevier Press, Amsterdam, 199
- …