Inner product computation for sparse iterative solvers on distributed supercomputer
Recent years have shown that iterative Krylov methods, unless redesigned, are not suitable for distributed supercomputers because of their intensive global communications. It is well accepted that re-engineering Krylov methods for a prescribed computer architecture is necessary and important to achieve higher performance and scalability. The paper focuses on simple and practical ways to re-organize Krylov methods and improve their performance on current heterogeneous distributed supercomputers. In contrast with most current software development for Krylov methods, which usually focuses on efficient matrix vector multiplications, the paper focuses on how to compute inner products on supercomputers and explains why inner product computation on current heterogeneous distributed supercomputers is crucial for scalable Krylov methods. Communication complexity analysis shows how the inner product computation can become the performance bottleneck of (inner) product-type iterative solvers on distributed supercomputers due to global communications. Principles of reducing such global communications are discussed. The importance of minimizing communications is demonstrated by experiments using up to 900 processors. The experiments were carried out on a Dawning 5000A, one of the fastest and earliest heterogeneous supercomputers in the world. Both the analysis and the experiments indicate that inner product computation is very likely to be the most challenging kernel for inner product-based iterative solvers to achieve exascale
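As a rough illustration of why the number of global reductions matters, the sketch below simulates a vector split across a few ranks in plain Python and counts how many global reductions two inner products need when computed separately versus fused into one reduction. `FakeComm`, `split`, `dots_separate` and `dots_fused` are hypothetical names introduced here for illustration; this is not an MPI API and not the paper's implementation.

```python
from typing import List, Sequence, Tuple


class FakeComm:
    """Hypothetical stand-in for a communicator; it only counts global reductions."""

    def __init__(self) -> None:
        self.reductions = 0

    def allreduce(self, partials: Sequence[Sequence[float]]) -> List[float]:
        # One global synchronization: element-wise sum of the per-rank contributions.
        self.reductions += 1
        return [sum(column) for column in zip(*partials)]


def split(v: Sequence[float], nranks: int) -> List[Sequence[float]]:
    """Partition a vector into contiguous chunks, one chunk per simulated rank."""
    chunk = (len(v) + nranks - 1) // nranks
    return [v[i * chunk:(i + 1) * chunk] for i in range(nranks)]


def local_dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Purely local part of an inner product; needs no communication."""
    return sum(x * y for x, y in zip(a, b))


def dots_separate(comm: FakeComm, xs, ys, zs) -> Tuple[float, float]:
    """Compute (x, y) and (x, z) with one global reduction each."""
    xy = comm.allreduce([[local_dot(x, y)] for x, y in zip(xs, ys)])[0]
    xz = comm.allreduce([[local_dot(x, z)] for x, z in zip(xs, zs)])[0]
    return xy, xz


def dots_fused(comm: FakeComm, xs, ys, zs) -> Tuple[float, float]:
    """Compute both inner products with a single reduction of length two."""
    both = comm.allreduce(
        [[local_dot(x, y), local_dot(x, z)] for x, y, z in zip(xs, ys, zs)]
    )
    return both[0], both[1]


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.0, 1.0, 1.0]
z = [0.5, 0.5, 0.5, 0.5]
xs, ys, zs = split(x, 2), split(y, 2), split(z, 2)

comm_a, comm_b = FakeComm(), FakeComm()
result_separate = dots_separate(comm_a, xs, ys, zs)
result_fused = dots_fused(comm_b, xs, ys, zs)
print(result_separate, comm_a.reductions)
print(result_fused, comm_b.reductions)
```

Both variants return the same values, but the fused version pays the global latency once instead of twice. Fusing is only possible when the inner products do not depend on one another, which is why the solver itself generally has to be reorganized.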
Minimizing synchronizations in sparse iterative solvers for distributed supercomputers
Eliminating synchronizations is one of the important techniques for minimizing communications in modern high performance computing. This paper discusses principles of reducing communications due to global synchronizations in sparse iterative solvers on distributed supercomputers. We demonstrate how to minimize global synchronizations by rescheduling a typical Krylov subspace method. The benefit of minimizing synchronizations is shown in theoretical analysis and is verified by numerical experiments using up to 900 processors. The experiments also show that the communication complexity for some structured sparse matrix vector multiplications and for global communications in the underlying supercomputers is of the order P^{1/2.5} and P^{4/5} respectively, where P is the number of processors. The experiments were carried out on a Dawning 5000A
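To get a feel for the two exponents quoted above (1/2.5 for structured sparse matrix vector multiplication and 4/5 for global communications), the following sketch evaluates both power laws for a few processor counts. It assumes pure power laws with the unknown constant factors set to 1, so only the growth of the ratio is meaningful, not the absolute numbers. The function names are hypothetical.

```python
SPMV_EXP = 1 / 2.5    # exponent quoted for structured sparse matrix vector products
GLOBAL_EXP = 4 / 5    # exponent quoted for global communications


def comm_cost(p: int, exponent: float) -> float:
    """Relative communication cost under an assumed power law P**exponent
    (constant factor set to 1, which is an assumption of this sketch)."""
    return float(p) ** exponent


def global_to_spmv_ratio(p: int) -> float:
    """How much faster the global term grows than the matrix vector term."""
    return comm_cost(p, GLOBAL_EXP) / comm_cost(p, SPMV_EXP)


for p in (100, 400, 900):
    print(
        p,
        round(comm_cost(p, SPMV_EXP), 1),
        round(comm_cost(p, GLOBAL_EXP), 1),
        round(global_to_spmv_ratio(p), 1),
    )
```

Under these assumptions the ratio itself grows like P^{2/5}, so the global communication term increasingly dominates as processors are added. This is consistent with the abstract's emphasis on removing global synchronizations rather than only tuning the matrix vector product.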
Global sensitivity analysis for stochastic simulators based on generalized lambda surrogate models
Global sensitivity analysis aims at quantifying the impact of input
variability onto the variation of the response of a computational model. It has
been widely applied to deterministic simulators, for which a set of input
parameters has a unique corresponding output value. Stochastic simulators,
however, have intrinsic randomness due to their use of (pseudo)random numbers,
so they give different results when run twice with the same input parameters
but non-common random numbers. Due to this random nature, conventional Sobol'
indices, used in global sensitivity analysis, can be extended to stochastic
simulators in different ways. In this paper, we discuss three possible
extensions and focus on those that depend only on the statistical dependence
between input and output. This choice ignores the detailed data generating
process involving the internal randomness, and can thus be applied to a wider
class of problems. We propose to use the generalized lambda model to emulate
the response distribution of stochastic simulators. Such a surrogate can be
constructed without the need for replications. The proposed method is applied
to three examples including two case studies in finance and epidemiology. The
results confirm the convergence of the approach for estimating the sensitivity
indices even in the presence of strong heteroskedasticity and a small
signal-to-noise ratio
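As a minimal sketch of a Sobol' index that depends only on the statistical dependence between input and output, the code below estimates the first-order index Var(E[Y|X_i]) / Var(Y) for a toy stochastic simulator with a plain pick-freeze Monte Carlo estimator. This is a swapped-in brute-force technique, not the generalized lambda surrogate proposed in the paper, and the simulator `simulator`, its coefficients, and the function `first_order_index` are hypothetical choices made for illustration.

```python
import random


def simulator(x1: float, x2: float, rng: random.Random) -> float:
    """Hypothetical toy stochastic simulator: the noise level depends on x1,
    so the output is heteroskedastic."""
    noise = rng.gauss(0.0, 1.0)  # internal randomness, redrawn on every run
    return x1 + 0.5 * x2 + (0.2 + x1) * noise


def first_order_index(which: int, n: int, seed: int = 0) -> float:
    """Pick-freeze estimate of Var(E[Y|X_which]) / Var(Y).

    Each pair of runs shares only the input X_which; the other input and the
    internal noise are drawn independently, so Cov(Y, Y') isolates X_which.
    """
    rng = random.Random(seed)
    ys, yps = [], []
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()      # inputs ~ U(0, 1), assumed
        x1p, x2p = rng.random(), rng.random()    # independent second draw
        if which == 1:
            x1p = x1                             # freeze X1
        else:
            x2p = x2                             # freeze X2
        ys.append(simulator(x1, x2, rng))
        yps.append(simulator(x1p, x2p, rng))
    mean_y = sum(ys) / n
    mean_yp = sum(yps) / n
    cov = sum((a - mean_y) * (b - mean_yp) for a, b in zip(ys, yps)) / (n - 1)
    var = sum((a - mean_y) ** 2 for a in ys) / (n - 1)
    return cov / var


s1 = first_order_index(1, 100_000)
s2 = first_order_index(2, 100_000)
print(round(s1, 3), round(s2, 3))
```

For this toy model the indices can be worked out by hand as roughly 0.12 for X1 and 0.03 for X2, so most of the output variance comes from the internal noise. The estimator needs two simulator runs per sample, which is one reason a replication-free surrogate is attractive when each run is expensive.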
Local models of Shimura varieties and a conjecture of Kottwitz
We give a group theoretic definition of "local models" as sought after in the
theory of Shimura varieties. These are projective schemes over the integers of
a p-adic local field that are expected to model the singularities of integral
models of Shimura varieties with parahoric level structure. Our local models
are certain mixed characteristic degenerations of Grassmannian varieties; they
are obtained by extending constructions of Beilinson, Drinfeld, Gaitsgory and
the second-named author to mixed characteristics and to the case of general
(tamely ramified) reductive groups. We study the singularities of local models
and hence also of the corresponding integral models of Shimura varieties. In
particular, we study the monodromy (inertia) action and show a commutativity
property for the sheaves of nearby cycles. As a result, we prove a conjecture
of Kottwitz which asserts that the semi-simple trace of Frobenius on the nearby
cycles gives a function which is central in the parahoric Hecke algebra.
Comment: 88 pages, several corrections and change
Techniques of replica symmetry breaking and the storage problem of the McCulloch-Pitts neuron
In this article the framework for Parisi's spontaneous replica symmetry
breaking is reviewed, and subsequently applied to the example of the
statistical mechanical description of the storage properties of a
McCulloch-Pitts neuron. The technical details are reviewed extensively, with
regard to the wide range of systems where the method may be applied. Parisi's
partial differential equation and related differential equations are discussed,
and a Green function technique is introduced for the calculation of replica
averages, the key to determining the averages of physical quantities. The
ensuing graph rules involve only tree graphs, as appropriate for a
mean-field-like model. The lowest order Ward-Takahashi identity is recovered
analytically and is shown to lead to the Goldstone modes in continuous replica
symmetry breaking phases. The need for a replica symmetry breaking theory in
the storage problem of the neuron has arisen due to the thermodynamical
instability of formerly given solutions. Variational forms for the neuron's
free energy are derived in terms of the order parameter function x(q), for
different prior distributions of synapses. Analytically in the high temperature
limit and numerically in generic cases various phases are identified, among
them one similar to the Parisi phase in the Sherrington-Kirkpatrick model.
Extensive quantities like the error per pattern change slightly with respect to
the known unstable solutions, but there is a significant difference in the
distribution of non-extensive quantities like the synaptic overlaps and the
pattern storage stability parameter. A simulation result is also reviewed and
compared to the prediction of the theory.
Comment: 103 LaTeX pages (with REVTeX 3.0), including 15 figures (ps, epsi,
eepic), accepted for Physics Report