A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids
The efficient implementation of collective communication operations has
received much attention. Initial efforts produced "optimal" trees based on
network communication models that assumed equal point-to-point latencies
between any two processes. This assumption is violated in most practical
settings, however, particularly in heterogeneous systems such as clusters of
SMPs and wide-area "computational Grids," with the result that collective
operations perform suboptimally. In response, more recent work has focused on
creating topology-aware trees for collective operations that minimize
communication across slower channels (e.g., a wide-area network). While these
efforts have significant communication benefits, they all limit their view of
the network to only two layers. We present a strategy based upon a multilayer
view of the network. By creating multilevel topology-aware trees we take
advantage of communication cost differences at every level in the network. We
used this strategy to implement topology-aware versions of several MPI
collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of
the popular MPICH implementation of the MPI standard. Using information about
topology provided by MPICH-G2, we construct these multilevel topology-aware
trees automatically during execution. We present results demonstrating the
advantages of our multilevel approach by comparing it to the default
(topology-unaware) implementation provided by MPICH and a topology-aware
two-layer implementation. Comment: 16 pages, 8 figures
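The multilevel idea in this abstract can be illustrated with a small sketch. It is not the MPICH-G2 implementation; it assumes each process is described by a hypothetical tuple of cluster labels ordered from the slowest level (for example, site) to the fastest (for example, machine), and it picks the lowest rank in each cluster as that cluster's representative. The function name `multilevel_bcast_edges` and the `coords` layout are illustrative assumptions.

```python
from collections import defaultdict


def multilevel_bcast_edges(coords, root):
    """Return (sender, receiver) pairs for a multilevel broadcast from root.

    coords maps each rank to a tuple of cluster labels, slowest level first.
    This layout is an assumption made for the sketch.
    """
    depth = len(coords[root])
    edges = []

    def spread(leader, members, level):
        if level == depth:
            # All members share every label: the leader sends to each directly.
            edges.extend((leader, r) for r in sorted(members) if r != leader)
            return
        # Partition the members by their label at this level.
        groups = defaultdict(list)
        for r in members:
            groups[coords[r][level]].append(r)
        # The current leader represents its own group; other groups are
        # represented by their lowest rank (an arbitrary choice).
        sub_leaders = {
            label: (leader if leader in ranks else min(ranks))
            for label, ranks in groups.items()
        }
        # One message per remote group crosses this (slow) level.
        for label in sorted(groups):
            if sub_leaders[label] != leader:
                edges.append((leader, sub_leaders[label]))
        # Each group then repeats the procedure one level down.
        for label in sorted(groups):
            spread(sub_leaders[label], groups[label], level + 1)

    spread(root, list(coords), 0)
    return edges


# Hypothetical layout: two sites, each with two machines.
coords = {
    0: ("A", "m1"), 1: ("A", "m1"), 2: ("A", "m2"),
    3: ("B", "m3"), 4: ("B", "m3"), 5: ("B", "m4"),
}
edges = multilevel_bcast_edges(coords, root=0)
wide_area = [(s, r) for s, r in edges if coords[s][0] != coords[r][0]]
```

In this example only one message crosses between sites, and each remote machine is reached by a single message; all remaining traffic stays inside a machine. A two-layer scheme would distinguish sites only and treat every process inside a site as equally close.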
Quantum Monte Carlo Study of Strongly Correlated Electrons: Cellular Dynamical Mean-Field Theory
We study the Hubbard model using the Cellular Dynamical Mean-Field Theory
(CDMFT) with quantum Monte Carlo (QMC) simulations. We present the algorithmic
details of CDMFT with the Hirsch-Fye QMC method for the solution of the
self-consistently embedded quantum cluster problem. We use the one- and
two-dimensional half-filled Hubbard model to gauge the performance of CDMFT+QMC
particularly for small clusters by comparing with the exact results and also
with other quantum cluster methods. We calculate single-particle Green's
functions and self-energies on small clusters to study their size dependence in
one and two dimensions. Comment: 14 pages, 18 figures
Finding apparent horizons and other two-surfaces of constant expansion
Apparent horizons are structures of spacelike hypersurfaces that can be
determined locally in time. Closed surfaces of constant expansion (CE surfaces)
are a generalisation of apparent horizons. I present an efficient method for
locating CE surfaces. This method uses an explicit representation of the
surface, allowing for arbitrary resolutions and, in principle, shapes. The CE
surface equation is then solved as a nonlinear elliptic equation.
It is reasonable to assume that CE surfaces foliate a spacelike hypersurface
outside of some interior region, thus defining an invariant (but still
slicing-dependent) radial coordinate. This can be used to determine gauge modes
and to compare time evolutions with different gauge conditions. CE surfaces
also provide an efficient way to find new apparent horizons as they appear e.g.
in binary black hole simulations. Comment: 21 pages, 8 figures; two references added
PETSc 2.0 Users Manual: Revision 2.0.16
This manual describes the use of PETSc 2.0 for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc 2.0 uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear and nonlinear equation solvers that may be used in application codes written in Fortran, C, and C++. PETSc provides many of the mechanisms needed within parallel application codes, such as simple parallel matrix and vector assembly routines that allow the overlap of communication and computation. In addition, PETSc includes growing support for distributed arrays. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background or experience programming in C, Pascal, or C++, it may require a large amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates make the efficient implementation of many application codes much simpler than rolling them yourself. For many simple tasks a package such as Matlab is often the best tool; PETSc is not intended for the classes of problems for which effective Matlab code can be written. Since PETSc is still under development, small changes in usage and calling sequences of PETSc routines will continue to occur.
New, efficient, and accurate high order derivative and dissipation operators satisfying summation by parts, and applications in three-dimensional multi-block evolutions
We construct new, efficient, and accurate high-order finite differencing
operators which satisfy summation by parts. Since these operators are not
uniquely defined, we consider several optimization criteria: minimizing the
bandwidth, the truncation error on the boundary points, the spectral radius, or
a combination of these. We examine in detail a set of operators that are up to
tenth order accurate in the interior, and we surprisingly find that a
combination of these optimizations can improve the operators' spectral radius
and accuracy by orders of magnitude in certain cases. We also construct
high-order dissipation operators that are compatible with these new finite
difference operators and which are semi-definite with respect to the
appropriate summation by parts scalar product. We test the stability and
accuracy of these new difference and dissipation operators by evolving a
three-dimensional scalar wave equation on a spherical domain consisting of
seven blocks, each discretized with a structured grid, and connected through
penalty boundary conditions. Comment: 16 pages, 9 figures. The files with the coefficients for the
derivative and dissipation operators can be accessed by downloading the
source code for the document. The files are located in the "coeffs"
subdirectory.
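The summation-by-parts (SBP) property mentioned in this abstract can be checked numerically on a small case. The sketch below does not use the paper's high-order optimized operators; it assumes the textbook second-order operator with a diagonal norm H, and verifies that Q = H D satisfies Q + Q^T = diag(-1, 0, ..., 0, 1), the discrete analogue of integration by parts. The function names are illustrative.

```python
from fractions import Fraction


def sbp_second_order(n, h=Fraction(1)):
    """Return (H, D) for the classic second-order SBP derivative on n >= 2 points.

    H holds the diagonal of the norm matrix; D is the difference operator.
    """
    half = Fraction(1, 2)
    H = [h] * n
    H[0] = H[-1] = h * half          # boundary weights are halved
    D = [[Fraction(0)] * n for _ in range(n)]
    D[0][0], D[0][1] = -1 / h, 1 / h          # one-sided stencil, left edge
    D[-1][-2], D[-1][-1] = -1 / h, 1 / h      # one-sided stencil, right edge
    for i in range(1, n - 1):
        D[i][i - 1], D[i][i + 1] = -half / h, half / h  # centered stencil
    return H, D


def sbp_boundary_matrix(H, D):
    """Return Q + Q^T, where Q = H D and H is diagonal."""
    n = len(H)
    Q = [[H[i] * D[i][j] for j in range(n)] for i in range(n)]
    return [[Q[i][j] + Q[j][i] for j in range(n)] for i in range(n)]


n = 6
H, D = sbp_second_order(n, Fraction(1, 10))
S = sbp_boundary_matrix(H, D)

# Expected result for an SBP operator: only the two corner entries survive.
B = [[0] * n for _ in range(n)]
B[0][0], B[-1][-1] = -1, 1
```

Exact rational arithmetic is used so that the comparison `S == B` holds without a floating-point tolerance. Higher-order SBP operators have wider boundary closures with free parameters, which is the freedom the abstract refers to when it says the operators are not uniquely defined.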
Improving the Performance of Tensor Matrix Vector Multiplication in Quantum Chemistry Codes.
Cumulative reaction probability (CRP) calculations provide a viable computational approach to estimate reaction rate coefficients. However, in order to give meaningful results these calculations should be done in many dimensions (ten to fifteen). This makes CRP codes memory intensive. For this reason, these codes use iterative methods to solve the linear systems, where a good fraction of the execution time is spent on matrix-vector multiplication. In this paper, we discuss the tensor product form of applying the system operator on a vector. This approach shows much better performance and provides huge savings in memory as compared to the explicit sparse representation of the system matrix.
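The memory argument can be seen in the simplest tensor product case. The abstract does not give the actual form of the CRP operator, so the sketch below assumes a single two-factor Kronecker product A ⊗ B with square factors, and applies it to a vector without ever building the full matrix. The function names are illustrative.

```python
def kron_matvec(A, B, x):
    """Compute (A kron B) x using only the small factors A (p x p) and B (q x q).

    x has length p*q and is viewed as a p x q array X in row-major order,
    so that the result is A X B^T flattened back to a vector.
    """
    p, q = len(A), len(B)
    X = [x[i * q:(i + 1) * q] for i in range(p)]
    # Apply B along the fast index: T = X B^T.
    T = [[sum(B[j][l] * X[k][l] for l in range(q)) for j in range(q)]
         for k in range(p)]
    # Apply A along the slow index: Y = A T.
    Y = [[sum(A[i][k] * T[k][j] for k in range(p)) for j in range(q)]
         for i in range(p)]
    return [v for row in Y for v in row]


def kron_explicit(A, B):
    """Build the full (p*q) x (p*q) Kronecker product, for comparison only."""
    p, q = len(A), len(B)
    return [[A[i][k] * B[j][l] for k in range(p) for l in range(q)]
            for i in range(p) for j in range(q)]


A = [[1, 2], [3, 4]]
B = [[0, 1, 0], [1, 0, 2], [0, 3, 1]]
x = [1, 2, 3, 4, 5, 6]

fast = kron_matvec(A, B, x)
K = kron_explicit(A, B)
slow = [sum(K[r][c] * x[c] for c in range(len(x))) for r in range(len(K))]
```

The factored form stores p² + q² entries instead of (pq)², and the gap widens with every additional factor, which is consistent with the savings the abstract reports for ten to fifteen dimensions.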
Simplified Linear Equation Solvers users manual
The solution of large sparse systems of linear equations is at the heart of many algorithms in scientific computing. The SLES package is a set of easy-to-use yet powerful and extensible routines for solving large sparse linear systems. The design of the package allows new techniques to be used in existing applications without any source code changes in the applications.