23,740 research outputs found
Recommended from our members
A comparison of performance-enhancing strategies for parallel numerical object-oriented frameworks
Performance short of that of C or FORTRAN 77 is a significant obstacle to general acceptance of object-oriented C++ frameworks in high-performance parallel scientific computing, nonetheless, their value in simplifying complex computations is inarguable. Singular data points of good performance for object-oriented libraries/frameworks have been interesting, but a systematic analysis of the performance issues has not been done. This paper explores just a few of these issues and reports on the use of three mechanisms for enhancing the performance of object-oriented frameworks within numerical computation. The first is the commonly-use of binary overloaded operators (though implemented with substantial internal optimizations), the second is the use of expression templates, and the third is the use of an optimizing preprocessor. The first two have been completely implemented and are available within the A++/P++ array class library, the third, ROSE++, represents work in progress. This paper provides some perspective on the types of optimizations that the authors consider important within their numerical applications using OVERTURE involving complex geometry and AMR on parallel architectures
Automating embedded analysis capabilities and managing software complexity in multiphysics simulation part I: template-based generic programming
An approach for incorporating embedded simulation and analysis capabilities
in complex simulation codes through template-based generic programming is
presented. This approach relies on templating and operator overloading within
the C++ language to transform a given calculation into one that can compute a
variety of additional quantities that are necessary for many state-of-the-art
simulation and analysis algorithms. An approach for incorporating these ideas
into complex simulation codes through general graph-based assembly is also
presented. These ideas have been implemented within a set of packages in the
Trilinos framework and are demonstrated on a simple problem from chemical
engineering
Expression Templates Revisited: A Performance Analysis of the Current ET Methodology
In the last decade, Expression Templates (ET) have gained a reputation as an
efficient performance optimization tool for C++ codes. This reputation builds
on several ET-based linear algebra frameworks focused on combining both elegant
and high-performance C++ code. However, on closer examination the assumption
that ETs are a performance optimization technique cannot be maintained. In this
paper we demonstrate and explain the inability of current ET-based frameworks
to deliver high performance for dense and sparse linear algebra operations, and
introduce a new "smart" ET implementation that truly allows the combination of
high performance code with the elegance and maintainability of a
domain-specific language.Comment: 16 pages, 7 figure
The LifeV library: engineering mathematics beyond the proof of concept
LifeV is a library for the finite element (FE) solution of partial
differential equations in one, two, and three dimensions. It is written in C++
and designed to run on diverse parallel architectures, including cloud and high
performance computing facilities. In spite of its academic research nature,
meaning a library for the development and testing of new methods, one
distinguishing feature of LifeV is its use on real world problems and it is
intended to provide a tool for many engineering applications. It has been
actually used in computational hemodynamics, including cardiac mechanics and
fluid-structure interaction problems, in porous media, ice sheets dynamics for
both forward and inverse problems. In this paper we give a short overview of
the features of LifeV and its coding paradigms on simple problems. The main
focus is on the parallel environment which is mainly driven by domain
decomposition methods and based on external libraries such as MPI, the Trilinos
project, HDF5 and ParMetis.
Dedicated to the memory of Fausto Saleri.Comment: Review of the LifeV Finite Element librar
C++ Templates as Partial Evaluation
This paper explores the relationship between C++ templates and partial
evaluation. Templates were designed to support generic programming, but
unintentionally provided the ability to perform compile-time computations and
code generation. These features are completely accidental, and as a result
their syntax is awkward. By recasting these features in terms of partial
evaluation, a much simpler syntax can be achieved. C++ may be regarded as a
two-level language in which types are first-class values. Template
instantiation resembles an offline partial evaluator. This paper describes
preliminary work toward a single mechanism based on Partial Evaluation which
unifies generic programming, compile-time computation and code generation. The
language Catat is introduced to illustrate these ideas.Comment: 13 page
A Massive Data Parallel Computational Framework for Petascale/Exascale Hybrid Computer Systems
Heterogeneous systems are becoming more common on High Performance Computing
(HPC) systems. Even using tools like CUDA and OpenCL it is a non-trivial task
to obtain optimal performance on the GPU. Approaches to simplifying this task
include Merge (a library based framework for heterogeneous multi-core systems),
Zippy (a framework for parallel execution of codes on multiple GPUs), BSGP (a
new programming language for general purpose computation on the GPU) and
CUDA-lite (an enhancement to CUDA that transforms code based on annotations).
In addition, efforts are underway to improve compiler tools for automatic
parallelization and optimization of affine loop nests for GPUs and for
automatic translation of OpenMP parallelized codes to CUDA.
In this paper we present an alternative approach: a new computational
framework for the development of massively data parallel scientific codes
applications suitable for use on such petascale/exascale hybrid systems built
upon the highly scalable Cactus framework. As the first non-trivial
demonstration of its usefulness, we successfully developed a new 3D CFD code
that achieves improved performance.Comment: Parallel Computing 2011 (ParCo2011), 30 August -- 2 September 2011,
Ghent, Belgiu
- …