22,511 research outputs found
Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework
We describe a set of lower-level abstractions to improve performance on
modern large scale heterogeneous systems. These provide portable access to
system- and hardware-dependent features, automatically apply dynamic
optimizations at run time, and target stencil-based codes used in finite
differencing, finite volume, or block-structured adaptive mesh refinement
codes.
These abstractions include a novel data structure to manage refinement
information for block-structured adaptive mesh refinement, an iterator
mechanism to efficiently traverse multi-dimensional arrays in stencil-based
codes, and a portable API and implementation for explicit SIMD vectorization.
These abstractions can either be employed manually, or be targeted by
automated code generation, or be used via support libraries by compilers during
code generation. The implementations described below are available in the
Cactus framework, and are used e.g. in the Einstein Toolkit for relativistic
astrophysics simulations
A Language and Hardware Independent Approach to Quantum-Classical Computing
Heterogeneous high-performance computing (HPC) systems offer novel
architectures which accelerate specific workloads through judicious use of
specialized coprocessors. A promising architectural approach for future
scientific computations is provided by heterogeneous HPC systems integrating
quantum processing units (QPUs). To this end, we present XACC (eXtreme-scale
ACCelerator) --- a programming model and software framework that enables
quantum acceleration within standard or HPC software workflows. XACC follows a
coprocessor machine model that is independent of the underlying quantum
computing hardware, thereby enabling quantum programs to be defined and
executed on a variety of QPUs types through a unified application programming
interface. Moreover, XACC defines a polymorphic low-level intermediate
representation, and an extensible compiler frontend that enables language
independent quantum programming, thus promoting integration and
interoperability across the quantum programming landscape. In this work we
define the software architecture enabling our hardware and language independent
approach, and demonstrate its usefulness across a range of quantum computing
models through illustrative examples involving the compilation and execution of
gate and annealing-based quantum programs
Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis
Sympiler is a domain-specific code generator that optimizes sparse matrix
computations by decoupling the symbolic analysis phase from the numerical
manipulation stage in sparse codes. The computation patterns in sparse
numerical methods are guided by the input sparsity structure and the sparse
algorithm itself. In many real-world simulations, the sparsity pattern changes
little or not at all. Sympiler takes advantage of these properties to
symbolically analyze sparse codes at compile-time and to apply inspector-guided
transformations that enable applying low-level transformations to sparse codes.
As a result, the Sympiler-generated code outperforms highly-optimized matrix
factorization codes from commonly-used specialized libraries, obtaining average
speedups over Eigen and CHOLMOD of 3.8X and 1.5X respectively.Comment: 12 page
- …