413 research outputs found
HALO 1.0: A Hardware-agnostic Accelerator Orchestration Framework for Enabling Hardware-agnostic Programming with True Performance Portability for Heterogeneous HPC
This paper presents HALO 1.0, an open-ended extensible multi-agent software
framework that implements a set of proposed hardware-agnostic accelerator
orchestration (HALO) principles. HALO implements a novel compute-centric
message passing interface (C^2MPI) specification for enabling the
performance-portable execution of a hardware-agnostic host application across
heterogeneous accelerators. Experimental results from evaluating eight widely
used HPC subroutines on Intel Xeon E5-2620 CPUs, Intel Arria 10 GX FPGAs,
and NVIDIA GeForce RTX 2080 Ti GPUs show that HALO 1.0 provides a unified
control flow that lets host programs run across all of these computing devices
with a consistently top performance portability score, up to five orders of
magnitude higher than that of the OpenCL-based solution.
Comment: 21 pages
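The abstract does not say how its performance portability score is computed. As a rough illustration only, the sketch below assumes the commonly used harmonic-mean definition (the harmonic mean of an application's efficiency on each platform, or zero if any platform is unsupported). The function name, the platform labels, and the efficiency values are hypothetical and are not taken from the paper.

```python
def performance_portability(efficiencies):
    """Harmonic mean of per-platform efficiencies (assumed definition).

    `efficiencies` maps a platform label to an efficiency in (0, 1],
    or to None when the application does not run on that platform.
    Returns 0.0 if any platform is unsupported.
    """
    if not efficiencies:
        raise ValueError("at least one platform is required")
    values = list(efficiencies.values())
    # An unsupported (or zero-efficiency) platform makes the score zero.
    if any(e is None or e <= 0 for e in values):
        return 0.0
    return len(values) / sum(1.0 / e for e in values)


# Hypothetical efficiencies, chosen only to show the arithmetic.
example = {"cpu": 0.5, "fpga": 0.25, "gpu": 1.0}
score = performance_portability(example)  # 3 / (2 + 4 + 1) = 3/7

# A single unsupported platform drives the score to zero.
unsupported = performance_portability({"cpu": 0.9, "fpga": None})
```

Because the harmonic mean is dominated by the smallest term, one poorly performing platform can lower the score by orders of magnitude, which is one way gaps as large as the one reported above can arise under this definition.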
Morpheus unleashed: Fast cross-platform SpMV on emerging architectures
Sparse matrices and linear algebra are at the heart of scientific
simulations. Over the years, more than 70 sparse matrix storage formats have
been developed, targeting a wide range of hardware architectures and matrix
types, each of which exploits the particular strengths of an architecture or
the specific sparsity patterns of the matrices.
In this work, we explore the suitability of storage formats such as COO, CSR
and DIA for emerging architectures such as AArch64 CPUs and FPGAs. In addition,
we detail hardware-specific optimisations for these targets and evaluate the
potential of each contribution to be integrated into Morpheus, a modern library
that provides an abstraction of sparse matrices (currently) across x86 CPUs and
NVIDIA/AMD GPUs. Finally, we validate our work by comparing the performance of
the Morpheus-enabled HPCG benchmark against vendor-optimised implementations.
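To make the storage formats named above concrete, here is a minimal pure-Python sketch of sparse matrix-vector multiplication (SpMV) in the COO and CSR formats. It is not Morpheus code and does not reflect the library's API; the function names and the small 3x3 matrix are assumptions made for illustration.

```python
def spmv_coo(rows, cols, vals, x, n_rows):
    """y = A @ x with A stored as COO (row, column, value) triplets."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x with A stored as CSR.

    Row i owns the entries vals[row_ptr[i]:row_ptr[i + 1]].
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
vals = [4.0, 1.0, 3.0, 2.0, 5.0]
cols = [0, 2, 1, 0, 2]
rows = [0, 0, 1, 2, 2]      # COO row indices
row_ptr = [0, 2, 3, 5]      # CSR row offsets
x = [1.0, 2.0, 3.0]

y_coo = spmv_coo(rows, cols, vals, x, n_rows=3)
y_csr = spmv_csr(row_ptr, cols, vals, x)
```

CSR replaces the per-entry row index with one offset per row, which makes row-wise traversal cheap. COO keeps every entry self-describing, which is simpler to build and to partition. Trade-offs of this kind are why the best format depends on both the architecture and the sparsity pattern.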
A Survey of Processing Systems for Phylogenetics and Population Genetics
The COVID-19 pandemic brought bioinformatics into the spotlight, revealing that several existing methods, algorithms, and tools were not well prepared to handle large amounts of genomic data efficiently. This led to prohibitively long execution times and the need to reduce the extent of analyses to obtain results in a reasonable amount of time. In this survey, we review available high-performance computing and hardware-accelerated systems based on FPGA and GPU technology. Optimized and hardware-accelerated systems can conduct more thorough analyses considerably faster than pure software implementations, allowing researchers to reach important conclusions in a timely manner and drive scientific discoveries. We discuss the reasons that currently hinder high-performance solutions from being widely deployed in real-world biological analyses and describe a research direction that can pave the way to enable this.