630 research outputs found
ASCR/HEP Exascale Requirements Review Report
This draft report summarizes and details the findings, results, and
recommendations derived from the ASCR/HEP Exascale Requirements Review meeting
held in June 2015. The main conclusions are as follows. 1) Larger, more
capable computing and data facilities are needed to support HEP science goals
in all three frontiers: Energy, Intensity, and Cosmic. By 2025, the expected
scale of demand is at least two orders of magnitude greater -- and in some
cases more -- than what is currently available. 2) The growth rate of data
produced by simulations is overwhelming the current ability of both facilities
and researchers to store and analyze it. Additional resources and new
techniques for data analysis are urgently needed. 3) Data rates and volumes
from HEP experimental facilities are also straining the ability to store and
analyze such large and complex datasets. Appropriately configured
leadership-class facilities can play a transformational role in enabling
scientific discovery from these datasets. 4) A close integration of HPC
simulation and data analysis will aid greatly in interpreting results from HEP
experiments. Such an integration will minimize data movement and facilitate
interdependent workflows. 5) Long-range planning between HEP and ASCR will be
required to meet HEP's research needs. To best use ASCR HPC resources the
experimental HEP program needs a) an established long-term plan for access to
ASCR computational and data resources, b) an ability to map workflows onto HPC
resources, c) the ability for ASCR facilities to accommodate workflows run by
collaborations that can have thousands of individual members, d) to transition
codes to the next-generation HPC platforms that will be available at ASCR
facilities, e) to build up and train a workforce capable of developing and
using simulations and analysis to support HEP scientific research on
next-generation systems.
Comment: 77 pages, 13 figures; draft report, subject to further revision
Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics
The diversity of workload requirements and increasing hardware heterogeneity
in emerging high performance computing (HPC) systems motivate resource
disaggregation. Resource disaggregation allows compute and memory resources to
be allocated individually as required to each workload. However, it is unclear
how to efficiently realize this capability and cost-effectively meet the
stringent bandwidth and latency requirements of HPC applications. To that end,
we describe how modern photonics can be co-designed with modern HPC racks to
implement flexible intra-rack resource disaggregation and fully meet the bit
error rate (BER) and high escape bandwidth of all chip types in modern HPC
racks. Our photonic-based disaggregated rack provides an average application
speedup of 11% (46% maximum) for 25 CPU and 61% for 24 GPU benchmarks compared
to a similar system that instead uses modern electronic switches for
disaggregation. Using observed resource usage from a production system, we
estimate that an iso-performance intra-rack disaggregated HPC system using
photonics would require 4x fewer memory modules and 2x fewer NICs than a
non-disaggregated baseline.
Comment: 15 pages, 12 figures, 4 tables. Published in IEEE Cluster 202
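The iso-performance hardware savings reported above rest on a pooling argument: a per-node design must size each node for its own peak demand, while a disaggregated rack only needs capacity for the peak *aggregate* demand. A minimal sketch of that reasoning follows; the demand numbers are made up for illustration, and the paper's 4x/2x figures come from measured production usage, not this toy model.

```python
import numpy as np

rng = np.random.default_rng(1)
nodes, timesteps = 16, 1000

# Hypothetical per-node memory demand over time (GiB): bursty and
# uncorrelated across nodes, so peaks rarely coincide.
demand = rng.gamma(shape=2.0, scale=20.0, size=(nodes, timesteps))

per_node_provision = demand.max(axis=1).sum()  # each node sized for its own peak
pooled_provision = demand.sum(axis=0).max()    # pool sized for the aggregate peak

print(f"per-node: {per_node_provision:.0f} GiB, "
      f"pooled: {pooled_provision:.0f} GiB, "
      f"savings: {per_node_provision / pooled_provision:.1f}x")
```

The gap between the two provisioning strategies widens as workloads become burstier and less correlated, which is exactly the regime the resource-usage study in the paper targets.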
High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm
We implement a master-slave parallel genetic algorithm (PGA) with a bespoke
log-likelihood fitness function to identify emergent clusters within price
evolutions. We use graphics processing units (GPUs) to implement a PGA and
visualise the results using disjoint minimal spanning trees (MSTs). We
demonstrate that our GPU PGA, implemented on a commercially available general
purpose GPU, is able to recover stock clusters in sub-second time, based on a
subset of stocks in the South African market. This represents a pragmatic
choice for low-cost, scalable parallel computing and is significantly faster
than a prototype serial implementation in an optimised C-based
fourth-generation programming language, although the results are not directly
comparable due to compiler differences. Combined with fast online intraday
correlation matrix estimation from high frequency data for cluster
identification, the proposed implementation offers cost-effective,
near-real-time risk assessment for financial practitioners.
Comment: 10 pages, 5 figures, 4 tables. More thorough discussion of
implementation
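The clustering machinery described above can be sketched in a few lines. This is a serial toy version: the fitness below is a simplified within-cluster mean correlation, not the paper's bespoke log-likelihood, and the per-individual fitness evaluations stand in for the independent "slave" work the paper offloads to the GPU.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(labels, corr):
    """Score a clustering: sum of mean off-diagonal correlation inside each
    cluster (a simplified stand-in for the paper's log-likelihood fitness)."""
    score = 0.0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        m = len(idx)
        if m < 2:
            continue
        sub = corr[np.ix_(idx, idx)]
        score += (sub.sum() - m) / (m * (m - 1))  # mean off-diagonal entry
    return score

def pga(corr, k=2, pop_size=60, generations=200, mutation_rate=0.05):
    """Master-slave GA: the 'master' loop does selection and mutation; the
    fitness evaluations are independent per individual, hence parallelisable."""
    n = corr.shape[0]
    pop = rng.integers(0, k, size=(pop_size, n))
    for _ in range(generations):
        scores = np.array([fitness(ind, corr) for ind in pop])  # slave work
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = elite.copy()
        mask = rng.random(children.shape) < mutation_rate
        children[mask] = rng.integers(0, k, size=mask.sum())
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind, corr) for ind in pop])
    return pop[np.argmax(scores)]

# Toy data: two blocks of strongly correlated "stocks", uncorrelated otherwise.
block = np.array([[1.0, 0.9], [0.9, 1.0]])
corr = np.kron(np.eye(2), block)  # 4x4 block-diagonal correlation matrix
best = pga(corr, k=2)
```

With this seeded toy matrix the GA recovers the two correlated blocks as the two clusters; the GPU version in the paper parallelises the fitness loop across thousands of individuals.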
Energy Concerns with HPC Systems and Applications
For various reasons, including climate change, {\em energy} has become a
critical concern in all relevant activities and technical designs. In the
specific case of computing, the problem is exacerbated by the emergence and
pervasiveness of so-called {\em intelligent devices}. On the application side,
we point out the special topic of {\em Artificial Intelligence}, which clearly
needs efficient computing support in order to succeed in its purpose of being
a {\em ubiquitous assistant}. There are mainly
two contexts where {\em energy} is one of the top priority concerns: {\em
embedded computing} and {\em supercomputing}. For the former, power consumption
is critical because the amount of energy that is available for the devices is
limited. For the latter, the heat dissipated is a serious source of failure and
the financial cost related to energy is likely to be a significant part of the
maintenance budget. On a single computer, the problem is commonly considered
through electrical power consumption. In this paper, written in the form of a
survey, we depict the landscape of energy concerns in computer activities,
from both the hardware and the software standpoints.
Comment: 20 pages
Analyzing Resource Utilization in an HPC System: A Case Study of NERSC Perlmutter
Resource demands of HPC applications vary significantly. However, it is
common for HPC systems to primarily assign resources on a per-node basis to
prevent interference from co-located workloads. This gap between the
coarse-grained resource allocation and the varying resource demands can lead to
HPC resources being not fully utilized. In this study, we analyze the resource
usage and application behavior of NERSC's Perlmutter, a state-of-the-art
open-science HPC system with both CPU-only and GPU-accelerated nodes. Our
one-month usage analysis reveals that CPUs are commonly not fully utilized,
especially for GPU-enabled jobs. Also, around 64% of both CPU-only and
GPU-enabled jobs used 50% or less of the available host memory capacity.
Additionally, about 50% of GPU-enabled jobs used at most 25% of the GPU
memory, so memory capacity was underutilized in some respect across all job
types. While our study comes early in Perlmutter's lifetime, and thus policies
and application workloads may change, it provides valuable insights into
performance characterization and application behavior, and it motivates
systems with more fine-grained resource allocation.
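The per-job utilization breakdown reported above can be sketched as follows. The job records and field layout here are invented for illustration and are not Perlmutter's actual telemetry schema.

```python
import numpy as np

# Hypothetical per-job records, one row per job:
# (host_mem_used_GiB, host_mem_avail_GiB, gpu_mem_used_GiB, gpu_mem_avail_GiB)
jobs = np.array([
    [100, 512, 10, 40],
    [400, 512, 35, 40],
    [ 50, 256,  5, 40],
    [200, 256, 30, 40],
])

host_frac = jobs[:, 0] / jobs[:, 1]  # fraction of host memory actually used
gpu_frac = jobs[:, 2] / jobs[:, 3]   # fraction of GPU memory actually used

share_low_host = np.mean(host_frac <= 0.5)   # share of jobs at <=50% host memory
share_low_gpu = np.mean(gpu_frac <= 0.25)    # share of jobs at <=25% GPU memory
print(f"{share_low_host:.0%} of jobs used <=50% of host memory")
print(f"{share_low_gpu:.0%} of jobs used <=25% of GPU memory")
```

Aggregating such fractions over a month of real job records yields the 64% and 50% figures quoted in the abstract; with the toy data above, both shares come out to 50%.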