The AXIOM software layers
The AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA). This paper explains the software layers developed in the AXIOM project; OmpSs provides an easy way to execute heterogeneous codes on multiple cores. People and objects will soon share the same digital network for information exchange, in what has been called the age of cyber-physical systems. The general expectation is that people and systems will interact in real time. This puts pressure on systems design to support increasing demands for computational power while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important for these systems to become widespread. This whole set of expectations imposes scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet these expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core, multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded in each module. To this aim, an innovative ARM- and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics). Peer reviewed. Postprint (author's final draft).
Performance and Power Analysis of HPC Workloads on Heterogeneous Multi-Node Clusters
Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Given the increasing interest of the High Performance Computing (HPC) community in energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques are applicable to different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU. The research leading to these results has received funding from the European Community's Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially funded by "Contributo 5 per mille assegnato all'Università degli Studi di Ferrara-dichiarazione dei redditi dell'anno 2014". We thank the University of Ferrara and INFN Ferrara for access to the COKA cluster. We warmly thank the BSC tools group for supporting the smooth integration and testing of our setup within Extrae and Paraver. Peer reviewed. Postprint (published version).
Supercomputing Frontiers
This open access book constitutes the refereed proceedings of the 7th Asian Supercomputing Conference, SCFA 2022, which took place in Singapore in March 2022. The 8 full papers presented in this book were carefully reviewed and selected from 21 submissions. They cover a range of topics including file systems, memory hierarchy, HPC cloud platform, container image configuration workflow, large-scale applications, and scheduling.
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network
Novel low-diameter network topologies such as Slim Fly (SF) offer significant cost and power advantages over the established Fat Tree, Clos, or Dragonfly. To spearhead the adoption of low-diameter networks, we design, implement, deploy, and evaluate the first real-world SF installation. We focus on deployment, management, and operational aspects of our test cluster with 200 servers and carefully analyze performance. We demonstrate techniques for simple cabling and cabling validation as well as a novel high-performance routing architecture for InfiniBand-based low-diameter topologies. Our real-world benchmarks show SF's strong performance for many modern workloads such as deep neural network training, graph analytics, or linear algebra kernels. SF outperforms non-blocking Fat Trees in scalability while offering comparable or better performance and lower cost for large network sizes. Our work can facilitate deploying SF, while the associated (open-source) routing architecture is fully portable and applicable to accelerate any low-diameter interconnect.
Modelling Energy Consumption based on Resource Utilization
Power management is an expensive and important issue for large computational infrastructures such as datacenters, large clusters, and computational grids. However, measuring the energy consumption of scalable systems may be impractical due to both the cost and complexity of deploying power metering devices on a large number of machines. In this paper, we propose the use of information about resource utilization (e.g. processor, memory, disk operations, and network traffic) as proxies for estimating power consumption. We employ machine learning techniques to estimate power consumption from such information, which is provided by common operating systems. Experiments with linear regression, regression trees, and multilayer perceptrons on data from different hardware resulted in a model with 99.94% accuracy and a 6.32-watt error in the best case. Comment: Submitted to the Journal of Supercomputing on 14th June, 201
Supercomputing Frontiers
This open access book constitutes the refereed proceedings of the 6th Asian Supercomputing Conference, SCFA 2020, which was planned to be held in February 2020, but unfortunately the physical conference was cancelled due to the COVID-19 pandemic. The 8 full papers presented in this book were carefully reviewed and selected from 22 submissions. They cover a range of topics including file systems, memory hierarchy, HPC cloud platform, container image configuration workflow, large-scale applications, and scheduling.