7 research outputs found

    EuroEXA - D2.6: Final ported application software

    This document describes the porting of the EuroEXA application software to the single CRDB testbed and discusses the experience gained from the porting and optimization activities, which should be taken into account in future redesign and optimization. This document accompanies the ported application software, found in the EuroEXA private repository (https://github.com/euroexa). In particular, it describes the status of the software for each of the EuroEXA applications, sketches the redesign and optimization strategy for each application, and discusses the issues and difficulties faced during the porting activities and the lessons learned. A few preliminary evaluation results are presented; the full evaluation will be discussed in deliverable 2.8.

    Single Event Effects Assessment of UltraScale+ MPSoC Systems under Atmospheric Radiation

    The AMD UltraScale+ XCZU9EG device is a Multi-Processor System-on-Chip (MPSoC) with embedded Programmable Logic (PL) that excels in many Edge (e.g., automotive or avionics) and Cloud (e.g., data centres) terrestrial applications. However, it incorporates a large amount of SRAM cells, making the device vulnerable to Neutron-induced Single Event Upsets (NSEUs) or otherwise soft errors. Semiconductor vendors incorporate soft error mitigation mechanisms to recover memory upsets (i.e., faults) before they propagate to the application output and become an error. But how effective are the MPSoC's mitigation schemes? Can they effectively recover upsets in high altitude or large scale applications under different workloads? This article answers the above research questions through a solid study that entails accelerated neutron radiation testing and dependability analysis. We test the device on a broad range of workloads, like multi-threaded software used for pose estimation and weather prediction or a software/hardware (SW/HW) co-design image classification application running on the AMD Deep Learning Processing Unit (DPU). Assuming a one-node MPSoC system in New York City (NYC) at 40k feet, all tested software applications achieve a Mean Time To Failure (MTTF) greater than 148 months, which shows that upsets are effectively recovered in the processing system of the MPSoC. However, the SW/HW co-design (i.e., DPU) in the same one-node system at 40k feet has an MTTF = 4 months due to the high failure rate of its PL accelerator, which emphasises that some MPSoC workloads may require additional NSEU mitigation schemes. Nevertheless, we show that the MTTF of the DPU can increase to 87 months without any overhead if one disregards the failure rate of tolerable errors since they do not affect the correctness of the classification output.
    Comment: This manuscript is under review at IEEE Transactions on Reliability

    Exploring the acceleration of the Met Office NERC Cloud model using FPGAs

    The use of Field Programmable Gate Arrays (FPGAs) to accelerate computational kernels has the potential to be of great benefit to scientific codes and the HPC community in general. With the recent developments in FPGA programming technology, the ability to port kernels is becoming far more accessible. However, to gain reasonable performance from this technology it is not enough to simply transfer a code onto the FPGA; instead, the algorithm must be rethought and recast in a data-flow style to suit the target architecture. In this paper we describe the porting, via HLS, of one of the most computationally intensive kernels of the Met Office NERC Cloud model (MONC), an atmospheric model used by climate and weather researchers, onto an FPGA. We describe in detail the steps taken to adapt the algorithm to make it suitable for the architecture and the impact this has on kernel performance. Using a PCIe mounted FPGA with on-board DRAM, we consider the integration of this kernel within a larger infrastructure and explore the performance characteristics of our approach in contrast to Intel CPUs that are popular in modern HPC machines, over problem sizes involving very large grids. The result of this work is an experience report detailing the challenges faced and lessons learnt in porting this complex computational kernel to FPGAs, as well as exploring the role that FPGAs can play and their fundamental limits in accelerating traditional HPC workloads.
    Comment: Preprint of article in proceedings, ISC High Performance 2019. Lecture Notes in Computer Science, vol 1188

    First steps in porting the LFRic Weather and Climate model to the FPGAs of the EuroExa architecture

    In recent years, there has been renewed interest in the use of field-programmable gate arrays (FPGAs) for high-performance computing (HPC). In this paper, we explore the techniques required by traditional HPC programmers in porting HPC applications to FPGAs, using as an example the LFRic weather and climate model. We report on the first steps in porting LFRic to the FPGAs of the EuroExa architecture. We have used Vivado High-Level Synthesis to implement a matrix-vector kernel from the LFRic code on a Xilinx UltraScale+ development board containing an XCZU9EG multiprocessor system-on-chip. We describe the porting of the code, discuss the optimization decisions, and report performance of 5.34 Gflop/s with double precision and 5.58 Gflop/s with single precision. We discuss sources of inefficiencies, comparisons with peak performance, comparisons with CPU and GPU performance (taking into account power and price), comparisons with published techniques, and comparisons with published performance, and we conclude with some comments on the prospects for future progress with FPGA acceleration of the weather forecast model. The realization of practical exascale-class high-performance computing systems requires significant improvements in the energy efficiency of such systems and their components. This has generated interest in computer architectures which utilize accelerators alongside traditional CPUs. FPGAs offer huge potential as an accelerator which can deliver performance for scientific applications at high levels of energy efficiency. The EuroExa project is developing and building a high-performance architecture based upon ARM CPUs with FPGA acceleration targeting exascale-class performance within a realistic power budget.