From a FPGA Prototyping Platform to a Computing Platform: The MANGO Experience
In this paper we describe the evolution of the FPGA-based prototype deployed in the MANGO project, from a hardware prototyping platform for HPC architectures to a computing platform targeting HPC and AI applications. Our main goal is to reinvest in the MANGO cluster by enabling its dual use for both large-scale hardware prototyping and high-performance computation. From our experience we draw several conclusions about the complexities and hurdles that lie beneath FPGA technologies, shedding some light on the real difficulties that hinder the adoption of FPGAs in either large-scale pure HPC systems or hybrid systems (HPC + Big Data/AI).
This work is supported by the European Commission through the RECIPE and DeepHealth projects, under the Horizon 2020 program, grant numbers 801137 and 825111, respectively.
Flich Cardo, J.; Tornero-Gavilá, R.; Rodríguez, D.; Russo, D.; Martínez Martínez, JM.; Hernández Luz, C. (2021). From a FPGA Prototyping Platform to a Computing Platform: The MANGO Experience. IEEE. 1-6. https://doi.org/10.23919/DATE51398.2021.94740511
A Host Interface Architecture and Implementation for ATM Networks
The advent of high-speed networks has increased demands on processor architectures. These architectural demands are due to the increase in network bandwidth relative to the speeds of processor components. One important component of a high-performance system is the workstation-to-network host interface. The solution presented in this thesis migrates a carefully selected set of protocol processing functions into hardware. The host interface is highly parallel, and all per-cell functions are performed by dedicated logic to maximize performance. There is a clean separation between the interface functions, such as segmentation and reassembly, and the interface/host communication. This architecture has been realized in a prototype which connects an IBM RISC System/6000 workstation to a SONET-based ATM network carrying data at the OC-3c rate of 155 Mbps.
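As a rough illustration of the segmentation and reassembly functions mentioned in the abstract, the sketch below splits a frame into fixed-size cell payloads and joins them back together. It assumes only the standard ATM cell format (53 bytes: a 5-byte header plus a 48-byte payload). The function names `segment` and `reassemble` are hypothetical, and the sketch omits cell headers, adaptation-layer trailers, and error checks; it is not the thesis's hardware design.

```python
CELL_PAYLOAD = 48  # payload bytes in a 53-byte ATM cell (5-byte header not modeled)


def segment(frame: bytes) -> list[bytes]:
    """Split a frame into 48-byte cell payloads, zero-padding the last one."""
    cells = []
    for offset in range(0, len(frame), CELL_PAYLOAD):
        chunk = frame[offset:offset + CELL_PAYLOAD]
        cells.append(chunk.ljust(CELL_PAYLOAD, b"\x00"))
    return cells


def reassemble(cells: list[bytes], length: int) -> bytes:
    """Concatenate cell payloads and drop the padding.

    The original frame length is passed in explicitly here (an assumption
    of this sketch); a real adaptation layer carries it in a trailer.
    """
    return b"".join(cells)[:length]
```

For example, a 100-byte frame yields three payloads, the last carrying 4 data bytes and 44 bytes of padding.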
The AXIOM software layers
The AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA). The software layers developed in the AXIOM project are explained. OmpSs provides an easy way to execute heterogeneous codes on multiple cores.
People and objects will soon share the same digital network for information exchange, in what has been called the age of cyber-physical systems. The general expectation is that people and systems will interact in real time. This puts pressure on system design to support increasing demands on computational power while keeping a low power envelope. Additionally, modular scaling and easy programmability are important for these systems to become widespread. Together, these expectations pose scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core, multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this end, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated in key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).
Peer Reviewed. Postprint (author's final draft)
Assessment and Control of Spacecraft Charging Risks on the International Space Station
Electrical interactions between the F2 region ionospheric plasma and the 160V photovoltaic (PV) electrical power system on the International Space Station (ISS) can produce floating potentials (FP) on the ISS conducting structure of greater magnitude than are usually observed on spacecraft in low-Earth orbit. Flight through the geomagnetic field also causes magnetic induction charging of ISS conducting structure. Charging processes resulting from interaction of ISS with auroral electrons may also contribute to charging, albeit rarely. The magnitude and frequency of occurrence of possibly hazardous charging events depend on the ISS assembly stage (six more 160V PV arrays will be added to ISS), ISS flight configuration, ISS position (latitude and longitude), and the natural variability in the ionospheric flight environment. At present, ISS is equipped with two plasma contactors designed to control ISS FP to within 40 volts of the ambient F2 plasma. The negative-polarity grounding scheme utilized in the ISS 160V power system leads, naturally, to negative values of ISS FP. A negative ISS structural FP leads to application of electrostatic fields across the dielectrics that separate conducting structure from the ambient F2 plasma, thereby enabling dielectric breakdown and arcing. Degradation of some thermal control coatings and noise in electrical systems can result. Continued review and evaluation of the putative charging hazards, as required by the ISS Program Office, revealed that ISS charging could produce a risk of electric shock to the ISS crew during extravehicular activity. ISS charging risks are being evaluated in ongoing ISS charging measurement and analysis campaigns. The results of ISS charging measurements are combined with a recently developed detailed model of the ISS charging process and an extensive analysis of historical ionospheric variability data to assess ISS charging risks using Probabilistic Risk Assessment (PRA) methods.
The PRA results (estimated frequency of occurrence and severity of the charging hazards) are then used to select the hazard control strategy that provides the best overall safety and mission success environment for ISS and the ISS crew. This paper presents: 1) a summary of ISS spacecraft charging analysis, measurements, and observations made to date, 2) plans for future ISS spacecraft charging measurement campaigns, and 3) a detailed discussion of the PRA strategy used to assess ISS spacecraft charging risks and select charging hazard control strategies.
Efficient proofs of software exploitability for real-world processors
CRA
https://eprint.iacr.org/2022/1223.pdf
Published version
nsCouette – A high-performance code for direct numerical simulations of turbulent Taylor–Couette flow
We present nsCouette, a highly scalable software tool to solve the Navier–Stokes equations for incompressible fluid flow between differentially heated and independently rotating concentric cylinders. It is based on a pseudospectral spatial discretization and dynamic time-stepping. It is implemented in modern Fortran with a hybrid MPI-OpenMP parallelization scheme and is thus designed to compute turbulent flows at high Reynolds and Rayleigh numbers. An additional GPU implementation (C-CUDA) for intermediate problem sizes and a version for pipe flow (nsPipe) are also provided.
Efficient direct convolution using long SIMD instructions
This paper demonstrates that state-of-the-art proposals to compute convolutions on architectures with CPUs supporting SIMD instructions deliver poor performance for long SIMD lengths due to frequent cache conflict misses. We first discuss how to adapt the state-of-the-art SIMD direct convolution to architectures using long SIMD instructions and analyze the implications of increasing the SIMD length on the algorithm formulation. Next, we propose two new algorithmic approaches: the Bounded Direct Convolution (BDC), which adapts the amount of computation exposed to mitigate cache misses, and the Multi-Block Direct Convolution (MBDC), which redefines the activation memory layout to improve the memory access pattern. We evaluate BDC, MBDC, the state-of-the-art technique, and a proprietary library on an architecture featuring CPUs with 16,384-bit SIMD registers using ResNet convolutions. Our results show that BDC and MBDC achieve respective speed-ups of 1.44× and 1.28× compared to the state-of-the-art technique for ResNet-101, and 1.83× and 1.63× compared to the proprietary library.
This work receives EuroHPC-JU funding under grant no. 101034126, with support from the Horizon2020 program. Adrià Armejach is a Serra Hunter Fellow and has been partially supported by the Grant IJCI-2017-33945 funded by MCIN/AEI/10.13039/501100011033. Marc Casas has been partially supported by the Grant RYC-2017-23269 funded by MCIN/AEI/10.13039/501100011033 and ESF Investing in your future. This work is supported by the Spanish Ministry of Science and Technology through the PID2019-107255GB project and the Generalitat de Catalunya (contract 2017-SGR-1414).
Peer Reviewed. Postprint (author's final draft)
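For readers unfamiliar with the term, a direct convolution computes each output element straight from the input and filter values, without first rearranging the data into a matrix. The sketch below is a plain, scalar version of that loop nest, assuming stride 1 and no padding. It is not the paper's BDC or MBDC algorithm, nor its SIMD formulation; the function name `direct_conv2d` and the nested-list layout are assumptions made for illustration only.

```python
def direct_conv2d(inp, weights):
    """Naive direct convolution (stride 1, no padding).

    inp:     nested lists indexed [channel][row][col]
    weights: nested lists indexed [filter][channel][row][col]
    Returns nested lists indexed [filter][row][col].
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    K, R, S = len(weights), len(weights[0][0]), len(weights[0][0][0])
    OH, OW = H - R + 1, W - S + 1
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(K)]
    for k in range(K):              # output filters
        for c in range(C):          # input channels
            for y in range(OH):     # output rows
                for r in range(R):  # filter rows
                    for s in range(S):  # filter columns
                        w = weights[k][c][r][s]
                        row_in = inp[c][y + r]
                        row_out = out[k][y]
                        # Innermost loop walks contiguous elements; this is
                        # the kind of loop a SIMD implementation vectorizes.
                        for x in range(OW):
                            row_out[x] += w * row_in[x + s]
    return out
```

With a single 3×3 input channel holding the values 1 to 9 and a single 2×2 filter of ones, each output element is the sum of the corresponding 2×2 input window.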