
    The HPCG benchmark: analysis, shared memory preliminary improvements and evaluation on an Arm-based platform

    The High-Performance Conjugate Gradient (HPCG) benchmark complements the LINPACK benchmark in the performance evaluation coverage of large High-Performance Computing (HPC) systems. Due to its lower arithmetic intensity and higher memory pressure, HPCG is recognized as a more representative benchmark for data-center workloads and workloads with irregular memory access patterns; its popularity and acceptance are therefore rising within the HPC community. As only a small fraction of the reference version of the HPCG benchmark is parallelized with shared memory techniques (OpenMP), we introduce in this report two OpenMP parallelization methods. Due to the increasing importance of the Arm architecture in the HPC scenario, we evaluate our HPCG code at scale on a state-of-the-art HPC system based on the Cavium ThunderX2 SoC. We consider our work a contribution to the Arm ecosystem: along with this technical report, we plan to release our code to boost the tuning of the HPCG benchmark within the Arm community. Postprint (author's final draft)

    Strong scaling of general-purpose molecular dynamics simulations on GPUs

    We describe a highly optimized implementation of MPI domain decomposition in a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson and Glotzer, arXiv:1308.5587). Our approach is inspired by a traditional CPU-based code, LAMMPS (Plimpton, J. Comp. Phys. 117, 1995), but is implemented within a code that was designed for execution on GPUs from the start (Anderson et al., J. Comp. Phys. 227, 2008). The software supports short-ranged pair force and bond force fields and achieves optimal GPU performance using an autotuning algorithm. We are able to demonstrate equivalent or superior scaling on up to 3,375 GPUs in Lennard-Jones and dissipative particle dynamics (DPD) simulations of up to 108 million particles. GPUDirect RDMA capabilities in recent GPU generations provide better performance in full double precision calculations. For a representative polymer physics application, HOOMD-blue 1.0 provides an effective GPU vs. CPU node speed-up of 12.5x. Comment: 30 pages, 14 figures

    FluTAS: A GPU-accelerated finite difference code for multiphase flows

    We present the Fluid Transport Accelerated Solver, FluTAS, a scalable GPU code for multiphase flows with thermal effects. The code solves the incompressible Navier-Stokes equations for two-fluid systems, with a direct FFT-based Poisson solver for the pressure equation. The interface between the two fluids is represented with the Volume of Fluid (VoF) method, which is mass conserving and well suited for complex flows thanks to its ability to handle topological changes. The energy equation is explicitly solved and coupled with the momentum equation through the Boussinesq approximation. The code is conceived in a modular fashion so that different numerical methods can be used independently, the existing routines can be modified, and new ones can be included in a straightforward and sustainable manner. FluTAS is written in modern Fortran and parallelized using hybrid MPI/OpenMP in the CPU-only version and accelerated with OpenACC directives in the GPU implementation. We present different benchmarks to validate the code, and two large-scale simulations of fundamental interest in turbulent multiphase flows: isothermal emulsions in homogeneous isotropic turbulence (HIT) and two-layer Rayleigh-Bénard convection. FluTAS is distributed under an MIT license and arises from a collaborative effort of several scientists, aiming to become a flexible tool to study complex multiphase flows.

    Machine learning application for development of a data-driven predictive model able to investigate quality of life scores in a rare disease.

    BACKGROUND: Alkaptonuria (AKU) is an ultra-rare autosomal recessive disease caused by a mutation in the homogentisate 1,2-dioxygenase (HGD) gene. One of the main obstacles in studying AKU, and other ultra-rare diseases, is the lack of a standardized methodology to assess disease severity or response to treatment. Quality of Life (QoL) scores are a reliable way to monitor patients' clinical condition and health status. QoL scores make it possible to monitor the evolution of diseases and assess the suitability of treatments by taking into account patients' symptoms, general health status and care satisfaction. However, more comprehensive tools to study a complex and multi-systemic disease like AKU are needed. In this study, a Machine Learning (ML) approach was implemented with the aim of predicting QoL scores based on clinical data deposited in ApreciseKUre, an AKU-dedicated database. METHOD: Data derived from 129 AKU patients were first examined through a preliminary statistical analysis (Pearson correlation coefficient) to measure the linear correlation between 11 QoL scores. The variable importance of 110 ApreciseKUre biomarkers in QoL score prediction was then calculated using XGBoost, together with a k-nearest neighbours (k-NN) approach. Due to the limited amount of data available, this model was validated using surrogate data analysis. RESULTS: We identified a direct correlation of 6 (age, Serum Amyloid A, Chitotriosidase, Advanced Oxidation Protein Products, S-thiolated proteins and Body Mass Index) out of 110 biomarkers with the QoL health status, in particular with the KOOS (Knee injury and Osteoarthritis Outcome Score) symptoms (Relative Absolute Error (RAE) 0.25). The error distribution of the surrogate model (RAE 0.38) was unequivocally higher than that of the true model (RAE 0.25), confirming the consistency of our dataset. Our data showed that inflammation, oxidative stress, amyloidosis and lifestyle of patients correlate with the QoL scores for physical status, while no correlation between the biomarkers and patients' mental health was present (RAE 1.1). CONCLUSIONS: This proof-of-principle study for rare diseases confirms the importance of databases, allowing data management and analysis, which can be used to predict more effective treatments.
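The abstract above reports its results as Relative Absolute Error (RAE) values without stating the formula. The sketch below assumes the common definition, RAE = sum(|y - y_hat|) / sum(|y - mean(y)|), which may differ in detail from the one used by the authors. The function name and the sample scores are hypothetical and are not taken from the study. Under this definition, a value near 0 means accurate predictions, and a value of 1 or more means the model does no better than always predicting the mean, which is how an RAE of 1.1 would be read.

```python
from statistics import mean


def relative_absolute_error(y_true, y_pred):
    """Assumed definition: sum(|y - y_hat|) / sum(|y - mean(y)|)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    baseline = mean(y_true)  # the trivial "always predict the mean" model
    model_error = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    baseline_error = sum(abs(t - baseline) for t in y_true)
    if baseline_error == 0:
        raise ValueError("RAE is undefined when all true values are identical")
    return model_error / baseline_error


# Hypothetical QoL-like scores, for illustration only.
scores = [40.0, 55.0, 70.0, 85.0]
predictions = [45.0, 50.0, 75.0, 80.0]
mean_only = [mean(scores)] * len(scores)

print(relative_absolute_error(scores, predictions))  # below 1: beats the mean predictor
print(relative_absolute_error(scores, mean_only))    # the mean predictor itself
```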

    Quantum ESPRESSO: One Further Step toward the Exascale

    We review the status of the Quantum ESPRESSO software suite for electronic-structure calculations based on plane waves, pseudopotentials, and density-functional theory. We highlight the recent developments in the porting to GPUs of the main codes, using an approach based on OpenACC and CUDA Fortran offloading. We describe, in particular, the results achieved on linear-response codes, which are one of the distinctive features of the Quantum ESPRESSO suite. We also present extensive performance benchmarks on different GPU-accelerated architectures for the main codes of the suite.

    Hardware calibrated learning to compensate heterogeneity in analog RRAM-based Spiking Neural Networks

    Spiking Neural Networks (SNNs) can unleash the full power of circuits based on analog Resistive Random Access Memories (RRAMs) for low-power signal processing. Their inherent computational sparsity naturally results in energy efficiency benefits. The main challenge in implementing robust SNNs is the intrinsic variability (heterogeneity) of both analog CMOS circuits and RRAM technology. In this work, we assessed the performance and variability of RRAM-based neuromorphic circuits that were designed and fabricated using a 130 nm technology node. Based on these results, we propose a Neuromorphic Hardware Calibrated (NHC) SNN, where the learning circuits are calibrated on the measured data. We show that by taking into account the measured heterogeneity characteristics in the off-chip learning phase, the NHC SNN self-corrects its hardware non-idealities and learns to solve benchmark tasks with high accuracy. This work demonstrates how to cope with the heterogeneity of neurons and synapses to increase classification accuracy in temporal tasks.

    The polymorphism L412F in TLR3 inhibits autophagy and is a marker of severe COVID-19 in males

    The polymorphism L412F in TLR3 has been associated with several infectious diseases. However, the mechanism underlying this association is still unexplored. Here, we show that the L412F polymorphism in TLR3 is a marker of severity in COVID-19. This association increases in the sub-cohort of males. Impaired macroautophagy/autophagy and reduced TNF/TNFα production were demonstrated in HEK293 cells transfected with a TLR3 L412F-encoding plasmid and stimulated with the specific agonist poly(I:C). A statistically significant reduction in survival at 28 days was shown in L412F COVID-19 patients treated with the autophagy inhibitor hydroxychloroquine (p = 0.038). An increased frequency of autoimmune disorders as a co-morbidity was found in L412F COVID-19 males with specific class II HLA haplotypes prone to autoantigen presentation. Our analyses indicate that the L412F polymorphism puts males at risk of severe COVID-19 and provides a rationale for reinterpreting clinical trials considering autophagy pathways. Abbreviations: AP: autophagosome; AUC: area under the curve; BafA1: bafilomycin A1; COVID-19: coronavirus disease-2019; HCQ: hydroxychloroquine; RAP: rapamycin; ROC: receiver operating characteristic; SARS-CoV-2: severe acute respiratory syndrome coronavirus 2; TLR: toll like receptor; TNF/TNF-α: tumor necrosis factor.