5 research outputs found
The MPI + CUDA Gaia AVU-GSR Parallel Solver Toward Next-generation Exascale Infrastructures
We ported to the GPU with CUDA the Astrometric Verification Unit-Global
Sphere Reconstruction (AVU-GSR) Parallel Solver developed for the ESA Gaia
mission, by optimizing a previous OpenACC porting of this application. The code
aims to find, with a [10,100] μas precision, the astrometric parameters of
~10^8 stars, the attitude and instrumental settings of the Gaia
satellite, and the global parameter γ of the parametrized Post-Newtonian
formalism, by solving a system of linear equations, A x = b, with the
LSQR iterative algorithm. The coefficient matrix A of the final Gaia dataset
is large, with ~10^11 x 10^8 elements, and sparse, reaching a
size of 10-100 TB, typical of Big Data analysis, which requires an
efficient parallelization to obtain scientific results in reasonable
timescales. The speedup of the CUDA code over the original AVU-GSR solver,
parallelized on the CPU with MPI+OpenMP, increases with the system size and the
number of resources, reaching a maximum of 14x, >9x over the OpenACC
application. This result is obtained by comparing the two codes on the CINECA
cluster Marconi100, with 4 V100 GPUs per node. After verifying that the
solutions of a set of systems with different sizes computed with the CUDA and
the OpenMP codes agreed and showed the required precision, we put the CUDA
code in production on Marconi100, a step essential for an optimal AVU-GSR
pipeline and the successive Gaia Data Releases. This analysis
represents a first step to understand the (pre-)Exascale behavior of a class of
applications that follow the same structure as this code. In the coming months,
we plan to run this code on the pre-Exascale platform Leonardo of CINECA, with
4 next-generation A200 GPUs per node, toward a porting on this infrastructure,
where we expect to obtain even higher performance.
Comment: 17 pages, 4 figures, 1 table, published on 1st August 2023 in
Publications of the Astronomical Society of the Pacific, 135, 07450
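The abstract above describes solving a large sparse system A x = b with the LSQR iterative algorithm. As a rough illustration only, the following pure-Python sketch applies the textbook (Paige and Saunders) LSQR recurrences to a tiny matrix stored as (row, column, value) triplets. It is not the AVU-GSR code: the function names, the triplet storage, the stopping rule, and the 3 x 2 example system are all assumptions made for this example, and no preconditioning is applied.

```python
import math


def matvec(entries, x, m):
    """Return y = A x for A stored as (row, col, value) triplets."""
    y = [0.0] * m
    for i, j, a in entries:
        y[i] += a * x[j]
    return y


def rmatvec(entries, y, n):
    """Return x = A^T y for the same triplet storage."""
    x = [0.0] * n
    for i, j, a in entries:
        x[j] += a * y[i]
    return x


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def lsqr(entries, b, m, n, max_iter=50, tol=1e-12):
    """Unpreconditioned LSQR sketch for min ||A x - b||_2 (illustrative only)."""
    x = [0.0] * n
    beta = norm(b)
    if beta == 0.0:
        return x
    u = [t / beta for t in b]
    v = rmatvec(entries, u, n)
    alpha = norm(v)
    if alpha == 0.0:
        return x
    v = [t / alpha for t in v]
    w = list(v)
    phibar, rhobar, bnorm = beta, alpha, beta
    for _ in range(max_iter):
        # Golub-Kahan bidiagonalization step (one A product, one A^T product).
        u = [p - alpha * q for p, q in zip(matvec(entries, v, m), u)]
        beta = norm(u)
        if beta > 0.0:
            u = [t / beta for t in u]
        v_new = [p - beta * q for p, q in zip(rmatvec(entries, u, n), v)]
        alpha = norm(v_new)
        if alpha > 0.0:
            v_new = [t / alpha for t in v_new]
        # Plane rotation that eliminates the subdiagonal element beta.
        rho = math.hypot(rhobar, beta)
        if rho == 0.0:
            break
        c, s = rhobar / rho, beta / rho
        theta = s * alpha
        rhobar = -c * alpha
        phi = c * phibar
        phibar = s * phibar
        # Update the solution and the search direction.
        x = [xi + (phi / rho) * wi for xi, wi in zip(x, w)]
        w = [vi - (theta / rho) * wi for vi, wi in zip(v_new, w)]
        v = v_new
        # Simplified stopping rule (an assumption of this sketch): phibar
        # estimates ||r|| and phibar * alpha * |c| estimates ||A^T r||.
        if phibar <= tol * bnorm or phibar * alpha * abs(c) <= tol * bnorm:
            break
    return x


# Hypothetical 3 x 2 consistent system whose exact solution is x = [1, 2].
entries = [(0, 0, 1.0), (1, 1, 2.0), (2, 0, 1.0), (2, 1, 1.0)]
b = [1.0, 4.0, 3.0]
x = lsqr(entries, b, m=3, n=2)
```

Each iteration needs only one product with A and one with its transpose, which is why those two sparse products are the natural targets for parallelization in a solver of this kind.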
Wildfires impact on surface nitrogen oxides and ozone in Central Italy
A summer campaign in Central Italy was carried out to study the impact of fire emissions on the mixing ratios of surface trace gases. Observations with a selective and sensitive instrument that uses the laser-induced fluorescence technique for direct measurements of nitrogen dioxide (NO2) show a significant increase of NO2 mixing ratios in the evening, when a fire plume reached the observation site. The increase of NO2 mixing ratios is well correlated (R=0.83) with that of particulate matter (PM), which is one of the primary products of forest and grassland fires. The tight correlation between NO2 and PM is used to improve the performance of a statistical regression model to simulate the observed O3, and to highlight the effect of fire emissions on the O3 mixing ratios. The statistical regression model of O3 improves in terms of performance (bias reduction of 77% and agreement enhancement of 10% for slope and correlation coefficient) when PM2.5 is included as an additional input and proxy of the fire emissions among the usual input parameters (meteorological data and NO2 mixing ratios). A case study comparing observed and modeled O3 on different days (with and without fire plume) suggests an impact of fire emissions on the O3 mixing ratios of about 10%.
The Gaia AVU-GSR parallel solver: preliminary porting with OpenACC parallelization language of a LSQR-based application in perspective of exascale systems
The Gaia Astrometric Verification Unit-Global Sphere Reconstruction (AVU-GSR) Parallel Solver aims to find the positions and the proper motions of ~10^8 stars in our galaxy, as well as the attitude and the instrumental settings of the Gaia satellite and the global parameter of the post-Newtonian formalism. To find these parameters, the code solves a system of linear equations, A x = b, where the coefficient matrix A is large, containing ~10^11 x 10^8 elements, and sparse. The system of equations is solved with a customized implementation of the iterative preconditioned (PC)-LSQR algorithm and is parallelized on the CPU with MPI+OpenMP: the computation related to different horizontal portions of the coefficient matrix is assigned to different MPI processes and is further parallelized over the OpenMP threads. To improve the code performance, we explored the feasibility of porting this application to a GPU environment by replacing the OpenMP directives with the corresponding OpenACC ones. In this preliminary port, ~95% of the data is copied from the host (CPU) to the device (GPU) before the entire cycle of iterations, making the code compute bound rather than data-transfer bound. The OpenACC code achieves a speedup of ~1.5 over the OpenMP code. The OpenACC application runs on multiple GPUs and was tested on the CINECA supercomputer Marconi100, with 4 V100 GPUs per node having 16 GB of memory each. A subsequent port, in which OpenACC is replaced with CUDA, was then performed, optimizing the preliminary OpenACC port. The CUDA code has just been put into production on Marconi100 and we plan to run it on the future pre-exascale platform Leonardo of CINECA, with 4 next-generation A100 GPUs per node.
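The assignment of horizontal portions of the coefficient matrix to different MPI processes can be pictured with a small serial sketch. The code below only simulates the ranks in a loop and uses no MPI library; the helper names, the (row, column, value) triplet storage, and the 4 x 2 example matrix are hypothetical choices for this illustration, not taken from the AVU-GSR code. It shows that the product A x needs no exchange between row blocks, while A^T y requires summing per-block partial results, which is the role a reduction would play in a real MPI run.

```python
def split_rows(m, n_ranks):
    """Contiguous row ranges [start, stop) per rank, sizes differing by at most 1."""
    base, extra = divmod(m, n_ranks)
    ranges, start = [], 0
    for r in range(n_ranks):
        stop = start + base + (1 if r < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def local_block(entries, start, stop):
    """Triplets of the horizontal slice owned by one (simulated) rank."""
    return [(i, j, a) for i, j, a in entries if start <= i < stop]


def distributed_matvec(blocks, x, m):
    """y = A x: each rank fills only its own rows, so no reduction is needed."""
    y = [0.0] * m
    for block in blocks:  # each loop pass stands in for one rank
        for i, j, a in block:
            y[i] += a * x[j]
    return y


def distributed_rmatvec(blocks, y, n):
    """x = A^T y: every rank produces a full-length partial vector."""
    partials = []
    for block in blocks:
        part = [0.0] * n
        for i, j, a in block:
            part[j] += a * y[i]
        partials.append(part)
    # Element-wise sum of the partial vectors; in an MPI program this is
    # where an all-reduce style collective would be used (assumption).
    return [sum(col) for col in zip(*partials)]


# Hypothetical 4 x 2 sparse matrix split over 2 simulated ranks.
entries = [(0, 0, 1.0), (1, 1, 2.0), (2, 0, 1.0), (2, 1, 1.0), (3, 0, 3.0)]
ranges = split_rows(4, 2)
blocks = [local_block(entries, start, stop) for start, stop in ranges]
y = distributed_matvec(blocks, [1.0, 2.0], m=4)
aty = distributed_rmatvec(blocks, [1.0, 1.0, 1.0, 1.0], n=2)
```

Within each row block, the inner loop over triplets is the part that a thread-level layer (OpenMP threads in the abstract above, GPU threads in the later ports) would parallelize further.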
Molecular vibrations of Oxygen-Evolving Complex and its synthetic mimic
Bio-inspired catalysis for artificial photosynthesis has been widely studied for decades, in particular, with the purpose of using bio-disposable and non-toxic metals as building blocks. The characterisation of such catalysts has been achieved by using different kinds of spectroscopic methods, from X-ray crystallography to NMR spectroscopy. An artificial Mn4CaO4 cubane cluster with dangling Mn4 was synthesised in 2015 [Zhang et al. Science 2015, 348, 690-693]; this cluster showed many structural similarities to that of the natural oxygen-evolving complex. An accurate structural and spectroscopic comparison between the natural and artificial systems is highly relevant to understand the catalytic mechanism. Among data from different techniques, the differential FTIR spectra (Sn+1 - Sn) of photosystem II are still lacking a complete interpretation. The availability of IR data of the artificial cluster offers a unique opportunity to assign absolute absorption spectra on a well-defined and easier to interpret analogous moiety. The present work aims to investigate the novel inorganic compound as a model system for an oxygen-evolving complex through measurement of its spectroscopic properties. The experimental results are compared with calculations by using a variety of theoretical methods (normal mode analysis, effective normal mode analysis) in the S1 state. We underline the similarities and the differences in the computational spectra based on atomistic models of Mn4CaO5 and Mn4CaO4 complexes.