4 research outputs found

    A UPC++ Actor Library and Its Evaluation on a Shallow Water Proxy Application

    Programmability is one of the key challenges of exascale computing. Using the actor model for distributed computations may be one solution. The actor model separates computation from communication while still enabling their overlap. Each actor possesses specified communication endpoints to publish and receive information, and computations are triggered by the data available on these channels. We present a library that implements this programming model using UPC++, a PGAS library, and evaluate three parallelization strategies: one based on rank-sequential execution, one based on multiple threads in a rank, and one based on OpenMP tasks. In an evaluation of our library using shallow water proxy applications, our solution compares favorably against an earlier implementation based on X10 and against a BSP-based approach.
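The actor model summarized in this abstract can be illustrated with a minimal single-process sketch. It is not the UPC++ library's API: the names `Channel`, `Producer`, `Doubler`, `Collector`, and `run_sequential` are hypothetical, and the scheduler only loosely imitates the idea of rank-sequential execution by polling every actor in turn and letting it act when its channels allow.

```python
from collections import deque


class Channel:
    """Bounded FIFO connecting one actor's output to another's input (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def can_write(self):
        return len(self.queue) < self.capacity

    def can_read(self):
        return len(self.queue) > 0

    def write(self, value):
        if not self.can_write():
            raise RuntimeError("channel is full")
        self.queue.append(value)

    def read(self):
        return self.queue.popleft()


class Producer:
    """Publishes the integers 0..count-1 on its output channel."""

    def __init__(self, out, count):
        self.out, self.count, self.next_value = out, count, 0

    def act(self):
        if self.next_value < self.count and self.out.can_write():
            self.out.write(self.next_value)
            self.next_value += 1
            return True
        return False


class Doubler:
    """Computes only when input data is available and the output has room."""

    def __init__(self, inp, out):
        self.inp, self.out = inp, out

    def act(self):
        if self.inp.can_read() and self.out.can_write():
            self.out.write(2 * self.inp.read())
            return True
        return False


class Collector:
    """Receives results and stores them."""

    def __init__(self, inp):
        self.inp, self.received = inp, []

    def act(self):
        if self.inp.can_read():
            self.received.append(self.inp.read())
            return True
        return False


def run_sequential(actors):
    """Poll all actors in turn until none of them can make progress."""
    progressed = True
    while progressed:
        progressed = False
        for actor in actors:
            if actor.act():
                progressed = True


def demo():
    a, b = Channel(capacity=2), Channel(capacity=2)
    collector = Collector(b)
    run_sequential([Producer(a, 5), Doubler(a, b), collector])
    return collector.received


if __name__ == "__main__":
    print(demo())
```

Each actor decides locally whether it can act, based only on the state of its own channels, which is what keeps computation separate from communication in this sketch.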

    Extreme-scale computing and studies of intermittency, mixing of passive scalars and stratified flows in turbulence

    Turbulent flows are known for the intermittent occurrence of intense strain rates and local rotation, and for their ability to provide efficient mixing. This thesis focuses on pursuing fundamental advances in physical understanding, using high-resolution Direct Numerical Simulations based on a Fourier pseudo-spectral approach. The computations are very demanding, and ever-larger simulations are required for studies of intermittency, where high Reynolds number and good small-scale resolution are important. A new batched asynchronous algorithm capable of extremely large problem sizes has been developed for dense-node heterogeneous machines like Summit. Optimizing data copies between CPU and GPU and communication over the network, while overlapping data copies and computations, is key to achieving good performance. Processing data residing in the larger CPU memory in batches on the GPU helps avoid limitations on problem size. Favorable performance is obtained up to a world-leading problem size of 18432^3 (over 6 trillion grid points) on 3072 Summit nodes. A more portable implementation using OpenMP is pursued to target a 32768^3 problem size on the exascale machine Frontier, expected in early 2022. Hero-sized simulations are often relatively short in time, which raises concerns regarding sampling and statistical independence. A Multiple Resolution Independent Simulations (MRIS) approach is developed to address this issue, via multiple short simulation segments evolving from lower-resolution datasets distributed over a longer physical time span. Using this approach, the effects of small-scale intermittency are studied through statistics of local averages of dissipation rate and enstrophy. The dissipation rate is further studied from a multifractal viewpoint. The MRIS approach is also used to study passive scalar intermittency and to test the refined similarity hypothesis, through statistics of scalar dissipation rate at high Reynolds number. Lastly, density-stratified flows are studied under both stable and unstable stratification, with anisotropy development studied through the Reynolds-stress budget. Ph.D.
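The problem sizes quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (18432^3 on 3072 nodes, and a 32768^3 target). The even split of grid points across nodes and the helper `batches_needed`, with its `bytes_per_point` and `gpu_bytes` parameters, are illustrative assumptions rather than details of the thesis's algorithm.

```python
SUMMIT_N = 18432      # grid points per direction, from the abstract
SUMMIT_NODES = 3072   # node count, from the abstract
FRONTIER_N = 32768    # target grid points per direction, from the abstract


def grid_points(n):
    """Total number of points in an n x n x n grid."""
    return n ** 3


def points_per_node(n, nodes):
    """Points per node, ASSUMING an even split across nodes."""
    return grid_points(n) // nodes


def batches_needed(points, bytes_per_point, gpu_bytes):
    """Hypothetical helper: how many batches are needed if the data lives in
    CPU memory and only gpu_bytes fit on the GPU at a time (ceiling division)."""
    total_bytes = points * bytes_per_point
    return -(-total_bytes // gpu_bytes)


def summary():
    return {
        "summit_points": grid_points(SUMMIT_N),
        "summit_points_per_node": points_per_node(SUMMIT_N, SUMMIT_NODES),
        "frontier_points": grid_points(FRONTIER_N),
        "frontier_to_summit_ratio": grid_points(FRONTIER_N) / grid_points(SUMMIT_N),
    }


if __name__ == "__main__":
    for name, value in summary().items():
        print(f"{name}: {value}")
```

The first entry evaluates to 6,262,062,317,568, consistent with the "over 6 trillion grid points" stated above, and the 32768^3 target is roughly 5.6 times larger.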

    Moments of parton distribution functions for the pion and rho meson from Nf = 2+1 lattice QCD

    We compute the second Mellin moments of parton distribution functions for the pion and rho meson from $N_f = 2 + 1$ lattice QCD using improved Wilson fermions. Our results are presented in terms of singlet and non-singlet flavor combinations and, for the first time, take disconnected contributions fully into account. Besides condensing the common knowledge about spin-1 structure functions and parton distribution functions, we provide a detailed description of the software stack implemented by our group to compute quark-line connected three-point functions using stochastic estimators. The main application is based on the factorization of the entire correlation function into two parts, which are evaluated with open spin (and to some extent flavor) indices. This allows us to estimate the two contributions of the factorization simultaneously for many different initial and final states and momenta, with little computational overhead. Our numerical analysis yields moments of the structure function $F_1$ (pion and rho) and of the structure function $b_1$, which provides additional contributions in the case of spin-1 particles. To this end we use 26 gauge ensembles, mainly generated by the CLS effort, with pion masses ranging from 214 MeV up to 420 MeV and with five different lattice spacings in the range of 0.05 fm to 0.1 fm. This choice of gauge configurations enables us to resolve the quark mass dependencies reliably, as well as to extrapolate to the continuum limit. However, due to the resonance character of the rho meson, our final results are possibly contaminated by additional two-pion states, which we also discuss.
We present our results in the $\overline{\text{MS}}$ scheme at $\mu = 2$ GeV. We find $v_2^{(u+d+s)} = 0.220(207)$, $v_2^{(u+d-2s)} = 0.344(28)$, $a_2^{(u+d+s)} = 0.285(295)$, $a_2^{(u+d-2s)} = 0.384(52)$, $d_2^{(u+d+s)} = 0.226(124)$, and $d_2^{(u+d-2s)} = 0.163(39)$ for the second moment $v_2$ of the pion structure function $F_1$, the second moment $a_2$ of the rho structure function $F_1$, and the second moment $d_2$ of the rho structure function $b_1$, respectively. Based on these values we conclude that the valence quarks carry about 35% of the total momentum in the pion and about 40% in the rho, and that the non-vanishing values for $d_2$ suggest that the quarks in the rho meson carry a substantial amount of orbital angular momentum.
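The results above use the usual concise uncertainty notation, in which the digits in parentheses give the uncertainty on the last quoted digits, so 0.344(28) reads as 0.344 ± 0.028. A small sketch for expanding that notation follows. The function name `parse_concise` and the dictionary keys are hypothetical labels, and the numbers are copied from the abstract.

```python
import re
from decimal import Decimal


def parse_concise(text):
    """Expand 'value(uncertainty)' notation, e.g. '0.344(28)' -> (0.344, 0.028).

    The parenthesised digits are scaled to the same number of decimal places
    as the central value.
    """
    match = re.fullmatch(r"\s*(-?\d+)(?:\.(\d+))?\s*\((\d+)\)\s*", text)
    if match is None:
        raise ValueError(f"not in value(uncertainty) form: {text!r}")
    int_part, frac_part, unc_digits = match.group(1), match.group(2) or "", match.group(3)
    value = Decimal(f"{int_part}.{frac_part}") if frac_part else Decimal(int_part)
    uncertainty = Decimal(unc_digits).scaleb(-len(frac_part))
    return float(value), float(uncertainty)


# Values as quoted in the abstract; the keys are illustrative labels only.
QUOTED = {
    "v2 (u+d+s)": "0.220 (207)",
    "v2 (u+d-2s)": "0.344 (28)",
    "a2 (u+d+s)": "0.285 (295)",
    "a2 (u+d-2s)": "0.384 (52)",
    "d2 (u+d+s)": "0.226 (124)",
    "d2 (u+d-2s)": "0.163 (39)",
}

if __name__ == "__main__":
    for label, text in QUOTED.items():
        value, err = parse_concise(text)
        print(f"{label}: {value} +/- {err}")
```

Read this way, the singlet combinations carry uncertainties comparable to their central values, while the non-singlet combinations are determined much more precisely.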