108 research outputs found

    Probabilistic Design Methods and Their Application in the Structural-Mechanical Design of Turbine Blades

    Get PDF
    In real components, thermal and mechanical boundary conditions as well as material properties and geometric quantities are subject to a certain scatter. Deterministic structural-mechanical analyses, which are used almost exclusively today, do not account for this scatter and merely compute a "sample response" of the structure; probabilistic design methods, in contrast, allow the distribution functions of the stochastic input quantities to be included in the structural analysis. The result is the empirical distributions of the output quantities as well as the sensitivities of the stochastic model parameters. This contribution explains several probabilistic design methods and assesses their applicability to real, complex component analyses. Using the probabilistic analysis of the cyclic fatigue life of a gas turbine blade as an example, the challenges of applying the direct Monte Carlo simulation method to real components are presented, and the results and the procedure are critically discussed.
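The direct Monte Carlo approach discussed in the abstract can be illustrated with a minimal sketch: sample the stochastic inputs from their distributions, evaluate the structural model for each sample, and collect the empirical distribution of the output quantity. The response function `blade_life_model` and all distribution parameters below are hypothetical placeholders, not values from the paper.

```python
import random
import statistics

def blade_life_model(load, strength):
    # Hypothetical response surface: cyclic life grows with the strength margin.
    return max(strength - load, 0.0) * 1000.0

def direct_monte_carlo(n_samples=10000, seed=42):
    rng = random.Random(seed)
    lives = []
    for _ in range(n_samples):
        load = rng.gauss(100.0, 10.0)     # stochastic input: thermo-mechanical load
        strength = rng.gauss(150.0, 5.0)  # stochastic input: material strength
        lives.append(blade_life_model(load, strength))
    return lives  # empirical distribution of the output quantity

lives = direct_monte_carlo()
mean_life = statistics.mean(lives)
```

Sensitivities of the stochastic model parameters could then be estimated from the same samples, e.g., by correlating each input with the output.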

    A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

    Get PDF
    We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphics Processing Units (GPUs). The aim of this development is to enable efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, the platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the same code on an Intel Sandy Bridge 8-core CPU by a factor of 3.4.
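As a rough illustration of the "basic steps of the PIC algorithm" mentioned above (deposit, field solve, gather, push), here is a minimal one-dimensional electrostatic sketch in NumPy. It is not the PIC_ENGINE code; the nearest-grid-point deposition, spectral Poisson solve, and explicit Euler push are illustrative simplifications.

```python
import numpy as np

def pic_step(x, v, grid_n, dx, dt, q=-1.0, m=1.0):
    """One cycle of the basic PIC loop: deposit -> solve -> gather -> push."""
    # 1) Deposit particle charge onto the grid (nearest-grid-point, periodic).
    cells = np.floor(x / dx).astype(int) % grid_n
    rho = np.bincount(cells, minlength=grid_n) * q / dx
    # 2) Solve the periodic Poisson equation spectrally for the potential.
    rho_k = np.fft.rfft(rho)
    k = 2.0 * np.pi * np.fft.rfftfreq(grid_n, d=dx)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    phi = np.fft.irfft(phi_k, n=grid_n)
    E = -np.gradient(phi, dx)
    # 3) Gather the field back to the particle positions.
    E_p = E[cells]
    # 4) Push particles (explicit Euler for brevity; real codes use leapfrog).
    v = v + (q / m) * E_p * dt
    x = (x + v * dt) % (grid_n * dx)
    return x, v
```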

    ORB5: a global electromagnetic gyrokinetic code using the PIC approach in toroidal geometry

    Get PDF
    This paper presents the current state of the global gyrokinetic code ORB5 as an update of the previous reference [Jolliet et al., Comp. Phys. Commun. 177 409 (2007)]. The ORB5 code solves the electromagnetic Vlasov-Maxwell system of equations using a PIC scheme and also includes collisions and strong flows. The code assumes multiple gyrokinetic ion species at all wavelengths for the polarization density and drift-kinetic electrons. Variants of the physical model can be selected for electrons, such as assuming an adiabatic response or a "hybrid" model in which passing electrons are assumed adiabatic and trapped electrons are drift-kinetic. A Fourier filter as well as various control variates and noise reduction techniques enable simulations with good signal-to-noise ratios at a limited numerical cost. These are complemented by momentum- and zonal-flow-conserving heat sources allowing for temperature-gradient and flux-driven simulations. The code, which runs on both CPUs and GPUs, is well benchmarked against other similar codes and analytical predictions, and shows good scalability up to thousands of nodes.

    Efficient Strict-Binning Particle-in-Cell Algorithm for Multi-Core SIMD Processors

    Get PDF
    Particle-in-Cell (PIC) codes are widely used for plasma simulations. On recent multi-core hardware, performance of these codes is often limited by memory bandwidth. We describe a multi-core PIC algorithm that achieves a close-to-minimal number of memory transfers with the main memory, while at the same time exploiting SIMD instructions for numerical computations and exhibiting a high degree of OpenMP-level parallelism. Our algorithm keeps particles sorted by cell at every time step, and represents particles from the same cell using a linked list of fixed-capacity arrays, called chunks. Chunks support either sequential or atomic insertions, the latter being used to handle fast-moving particles. To validate our code, called Pic-Vert, we consider a 3d electrostatic Landau-damping simulation as well as a 2d3v transverse instability of magnetized electron holes. Performance results on 24-core Intel Skylake hardware confirm the effectiveness of our algorithm, in particular its high throughput and its ability to cope with fast-moving particles.
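The chunk data structure described above — the particles of one cell kept in a linked list of fixed-capacity arrays — can be sketched as follows. This is a plain-Python illustration of the idea, not Pic-Vert's implementation; the tiny capacity and the sequential-only insertion path are simplifications (the paper additionally supports atomic insertions for fast-moving particles).

```python
from dataclasses import dataclass, field

CHUNK_CAPACITY = 4  # illustrative; real chunks are sized for cache lines / SIMD width

@dataclass
class Chunk:
    particles: list = field(default_factory=list)  # fixed-capacity array of particles
    next: "Chunk" = None                           # link to the next chunk of this cell

class CellBag:
    """All particles of one cell, stored as a linked list of chunks."""
    def __init__(self):
        self.head = Chunk()

    def insert(self, p):
        # Sequential insertion: start a fresh head chunk when the current one is full.
        if len(self.head.particles) == CHUNK_CAPACITY:
            self.head = Chunk(next=self.head)
        self.head.particles.append(p)

    def __iter__(self):
        c = self.head
        while c is not None:
            yield from c.particles
            c = c.next
```

Iterating a cell then touches contiguous fixed-size arrays, which is what makes SIMD-friendly, bandwidth-efficient loops possible.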

    Ecological networks: Pursuing the shortest path, however narrow and crooked

    Get PDF
    Representing data as networks cuts across all sub-disciplines in ecology and evolutionary biology. Besides providing a compact representation of the interconnections between agents, network analysis allows the identification of especially important nodes, according to various metrics that often rely on the calculation of the shortest paths connecting any two nodes. While the interpretation of a shortest path is straightforward in binary, unweighted networks, whenever weights are reported the calculation can yield unexpected results. We analyzed 129 studies of ecological networks published in the last decade that use shortest paths, and discovered a methodological inaccuracy related to the edge weights used to calculate shortest paths (and related centrality measures), particularly in interaction networks. Specifically, 49% of the studies do not report sufficient information on the calculation to allow their replication, and 61% of the studies on weighted networks may contain errors in how shortest paths are calculated. Using toy models and empirical ecological data, we show how to transform the data prior to calculation and illustrate the pitfalls that need to be avoided. We conclude by proposing a five-point checklist to foster best practices in the calculation and reporting of centrality measures in ecology and evolution studies. The last two decades have witnessed an exponential increase in the use of graph analysis in ecological and conservation studies (see refs. 1,2 for recent introductions to network theory in ecology and evolution). Networks (graphs) represent agents as nodes linked by edges representing pairwise relationships. For instance, a food web can be represented as a network of species (nodes) and their feeding relationships (edges) 3. Similarly, the spatial dynamics of a metapopulation can be analyzed by connecting the patches of suitable habitat (nodes) with edges measuring dispersal between patches 4.
Data might either simply report the presence/absence of an edge (binary, unweighted networks) or provide a strength for each edge (weighted networks). In turn, these weights can represent a variety of ecologically relevant quantities, depending on the system being described. For instance, edge weights can quantify interaction frequency (e.g., visitation networks 5), interaction strength (e.g., the per-capita effect of one species on the growth rate of another 3), carbon flow between trophic levels 6, genetic similarity 7, niche overlap (e.g., number of shared resources between two species 8), affinity 9, dispersal probabilities (e.g., the rate at which individuals of a population move between patches 10), or the cost of dispersal between patches (e.g., resistance 11). Despite such a large variety of ecological network representations, a common task is the identification of nodes of high importance, such as keystone species in a food web, patches acting as stepping stones in a dispersal network, or genes with pleiotropic effects. The identification of important nodes is typically accomplished through centrality measures 5,12. Many centrality measures have been proposed, each probing complementary aspects of node-to-node relationships 13. For instance, closeness centrality 14,15 highlights nodes that are "near" to all other nodes
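The core pitfall discussed above is that shortest-path algorithms interpret edge weights as costs (lower = closer), while many ecological weights encode strength or frequency (higher = closer). A minimal sketch of the required transformation, using the reciprocal transform `cost = 1/strength` (one of several possible choices, not necessarily the paper's recommendation) together with a textbook Dijkstra:

```python
import heapq

def dijkstra(adj, source):
    """Standard Dijkstra over a dict-of-dicts adjacency whose weights are *costs*."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def strengths_to_costs(adj):
    """Turn interaction strengths (higher = closer) into costs (lower = closer)
    before calling any shortest-path routine; skipping this step makes strong
    interactions look like long detours."""
    return {u: {v: 1.0 / w for v, w in nbrs.items()} for u, nbrs in adj.items()}
```

On a toy graph where a-b and b-c are strong (strength 10) and a-c is weak (strength 1), the untransformed weights would pick the weak direct edge as the "shortest" path, while the transformed costs correctly route through b.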

    A bucket sort algorithm for the particle-in-cell method on manycore architectures

    No full text
    The Particle-In-Cell (PIC) method is effectively used in many scientific simulation codes. In order to optimize the performance of the PIC approach, data locality is required, which relies on efficient sorting algorithms. We present a bucket sort algorithm with a small memory footprint for the PIC method targeting Graphics Processing Units (GPUs). Our sorting algorithm's performance increases with the amount of storage provided and with the orderliness of the particles. For our application, where particles are presorted, it performs better and requires less memory than other sorting algorithms in the literature. The overall PIC algorithm performs at its best when the sorting is applied.
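A counting-based bucket sort of the kind described — grouping particle data by cell index in O(N) time — can be sketched as follows. This serial NumPy version only illustrates the principle; the paper's GPU algorithm additionally exploits presortedness and a bounded auxiliary storage budget, which this sketch does not model.

```python
import numpy as np

def bucket_sort_particles(cell_idx, data, n_cells):
    """Sort particle data by cell index in two passes:
    count per-cell populations, then scatter into per-cell buckets.
    Returns the sorted data and the start offset of each cell's bucket."""
    counts = np.bincount(cell_idx, minlength=n_cells)
    offsets = np.concatenate(([0], np.cumsum(counts)))
    out = np.empty_like(data)
    cursor = offsets[:-1].copy()          # next free slot inside each bucket
    for i, c in enumerate(cell_idx):
        out[cursor[c]] = data[i]
        cursor[c] += 1
    return out, offsets
```

After the sort, all particles of cell `c` occupy the contiguous slice `out[offsets[c]:offsets[c+1]]`, which is precisely the data locality the PIC update loops need.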

    Investigation of the reaction gammagamma -> pi+pi- at TASSO

    No full text
    SIGLE copy held by FIZ Karlsruhe; available from UB/TIB Hannover / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek

    Towards the optimization of a gyrokinetic Particle-In-Cell (PIC) code on large-scale hybrid architectures

    No full text
    With the aim of enabling state-of-the-art gyrokinetic PIC codes to benefit from the performance of recent multithreaded devices, we developed an application from a platform called the "PIC-engine" [1, 2, 3] embedding simplified basic features of the PIC method. The application solves the gyrokinetic equations in a sheared plasma slab using B-spline finite elements up to fourth order to represent the self-consistent electrostatic field. Preliminary studies of the so-called Particle-In-Fourier (PIF) approach, which uses Fourier modes as basis functions in the periodic dimensions of the system instead of the real-space grid, show that this method can be faster than PIC for simulations with a small number of Fourier modes. Similarly to the PIC-engine, multiple levels of parallelism have been implemented using MPI+OpenMP [2] and MPI+OpenACC [1], the latter exploiting the computational power of GPUs without requiring a complete code rewrite. It is shown that sorting particles [3] can lead to performance improvements by increasing data locality and vectorizing grid memory access. Weak scalability tests have been successfully run on the GPU-equipped Cray XC30 Piz Daint (at CSCS) on up to 4,096 nodes. The reduced time-to-solution will enable more realistic and thus more computationally intensive simulations of turbulent transport in magnetic fusion devices.
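The Particle-In-Fourier idea mentioned above — accumulating each particle's contribution directly onto a small set of Fourier modes instead of depositing onto a real-space grid — can be sketched as follows. The function name, normalization, and interface are illustrative assumptions, not the actual code.

```python
import numpy as np

def pif_deposit(x, q, modes, L):
    """Particle-In-Fourier charge deposition: each particle at position x_p
    contributes q * exp(-i k x_p) to every retained Fourier mode k, so the
    cost scales as O(N_particles * N_modes) and no real-space grid is needed."""
    k = 2.0 * np.pi * np.asarray(modes) / L
    return q * np.exp(-1j * np.outer(k, np.asarray(x))).sum(axis=1) / L
```

With only a handful of modes retained, this per-particle loop can beat grid deposition plus interpolation, which matches the abstract's observation that PIF wins for small numbers of Fourier modes.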