
    A Customized 3D GPU Poisson Solver for Free Boundary Conditions

    A 3-dimensional GPU Poisson solver is developed for all possible combinations of free and periodic boundary conditions (BCs) along the three directions. It is benchmarked for various grid sizes and different BCs, and a significant performance gain is observed for problems including one or more free BCs. The GPU Poisson solver is also benchmarked against two different CPU implementations of the same method, and a significant acceleration is observed with the GPU version. Comment: 10 pages, 5 figures
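    The abstract does not spell out how the free BCs are imposed; a common approach for isolated (free) boundaries in FFT-based solvers is Hockney's zero-padding convolution with the open-space Green's function. The sketch below is a minimal CPU/NumPy illustration of that idea under our own assumptions about units and grid layout, not the paper's GPU kernel.

```python
import numpy as np

def free_bc_poisson(rho, h):
    """Potential phi(r) = sum_j rho(r_j) / |r - r_j| * h^3 on an isolated grid.

    Zero-padding the density to twice the grid size makes the periodic FFT
    convolution reproduce free (open) boundary conditions.
    """
    n = rho.shape
    padded = np.zeros(tuple(2 * s for s in n))
    padded[:n[0], :n[1], :n[2]] = rho

    # Open-space Green's function 1/|r| sampled on the doubled grid, using the
    # minimum-image distance so the kernel wraps correctly under the FFT.
    axes = [np.minimum(np.arange(2 * s), 2 * s - np.arange(2 * s)) * h for s in n]
    X, Y, Z = np.meshgrid(*axes, indexing="ij")
    r = np.sqrt(X**2 + Y**2 + Z**2)
    green = np.zeros_like(r)
    green[r > 0] = 1.0 / r[r > 0]
    # The singular self-cell term green[0, 0, 0] needs a regularization choice;
    # it is left at zero here for simplicity.

    phi = np.fft.irfftn(np.fft.rfftn(padded) * np.fft.rfftn(green), s=padded.shape)
    return phi[:n[0], :n[1], :n[2]] * h**3
```

    For a box that is periodic along some directions and free along others, only the free directions would be padded and the Green's function adjusted accordingly, which is presumably where the paper's customization lies.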

    A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

    We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphics Processing Units (GPUs). The aim of this development is to enable efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, the platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the same code on an Intel Sandy Bridge 8-core CPU by a factor of 3.4.
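    PIC_ENGINE itself is written for OpenACC offload and its kernels are not reproduced here. As a point of reference for the "basic steps of the PIC algorithm" the abstract mentions, the following is a minimal single-core NumPy sketch of one electrostatic 1D PIC cycle (charge deposition, field solve, field gather, particle push); the normalizations and function names are our own assumptions.

```python
import numpy as np

def pic_step(x, v, q_m, grid_n, length, dt):
    """One electrostatic 1D PIC cycle: deposit -> field solve -> gather -> push."""
    dx = length / grid_n

    # 1) Charge deposition with cloud-in-cell (linear) weighting.
    xi = x / dx
    left = np.floor(xi).astype(int) % grid_n
    w = xi - np.floor(xi)
    rho = np.zeros(grid_n)
    np.add.at(rho, left, 1.0 - w)
    np.add.at(rho, (left + 1) % grid_n, w)
    rho = rho / dx
    rho -= rho.mean()                          # neutralizing background charge

    # 2) Periodic spectral field solve: nabla^2 phi = -rho, E = -d(phi)/dx.
    k = 2 * np.pi * np.fft.rfftfreq(grid_n, d=dx)
    rho_k = np.fft.rfft(rho)
    E_k = np.zeros_like(rho_k)
    E_k[1:] = -1j * rho_k[1:] / k[1:]
    E = np.fft.irfft(E_k, n=grid_n)

    # 3) Gather field to particles and push (leapfrog).
    E_p = (1.0 - w) * E[left] + w * E[(left + 1) % grid_n]
    v = v + q_m * E_p * dt
    x = (x + v * dt) % length
    return x, v
```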

    A Multi-Code Analysis Toolkit for Astrophysical Simulation Data

    The analysis of complex multiphysics astrophysical simulations presents a unique and rapidly growing set of challenges: reproducibility, parallelization, and vast increases in data size and complexity chief among them. To meet these challenges, and to open up new avenues for collaboration between users of multiple simulation platforms, we present yt (available at http://yt.enzotools.org/), an open source, community-developed astrophysical analysis and visualization toolkit. Analysis and visualization with yt are oriented around physically relevant quantities rather than quantities native to astrophysical simulation codes. While originally designed for handling Enzo's structured adaptive mesh refinement (AMR) data, yt has been extended to work with several different simulation methods and simulation codes including Orion, RAMSES, and FLASH. We report on its methods for reading, handling, and visualizing data, including projections, multivariate volume rendering, multi-dimensional histograms, halo finding, light cone generation and topologically-connected isocontour identification. Furthermore, we discuss the underlying algorithms yt uses for processing and visualizing data, and its mechanisms for parallelization of analysis tasks. Comment: 18 pages, 6 figures, emulateapj format. Resubmitted to Astrophysical Journal Supplement Series with revisions from referee. yt can be found at http://yt.enzotools.org
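    As an illustration of the "physically relevant quantities" orientation described above, here is a short yt session using the API of current yt releases (yt.load, ProjectionPlot); the interface at the time of the paper, hosted at yt.enzotools.org, differed, and the dataset path below is a placeholder.

```python
import yt

# Load a dataset; yt auto-detects the simulation code (Enzo, FLASH, RAMSES, ...).
ds = yt.load("DD0010/moving7_0010")            # hypothetical Enzo output path

# Physically oriented query: fields are named by physical quantity, not by code.
sphere = ds.sphere("max", (10.0, "kpc"))       # sphere around the density peak
print(sphere.quantities.total_mass())

# Adaptive projection of gas density along the z axis.
p = yt.ProjectionPlot(ds, "z", ("gas", "density"))
p.save("density_projection.png")
```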

    Particle-in-Cell algorithms for emerging computer architectures

    We have designed Particle-in-Cell algorithms for emerging architectures. These algorithms share a common approach, using fine-grained tiles, but different implementations depending on the architecture. On the GPU, there were two different implementations, one with atomic operations and one with no data collisions, using CUDA C and Fortran. Speedups of up to about 50 compared to a single core of an Intel i7 processor have been achieved. There was also an implementation for traditional multi-core processors using OpenMP which achieved high parallel efficiency. We believe that this approach should work for other emerging designs such as the Intel Phi coprocessor from the Intel MIC architecture.
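    The paper's kernels are written in CUDA C and CUDA Fortran; the NumPy sketch below only illustrates the fine-grained tile idea for charge deposition in the collision-free variant, where each tile owns a private buffer and the per-tile results are merged afterwards (on a GPU, one thread block per tile; the atomic variant scatters directly into the global array instead). The nearest-grid-point weighting, sizes, and names are our own simplifications.

```python
import numpy as np

def deposit_tiled(x, grid_n, tile_n):
    """Nearest-grid-point charge deposition organized by fine-grained tiles.

    Each tile accumulates its own particles into a private buffer (the
    collision-free scheme); a final reduction writes the buffers back to the
    global charge array. Assumes grid_n is divisible by tile_n and x in [0, 1).
    """
    dx = 1.0 / grid_n
    cells = (x / dx).astype(int) % grid_n
    tile_size = grid_n // tile_n
    tile_id = cells // tile_size

    rho = np.zeros(grid_n)
    for t in range(tile_n):                       # one GPU thread block per tile
        mine = cells[tile_id == t]                # particles owned by tile t
        local = np.bincount(mine - t * tile_size, minlength=tile_size)
        rho[t * tile_size:(t + 1) * tile_size] += local
    return rho
```

    The serial analogue of the atomic variant would be a single scatter-add, np.add.at(rho, cells, 1.0), trading memory contention for the cost of sorting particles into tiles.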

    A preliminary approach to intelligent x-ray imaging for baggage inspection at airports

    Identifying explosives in baggage at airports relies on being able to characterize the materials that make up an X-ray image. If a suspicion is generated during the imaging process (step 1), the image data could be enhanced by adapting the scanning parameters (step 2). This paper addresses the first part of this problem and uses textural signatures to recognize and characterize materials, thereby enabling system control. Directional Gabor-type filtering was applied to a series of different X-ray images. Images were processed in such a way as to simulate a line-scanning geometry. Based on our experiments with images of industrial standards and our own samples, it was found that different materials could be characterized in terms of the frequency range and orientation of the filters. It was also found that the signal strength generated by the filters could be used as an indicator of visibility and that optimum imaging conditions could be predicted.
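    A textural signature of the kind described can be sketched with an off-the-shelf Gabor filter bank; the scikit-image example below is only illustrative, and the orientations, frequencies, and file name are placeholder values, not the paper's settings.

```python
import numpy as np
from skimage import io
from skimage.filters import gabor

image = io.imread("baggage_scan.png", as_gray=True)     # hypothetical input image

signatures = {}
for theta in np.deg2rad([0, 45, 90, 135]):              # filter orientations
    for frequency in (0.05, 0.1, 0.2):                  # cycles per pixel
        real, imag = gabor(image, frequency=frequency, theta=theta)
        # Filter-response energy ("signal strength") as a textural signature.
        signatures[(round(np.rad2deg(theta)), frequency)] = np.mean(real**2 + imag**2)

# Materials are then characterized by which (orientation, frequency) bands respond.
print(signatures)
```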

    Andy's Algorithms: new automated digital image analysis pipelines for FIJI.

    Quantification of cellular antigens and their interactions via antibody-based detection methods is widely used in scientific research. Accurate high-throughput quantitation of these assays using general image analysis software can be time consuming and challenging, particularly when attempted by users with limited image processing and analysis knowledge. To overcome this, we have designed Andy's Algorithms, a series of automated image analysis pipelines for FIJI that permits rapid, accurate and reproducible batch-processing of 3,3'-diaminobenzidine (DAB) immunohistochemistry, proximity ligation assays (PLAs) and other common assays. Andy's Algorithms incorporates a step-by-step tutorial and optimization pipeline to make batch image analysis simple for the untrained user and adaptable across laboratories. Andy's Algorithms provide a simpler, faster, standardized workflow compared to existing programs, while offering equivalent performance and additional features, in a free-to-use, open-source application of FIJI. Andy's Algorithms are available at GitHub, publicly accessible at https://github.com/andlaw1841/Andy-s-Algorithm
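    Andy's Algorithms themselves are FIJI (ImageJ) macros available at the GitHub link above; the short Python sketch below only mirrors the general DAB-quantification idea (colour-deconvolve the DAB stain, threshold it, measure the positive area) using scikit-image, with a hypothetical image folder.

```python
from pathlib import Path
from skimage import io
from skimage.color import rgb2hed
from skimage.filters import threshold_otsu

def dab_positive_fraction(path):
    """Fraction of DAB-positive pixels in one RGB immunohistochemistry image."""
    rgb = io.imread(path)
    dab = rgb2hed(rgb)[..., 2]                 # channel 2 of the H/E/DAB separation
    mask = dab > threshold_otsu(dab)           # automatic threshold on DAB signal
    return mask.mean()

# Batch-process every image in a (hypothetical) folder, as a FIJI macro would.
for img in sorted(Path("ihc_images").glob("*.tif")):
    print(img.name, f"{dab_positive_fraction(img):.3f}")
```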

    An efficient mixed-precision, hybrid CPU-GPU implementation of a fully implicit particle-in-cell algorithm

    Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been proposed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver, capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent. This paper describes a very efficient, mixed-precision hybrid CPU-GPU implementation of the implicit PIC algorithm exploiting this feature. The JFNK solver is kept on the CPU in double precision (DP), while the implicit, charge-conserving, and adaptive particle mover is implemented on a GPU (graphics processing unit) using CUDA in single precision (SP). Performance-oriented optimizations are introduced with the aid of the roofline model. The implicit particle mover algorithm is shown to achieve up to 400 GOp/s on an NVIDIA GeForce GTX 580. This corresponds to 25% of the theoretical peak GPU performance, and is about 300 times faster than an equivalent serial CPU (Intel Xeon X5460) execution. For the test case chosen, the mixed-precision hybrid CPU-GPU solver is shown to outperform the DP CPU-only serial version by a factor of ~100, without apparent loss of robustness or accuracy in a challenging long-timescale ion acoustic wave simulation. Comment: 25 pages, 6 figures, submitted to J. Comput. Phys.
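    The following is a schematic SciPy illustration of the precision split the abstract describes: the outer Jacobian-free Newton-Krylov iteration runs in double precision while the expensive residual evaluation (standing in for the GPU particle mover) runs in single precision. The residual form, the placeholder physics, and the tolerance are our own inventions; the actual mover is a CUDA kernel and the field equation is the paper's implicit, charge-conserving formulation.

```python
import numpy as np
from scipy.optimize import newton_krylov

def push_particles_sp(E_field):
    """Stand-in for the GPU particle mover: takes the field and returns the
    plasma response (a fake current here), computed in single precision."""
    E32 = E_field.astype(np.float32)
    current = np.sin(E32)                  # placeholder physics, float32
    return current.astype(np.float64)      # promoted back for the DP solver

target = np.linspace(0.0, 1.0, 64)         # placeholder source term

def residual(E_field):
    """Implicit field equation F(E) = 0 coupling the field to the particle
    response; evaluated each Krylov iteration, Jacobian never formed."""
    return E_field - 0.5 * push_particles_sp(E_field) - target

E0 = np.zeros(64)                          # initial guess for the field
# f_tol is kept near single-precision accuracy, since the SP inner evaluation
# limits how far the DP outer iteration can drive the residual down.
E_solution = newton_krylov(residual, E0, f_tol=1e-6)
print(np.abs(residual(E_solution)).max())
```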