91 research outputs found

    Direct N-body code on low-power embedded ARM GPUs

    This work arises in the context of the ExaNeSt project, which aims at the design and development of an exascale-ready supercomputer with a low energy-consumption profile that is nonetheless able to support the most demanding scientific and technical applications. The ExaNeSt compute unit consists of densely packed low-power 64-bit ARM processors embedded within Xilinx FPGA SoCs. SoC boards are heterogeneous architectures whose computing power is supplied by both CPUs and GPUs, and they are emerging as a possible low-power and low-cost alternative to clusters based on traditional CPUs. A state-of-the-art direct N-body code suitable for astrophysical simulations has been re-engineered to exploit SoC heterogeneous platforms based on ARM CPUs and embedded GPUs. Performance tests show that embedded GPUs can be effectively used to accelerate real-life scientific calculations, and that they are also promising because of their energy efficiency, a crucial design constraint for future exascale platforms.
    Comment: 16 pages, 7 figures, 1 table, accepted for publication in the Computing Conference 2019 proceedings
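
    The paper's code is not reproduced here, but the kernel at the heart of any direct N-body method is the O(N^2) pairwise force sum, which is what gets offloaded to the GPU. A minimal numpy sketch, with illustrative names and a Plummer softening (both assumptions, not details taken from the paper):

        import numpy as np

        def direct_accelerations(pos, mass, soft=1e-4):
            """Direct-summation gravitational accelerations, O(N^2).
            pos: (N, 3) positions; mass: (N,) masses; G = 1 units.
            soft: Plummer softening that keeps close pairs finite."""
            # Pairwise separations r_j - r_i, shape (N, N, 3)
            dr = pos[None, :, :] - pos[:, None, :]
            # Softened squared distances, shape (N, N)
            r2 = (dr * dr).sum(axis=-1) + soft**2
            inv_r3 = r2 ** -1.5
            np.fill_diagonal(inv_r3, 0.0)   # remove self-interaction
            # a_i = sum_j m_j (r_j - r_i) / (r_ij^2 + soft^2)^(3/2)
            return (dr * (mass[None, :] * inv_r3)[:, :, None]).sum(axis=1)

    Every pair term is independent of every other, which is why this sum maps so naturally onto the many threads of even a small embedded GPU.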

    Accelerating NBODY6 with Graphics Processing Units

    We describe the use of Graphics Processing Units (GPUs) for speeding up the code NBODY6, which is widely used for direct N-body simulations. Over the years, the N^2 nature of the direct force calculation has proved a barrier to extending the particle number. Following an early introduction of force polynomials and individual time-steps, the calculation cost was first reduced by the introduction of a neighbour scheme. After a decade of GRAPE computers, which speeded up the force calculation further, we are now in the era of GPUs, where relatively small hardware systems are highly cost-effective. A significant gain in efficiency is achieved by employing the GPU to obtain the so-called regular force, which typically involves some 99 percent of the particles, while the remaining local forces are evaluated on the host. However, the latter operation is performed up to 20 times more frequently and may still account for a significant cost. This effort is reduced by parallel SSE/AVX procedures in which each interaction term is calculated using mainly single precision. We also discuss further strategies connected with the coordinate and velocity prediction required by the integration scheme. This leaves hard binaries and multiple close encounters, which are treated by several regularization methods. The present NBODY6-GPU code is well balanced for simulations in the particle range 10^4 - 2x10^5 on a dual-GPU system attached to a standard PC.
    Comment: 8 pages, 3 figures, 2 tables, MNRAS accepted
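
    The regular/irregular split described above is the Ahmad-Cohen neighbour scheme. A schematic numpy sketch of the idea; the fixed neighbour radius and the names are simplifications of my own (NBODY6 maintains adaptive per-particle neighbour lists):

        import numpy as np

        def split_force(pos, mass, i, r_nb, soft=1e-4):
            """Force on particle i, split into an 'irregular' part from
            neighbours inside r_nb (recomputed often, on the host) and a
            'regular' part from distant particles (recomputed rarely, on
            the GPU in NBODY6-GPU). G = 1 units."""
            dr = pos - pos[i]                     # (N, 3) separations
            r2 = (dr * dr).sum(axis=-1) + soft**2
            r2[i] = np.inf                        # exclude the self-term
            near = r2 < r_nb**2                   # neighbour mask
            w = mass * r2 ** -1.5                 # m_j / r_ij^3 weights
            f_irr = (dr[near] * w[near, None]).sum(axis=0)
            f_reg = (dr[~near] * w[~near, None]).sum(axis=0)
            return f_irr, f_reg                   # total = f_irr + f_reg

    The regular sum touches ~99 percent of the particles but is recomputed on a much longer time-step, which is exactly what makes the GPU offload pay off.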

    Analysing Astronomy Algorithms for GPUs and Beyond

    Astronomy depends on ever-increasing computing power. Processor clock rates have plateaued, and increased performance now appears in the form of additional processor cores on a single chip. This poses significant challenges to the astronomy software community. Graphics Processing Units (GPUs), now capable of general-purpose computation, exemplify both the difficult learning curve and the significant speedups exhibited by massively parallel hardware architectures. We present a generalised approach to tackling this paradigm shift, based on the analysis of algorithms. We describe a small collection of foundation algorithms relevant to astronomy and explain how they may be used to ease the transition to massively parallel computing architectures. We demonstrate the effectiveness of our approach by applying it to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for gravitational lensing, pulsar dedispersion, and volume rendering. Algorithms with well-defined memory access patterns and high arithmetic intensity stand to receive the greatest performance boost from massively parallel architectures, while those that involve a significant amount of decision-making may struggle to take advantage of the available processing power.
    Comment: 10 pages, 3 figures, accepted for publication in MNRAS
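
    As a rough worked example of the arithmetic-intensity criterion (the numbers are generic textbook figures, not measurements from the paper): one gravitational pair interaction costs of order 20 floating-point operations, while the interacting particle's position and mass occupy 16 bytes in single precision. Streamed naively from memory, that gives

        \mathrm{AI} = \frac{\text{flops}}{\text{bytes moved}} \approx \frac{20}{16} \approx 1\ \text{flop/byte},

    but caching a tile of p particles on-chip and reusing each against many others multiplies this by p, so direct summation with p of order hundreds saturates the arithmetic units rather than the memory bus. Decision-heavy algorithms admit no such reuse, which is the paper's dividing line.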

    Mergers and ejections of black holes in globular clusters

    We report on results of fully consistent N-body simulations of globular cluster models with N = 100 000 members containing neutron stars and black holes. Using the improved 'algorithmic regularization' method of Hellström and Mikkola for compact subsystems, the new code NBODY7 enables, for the first time, general relativistic coalescence to be achieved for post-Newtonian terms and realistic parameters. Following an early stage of mass segregation, a few black holes form a small dense core which usually leads to the formation of one dominant binary. The subsequent evolution by dynamical shrinkage involves the competing processes of ejection and mergers induced by gravitational radiation energy loss. Unless the binary is ejected, long-lived triple systems often exhibit Kozai cycles with extremely high inner eccentricity (e > 0.999), which may terminate in coalescence at a few Schwarzschild radii. A characteristic feature is that ordinary stars as well as black holes and even BH binaries are ejected with high velocities. On the basis of the models studied so far, the results suggest a limited growth of a few remaining stellar-mass black holes in globular clusters.
    Comment: 8 pages, 9 figures, accepted by MNRAS, small typo corrected
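
    The significance of e > 0.999 can be made quantitative with the standard Peters (1964) estimate of the gravitational-wave coalescence time for a binary of masses m_1, m_2, semi-major axis a, and eccentricity e (quoted here as background, not taken from the paper):

        t_{\rm gw} \approx \frac{5}{256}\,\frac{c^5 a^4}{G^3 m_1 m_2 (m_1 + m_2)}\,\left(1 - e^2\right)^{7/2}.

    Driving e from 0 to 0.999 at fixed a shrinks t_gw by a factor (1 - e^2)^{7/2} of roughly 10^-9 to 10^-10, which is why Kozai cycles in long-lived triples can terminate in coalescence within a cluster lifetime.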

    A pilgrimage to gravity on GPUs

    In this short review we present the developments over the last five decades that have led to the use of Graphics Processing Units (GPUs) for astrophysical simulations. Since the introduction of NVIDIA's Compute Unified Device Architecture (CUDA) in 2007, the GPU has become a valuable tool for N-body simulations and is now so popular that almost all papers on high-precision N-body simulations use methods that are accelerated by GPUs. With GPU hardware becoming more advanced and being used for more sophisticated algorithms such as gravitational tree-codes, we see a bright future for GPU-like hardware in computational astrophysics.
    Comment: To appear in: European Physical Journal Special Topics: "Computer Simulations on Graphics Processing Units". 18 pages, 8 figures

    Dynamical Processes in Globular Clusters

    Globular clusters are among the most congested stellar systems in the Universe. Internal dynamical evolution drives them toward states of high central density, while simultaneously concentrating the most massive stars and binary systems in their cores. As a result, these clusters are expected to be sites of frequent close encounters and physical collisions between stars and binaries, making them efficient factories for the production of interesting and observable astrophysical exotica. I describe some elements of the competition among stellar dynamics, stellar evolution, and other processes that control globular cluster dynamics, with particular emphasis on pathways that may lead to the formation of blue stragglers.
    Comment: Chapter 10 in Ecology of Blue Straggler Stars, H.M.J. Boffin, G. Carraro & G. Beccari (Eds), Astrophysics and Space Science Library, Springer
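
    The clock for this internal evolution is the half-mass relaxation time; a standard estimate (Spitzer 1987, quoted as background rather than from the chapter) for N stars of mean mass m̄ inside the half-mass radius r_h is

        t_{\rm rh} \approx 0.138\,\frac{N^{1/2}\, r_h^{3/2}}{\bar{m}^{1/2}\, G^{1/2}\, \ln\Lambda},

    with ln Λ the Coulomb logarithm. For typical globular-cluster parameters this is of order a gigayear, comfortably shorter than cluster ages of ~10 Gyr, which is why mass segregation and core concentration have had time to operate.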

    N-body simulations of gravitational dynamics

    We describe the astrophysical and numerical basis of N-body simulations, both of collisional stellar systems (dense star clusters and galactic centres) and of collisionless stellar dynamics (galaxies and large-scale structure). We explain and discuss the state-of-the-art algorithms used for these quite different regimes, attempt to give a fair critique, and point out possible directions of future improvement and development. We briefly touch upon the history of N-body simulations and their most important results.
    Comment: invited review (28 pages), to appear in European Physical Journal Plus
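
    At the bottom of both regimes sits a simple, time-reversible integrator. A minimal kick-drift-kick leapfrog sketch in Python (illustrative only: production collisional codes use individual block time-steps and Hermite predictor-correctors instead); any acceleration routine with the signature of the direct-summation sketch given earlier will do:

        def leapfrog(pos, vel, mass, dt, n_steps, accel):
            """Second-order symplectic kick-drift-kick integrator.
            accel(pos, mass) must return per-particle accelerations;
            pos and vel are float arrays, updated in place."""
            acc = accel(pos, mass)
            for _ in range(n_steps):
                vel += 0.5 * dt * acc      # half kick
                pos += dt * vel            # full drift
                acc = accel(pos, mass)     # forces at the new positions
                vel += 0.5 * dt * acc      # closing half kick
            return pos, vel

    Its symplectic character (no secular energy drift) is why it dominates collisionless work, while collisional codes trade it away for higher accuracy per step.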

    Bcl-2 and β1-integrin predict survival in a tissue microarray of small cell lung cancer.

    INTRODUCTION: Survival in small cell lung cancer (SCLC) is limited by the development of chemoresistance. Factors associated with chemoresistance in vitro have been difficult to validate in vivo. Both Bcl-2 and β1-integrin have been identified as in vitro chemoresistance factors in SCLC, but their importance in patients remains uncertain. Tissue microarrays (TMAs) are useful for validating biomarkers, but no large TMA exists for SCLC. We designed an SCLC TMA to study potential biomarkers of prognosis and then used it to clarify the role of both Bcl-2 and β1-integrin in SCLC. METHODS: A TMA was constructed consisting of 184 cases of SCLC and stained for expression of Bcl-2 and β1-integrin. The slides were scored, and the role of the proteins in survival was determined using Cox regression analysis. A meta-analysis of the role of Bcl-2 expression in SCLC prognosis was performed based on published results. RESULTS: Both proteins were expressed at high levels in the SCLC cases. For Bcl-2 (n=140), the hazard ratio for death if the staining was weak in intensity was 0.55 (0.33-0.94, P=0.03), and for β1-integrin (n=151) it was 0.60 (0.39-0.92, P=0.02). The meta-analysis showed an overall hazard ratio for low expression of Bcl-2 of 0.91 (0.74-1.09). CONCLUSIONS: Both Bcl-2 and β1-integrin are independent prognostic factors in SCLC in this cohort, although further validation is required to confirm their importance. A TMA of SCLC cases is feasible but challenging, and an important tool for biomarker validation
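
    For readers unfamiliar with the method, a minimal sketch of a Cox proportional-hazards fit of the kind described, using the Python lifelines library; the data frame and column names are hypothetical stand-ins, not the study's data:

        import pandas as pd
        from lifelines import CoxPHFitter

        # Hypothetical cohort: follow-up time in months, an event flag
        # (1 = death observed, 0 = censored), and binary staining scores
        # for the two candidate markers.
        df = pd.DataFrame({
            "time_months":     [11, 4, 23, 8, 15, 30],
            "death":           [1, 1, 0, 1, 1, 0],
            "bcl2_weak":       [1, 0, 1, 0, 1, 1],
            "b1_integrin_low": [0, 1, 1, 0, 1, 0],
        })

        cph = CoxPHFitter()
        cph.fit(df, duration_col="time_months", event_col="death")
        cph.print_summary()  # hazard ratios exp(coef), CIs, p-values

    A hazard ratio below 1 for a covariate, as reported above for weak Bcl-2 staining, means that group experienced a lower instantaneous risk of death.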