Direct N-body code on low-power embedded ARM GPUs
This work arises in the context of the ExaNeSt project, which aims at the
design and development of an exascale-ready supercomputer with a low energy
consumption profile that is nevertheless able to support the most demanding
scientific and technical applications. The ExaNeSt compute unit consists of
densely packed low-power 64-bit ARM processors embedded within Xilinx FPGA
SoCs. SoC boards are heterogeneous architectures in which computing power is
supplied by both CPUs and GPUs, and they are emerging as a possible low-power
and low-cost alternative to clusters based on traditional CPUs. A
state-of-the-art direct N-body code suitable for astrophysical simulations has
been re-engineered to exploit SoC heterogeneous platforms based on ARM CPUs
and embedded GPUs. Performance tests show that embedded GPUs can be used
effectively to accelerate real-life scientific calculations, and that they are
promising also because of their energy efficiency, a crucial design constraint
for future exascale platforms.
Comment: 16 pages, 7 figures, 1 table, accepted for publication in the Computing Conference 2019 proceedings
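To make the computational kernel concrete, here is a minimal NumPy sketch of the O(N^2) direct-summation force loop that codes of this kind offload to the GPU; the function name, the softening length eps, and the G = 1 units are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def direct_accelerations(pos, mass, eps=1e-3):
    """O(N^2) direct-summation gravitational accelerations (G = 1).

    pos  : (N, 3) particle positions
    mass : (N,) particle masses
    eps  : Plummer softening length (illustrative value)
    """
    # Pairwise separation vectors r_ij = r_j - r_i, shape (N, N, 3)
    dr = pos[np.newaxis, :, :] - pos[:, np.newaxis, :]
    # Softened squared distances; the i == j terms are zeroed below
    r2 = np.einsum('ijk,ijk->ij', dr, dr) + eps**2
    inv_r3 = r2**-1.5
    np.fill_diagonal(inv_r3, 0.0)
    # a_i = sum_j m_j * r_ij / |r_ij|^3
    return np.einsum('ij,ijk->ik', mass * inv_r3, dr)

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    pos = rng.standard_normal((1024, 3))
    mass = np.full(1024, 1.0 / 1024)
    print(direct_accelerations(pos, mass).shape)  # (1024, 3)
```

Every pair interaction is independent, which is exactly why this loop maps well onto GPUs, embedded or otherwise.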
Accelerating NBODY6 with Graphics Processing Units
We describe the use of Graphics Processing Units (GPUs) for speeding up the
code NBODY6 which is widely used for direct N-body simulations. Over the
years, the N^2 nature of the direct force calculation has proved a barrier
for extending the particle number. Following an early introduction of force
polynomials and individual time-steps, the calculation cost was first reduced
by the introduction of a neighbour scheme. After a decade of GRAPE computers
which speeded up the force calculation further, we are now in the era of GPUs
where relatively small hardware systems are highly cost-effective. A
significant gain in efficiency is achieved by employing the GPU to obtain the
so-called regular force which typically involves some 99 percent of the
particles, while the remaining local forces are evaluated on the host. However,
the latter operation is performed up to 20 times more frequently and may still
account for a significant cost. This effort is reduced by parallel SSE/AVX
procedures where each interaction term is calculated using mainly single
precision. We also discuss further strategies connected with coordinate and
velocity prediction required by the integration scheme. This leaves hard
binaries and multiple close encounters which are treated by several
regularization methods. The present NBODY6-GPU code is well balanced for
simulations in the particle range 10^4 to 2 x 10^5 for a dual GPU system
attached to a standard PC.
Comment: 8 pages, 3 figures, 2 tables, MNRAS accepted
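The regular/irregular split described above is the essence of the Ahmad-Cohen neighbour scheme; below is a schematic sketch of that split under simplified, assumed conventions (a fixed neighbour radius r_nb, G = 1, and hypothetical names), without NBODY6's block time steps or force polynomials.

```python
import numpy as np

def split_forces(pos, mass, i, r_nb, eps=1e-4):
    """Schematic Ahmad-Cohen-style force split for particle i (G = 1).

    The 'irregular' force from neighbours inside radius r_nb is what the
    host recomputes frequently; the 'regular' force from the distant
    ~99 per cent of particles is what the GPU evaluates at longer intervals.
    """
    dr = pos - pos[i]
    r2 = (dr * dr).sum(axis=1) + eps**2
    near = r2 < r_nb**2
    near[i] = False                                  # exclude self-interaction
    far = ~near
    far[i] = False
    w = mass / r2**1.5
    f_irr = (w[near, None] * dr[near]).sum(axis=0)   # cheap, frequent
    f_reg = (w[far, None] * dr[far]).sum(axis=0)     # expensive, infrequent
    return f_irr, f_reg
```

In the real code the regular sum is refreshed on a longer time step than the neighbour sum, which is what makes offloading only the regular part to the GPU pay off.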
Analysing Astronomy Algorithms for GPUs and Beyond
Astronomy depends on ever increasing computing power. Processor clock-rates
have plateaued, and increased performance is now appearing in the form of
additional processor cores on a single chip. This poses significant challenges
to the astronomy software community. Graphics Processing Units (GPUs), now
capable of general-purpose computation, exemplify both the difficult
learning-curve and the significant speedups exhibited by massively-parallel
hardware architectures. We present a generalised approach to tackling this
paradigm shift, based on the analysis of algorithms. We describe a small
collection of foundation algorithms relevant to astronomy and explain how they
may be used to ease the transition to massively-parallel computing
architectures. We demonstrate the effectiveness of our approach by applying it
to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for
gravitational lensing, pulsar dedispersion and volume rendering. Algorithms
with well-defined memory access patterns and high arithmetic intensity stand to
receive the greatest performance boost from massively-parallel architectures,
while those that involve a significant amount of decision-making may struggle
to take advantage of the available processing power.
Comment: 10 pages, 3 figures, accepted for publication in MNRAS
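The closing claim about arithmetic intensity can be made concrete with a back-of-the-envelope roofline-style estimate; the flop and byte counts below are illustrative assumptions, not figures from the paper.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Arithmetic intensity in flops per byte (roofline-model metric)."""
    return flops / bytes_moved

# Direct N-body: each of the N^2 pair interactions costs ~20 flops,
# but only the N particle records (say 32 bytes each) cross memory,
# so intensity grows linearly with N -- compute-bound, GPU-friendly.
N = 10_000
nbody = arithmetic_intensity(20 * N * N, 32 * N)

# An element-wise streaming operation: ~1 flop per 8 bytes read and
# written -- memory-bound, a much weaker candidate for GPU speedup.
stream = arithmetic_intensity(1, 8)

print(f"direct N-body : {nbody:,.0f} flops/byte")
print(f"streaming op  : {stream:.3f} flops/byte")
```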
Mergers and ejections of black holes in globular clusters
We report on results of fully consistent N-body simulations of globular
cluster models with N = 100 000 members containing neutron stars and black
holes. Using the improved `algorithmic regularization' method of Hellström and
Mikkola for compact subsystems, the new code NBODY7 enables for the first time
general relativistic coalescence to be achieved for post-Newtonian terms and
realistic parameters. Following an early stage of mass segregation, a few black
holes form a small dense core which usually leads to the formation of one
dominant binary. The subsequent evolution by dynamical shrinkage involves the
competing processes of ejection and mergers by radiation energy loss. Unless
the binary is ejected, long-lived triple systems often exhibit Kozai cycles
with extremely high inner eccentricity (e > 0.999) which may terminate in
coalescence at a few Schwarzschild radii. A characteristic feature is that
ordinary stars as well as black holes and even BH binaries are ejected with
high velocities. On the basis of the models studied so far, the results suggest
a limited growth of a few remaining stellar mass black holes in globular
clusters.
Comment: 8 pages, 9 figures, accepted MNRAS, small typo corrected
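As an order-of-magnitude check on the quoted coalescence scale, the sketch below evaluates the Schwarzschild radius r_s = 2GM/c^2 for a few stellar-mass black holes; the chosen masses are illustrative assumptions, not values from the paper.

```python
# Schwarzschild radius r_s = 2GM/c^2 in SI units.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8          # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg

def schwarzschild_radius(m_kg):
    return 2.0 * G * m_kg / c**2

for m in (10, 30, 100):   # illustrative black-hole masses in M_sun
    r_km = schwarzschild_radius(m * M_SUN) / 1e3
    print(f"{m:4d} M_sun -> r_s = {r_km:6.1f} km")   # ~30 km at 10 M_sun
```

So "a few Schwarzschild radii" for a ~10 M_sun black hole means separations of order 100 km, tiny compared with any stellar-dynamical length scale in the cluster.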
A pilgrimage to gravity on GPUs
In this short review we present the developments over the last 5 decades that
have led to the use of Graphics Processing Units (GPUs) for astrophysical
simulations. Since the introduction of NVIDIA's Compute Unified Device
Architecture (CUDA) in 2007 the GPU has become a valuable tool for N-body
simulations and is so popular these days that almost all papers about high
precision N-body simulations use methods that are accelerated by GPUs. With the
GPU hardware becoming more capable and being used for more sophisticated
algorithms such as gravitational tree-codes, we see a bright future for
GPU-like hardware in computational astrophysics.
Comment: To appear in: European Physical Journal "Special Topics": "Computer Simulations on Graphics Processing Units". 18 pages, 8 figures
Dynamical Processes in Globular Clusters
Globular clusters are among the most congested stellar systems in the
Universe. Internal dynamical evolution drives them toward states of high
central density, while simultaneously concentrating the most massive stars and
binary systems in their cores. As a result, these clusters are expected to be
sites of frequent close encounters and physical collisions between stars and
binaries, making them efficient factories for the production of interesting and
observable astrophysical exotica. I describe some elements of the competition
among stellar dynamics, stellar evolution, and other processes that control
globular cluster dynamics, with particular emphasis on pathways that may lead
to the formation of blue stragglers.
Comment: Chapter 10, in Ecology of Blue Straggler Stars, H.M.J. Boffin, G. Carraro & G. Beccari (Eds), Astrophysics and Space Science Library, Springer
N-body simulations of gravitational dynamics
We describe the astrophysical and numerical basis of N-body simulations, both
of collisional stellar systems (dense star clusters and galactic centres) and
collisionless stellar dynamics (galaxies and large-scale structure). We explain
and discuss the state-of-the-art algorithms used for these quite different
regimes, attempt to give a fair critique, and point out possible directions of
future improvement and development. We briefly touch upon the history of N-body
simulations and their most important results.
Comment: invited review (28 pages), to appear in European Physical Journal Plus
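As a concrete example of the integrators such codes build on, here is a minimal kick-drift-kick leapfrog sketch; the function name and acceleration callback are placeholders, and production codes layer individual or block time steps on top of this basic scheme.

```python
def leapfrog_kdk(pos, vel, acc_func, dt, n_steps):
    """Kick-drift-kick leapfrog: second order and symplectic at fixed
    step size, which is why variants of it underpin most collisionless
    N-body codes. acc_func(pos) must return the accelerations."""
    acc = acc_func(pos)
    for _ in range(n_steps):
        vel = vel + 0.5 * dt * acc   # half kick
        pos = pos + dt * vel         # full drift
        acc = acc_func(pos)
        vel = vel + 0.5 * dt * acc   # half kick
    return pos, vel
```

Paired with a force routine such as the direct-summation sketch earlier in this listing, this already constitutes a complete, if basic, N-body integrator.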
Bcl-2 and β1-integrin predict survival in a tissue microarray of small cell lung cancer.
INTRODUCTION: Survival in small cell lung cancer (SCLC) is limited by the development of chemoresistance. Factors associated with chemoresistance in vitro have been difficult to validate in vivo. Both Bcl-2 and β(1)-integrin have been identified as in vitro chemoresistance factors in SCLC but their importance in patients remains uncertain. Tissue microarrays (TMAs) are useful for validating biomarkers, but no large TMA exists for SCLC. We designed an SCLC TMA to study potential biomarkers of prognosis and then used it to clarify the role of both Bcl-2 and β(1)-integrin in SCLC.
METHODS: A TMA was constructed consisting of 184 cases of SCLC and stained for expression of Bcl-2 and β(1)-integrin. The slides were scored and the role of the proteins in survival was determined using Cox regression analysis. A meta-analysis of the role of Bcl-2 expression in SCLC prognosis was performed based on published results.
RESULTS: Both proteins were expressed at high levels in the SCLC cases. For Bcl-2 (n=140), the hazard ratio for death if the staining was weak in intensity was 0.55 (0.33-0.94, P=0.03), and for β(1)-integrin (n=151) it was 0.60 (0.39-0.92, P=0.02). The meta-analysis showed an overall hazard ratio for low expression of Bcl-2 of 0.91 (0.74-1.09).
CONCLUSIONS: Both Bcl-2 and β(1)-integrin are independent prognostic factors in SCLC in this cohort, although further validation is required to confirm their importance. A TMA of SCLC cases is feasible, although challenging, and is an important tool for biomarker validation.
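For illustration, the sketch below shows fixed-effect inverse-variance pooling of log hazard ratios, the standard machinery behind a meta-analysis of this kind; the two inputs are the paper's own cohort estimates reused purely as example data, not the published studies that actually entered its meta-analysis.

```python
import math

def log_hr_and_se(hr, lo, hi):
    """Recover the log hazard ratio and its standard error from a 95% CI."""
    return math.log(hr), (math.log(hi) - math.log(lo)) / (2 * 1.96)

def pool_fixed_effect(estimates):
    """Fixed-effect inverse-variance pooling of (HR, lo, hi) tuples."""
    num = den = 0.0
    for hr, lo, hi in estimates:
        b, se = log_hr_and_se(hr, lo, hi)
        w = 1.0 / se**2          # weight = inverse variance of log-HR
        num += w * b
        den += w
    pooled, se_pooled = num / den, math.sqrt(1.0 / den)
    ci = (math.exp(pooled - 1.96 * se_pooled),
          math.exp(pooled + 1.96 * se_pooled))
    return math.exp(pooled), ci

# Example data only: the paper's two cohort estimates, not its meta-analysis set.
print(pool_fixed_effect([(0.55, 0.33, 0.94), (0.60, 0.39, 0.92)]))
```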