Analysing Astronomy Algorithms for GPUs and Beyond

Author: Amdahl, Asanovic, B. R. Barsdell, Bate, Belleman, Blelloch, Brunner, C. J. Fluke, Che, Clark, D. G. Barnes, Hamada, Harris, Högbom, Jonsson, Kayser, Knuth, Levoy, Moore, Nitadori, Schive, Schneider, Schneider, Taylor, Thompson, Wambsganss, Wayth
Publication venue: 'Wiley'
Publication date: 01/01/2010

Astronomy depends on ever increasing computing power. Processor clock-rates have plateaued, and increased performance is now appearing in the form of additional processor cores on a single chip. This poses significant challenges to the astronomy software community. Graphics Processing Units (GPUs), now capable of general-purpose computation, exemplify both the difficult learning-curve and the significant speedups exhibited by massively-parallel hardware architectures. We present a generalised approach to tackling this paradigm shift, based on the analysis of algorithms. We describe a small collection of foundation algorithms relevant to astronomy and explain how they may be used to ease the transition to massively-parallel computing architectures. We demonstrate the effectiveness of our approach by applying it to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for gravitational lensing, pulsar dedispersion and volume rendering. Algorithms with well-defined memory access patterns and high arithmetic intensity stand to receive the greatest performance boost from massively-parallel architectures, while those that involve a significant amount of decision-making may struggle to take advantage of the available processing power.

Comment: 10 pages, 3 figures, accepted for publication in MNRAS

arXiv.org e-Print Archive

CiteSeerX

Crossref

Swinburne Research Bank

The projector algorithm: a simple parallel algorithm for computing Voronoi diagrams and Delaunay graphs

Author: Reem Daniel
Publication date: 12/08/2018

The Voronoi diagram is a certain geometric data structure which has numerous applications in various scientific and technological fields. The theory of algorithms for computing 2D Euclidean Voronoi diagrams of point sites is rich and useful, with several different and important algorithms. However, this theory has been quite steady during the last few decades in the sense that no essentially new algorithms have entered the game. In addition, most of the known algorithms are serial in nature and hence cast inherent difficulties on the possibility to compute the diagram in parallel. In this paper we present the projector algorithm: a new and simple algorithm which enables the (combinatorial) computation of 2D Voronoi diagrams. The algorithm is significantly different from previous ones and some of the involved concepts in it are in the spirit of linear programming and optics. Parallel implementation is naturally supported since each Voronoi cell can be computed independently of the other cells. A new combinatorial structure for representing the cells (and any convex polytope) is described along the way and the computation of the induced Delaunay graph is obtained almost automatically.

Comment: This is a major revision; re-organization and better presentation of some parts; correction of several inaccuracies; improvement of some proofs and figures; added references; modification of the title; the paper is long but more than half of it is composed of proofs and references: it is sufficient to look at pages 5, 7--11 in order to understand the algorithm

arXiv.org e-Print Archive

Computational advances in gravitational microlensing: a comparison of CPU, GPU, and parallel, large data codes

Author: Abajas, Alcock, Anguita, Aubourg, B.R. Barsdell, Barnes, Bate, Belleman, Bond, C.J. Fluke, Calchi Novati, Chang, Chartas, Dai, Eigenbrod, Floyd, Ford, Fournier, G.F. Lewis, Garsden, Gould, Gould, H. Garsden, Harris, Irwin, Kayser, Keeton, Kochanek, Lewis, Lewis, Mediavilla, Mediavilla, Morgan, N.F. Bate, Owens, Poindexter, Pooley, Pooley, Rauch, Schive, Schneider, Schneider, Schneider, Sumi, Thompson, Udalski, Vanderriest, Wambsganss, Wambsganss, Wambsganss, Witt, Witt, Wyithe, Wyithe, Wyrzykowski
Publication venue: 'Elsevier BV'
Publication date: 01/01/2010

To assess how future progress in gravitational microlensing computation at high optical depth will rely on both hardware and software solutions, we compare a direct inverse ray-shooting code implemented on a graphics processing unit (GPU) with both a widely-used hierarchical tree code on a single-core CPU, and a recent implementation of a parallel tree code suitable for a CPU-based cluster supercomputer. We examine the accuracy of the tree codes through comparison with a direct code over a much wider range of parameter space than has been feasible before. We demonstrate that all three codes present comparable accuracy, and choice of approach depends on considerations relating to the scale and nature of the microlensing problem under investigation. On current hardware, there is little difference in the processing speed of the single-core CPU tree code and the GPU direct code, however the recent plateau in single-core CPU speeds means the existing tree code is no longer able to take advantage of Moore's law-like increases in processing speed. Instead, we anticipate a rapid increase in GPU capabilities in the next few years, which is advantageous to the direct code. We suggest that progress in other areas of astrophysical computation may benefit from a transition to GPUs through the use of "brute force" algorithms, rather than attempting to port the current best solution directly to a GPU language -- for certain classes of problems, the simple implementation on GPUs may already be no worse than an optimised single-core CPU version.

Comment: 11 pages, 4 figures, accepted for publication in New Astronomy

arXiv.org e-Print Archive

Crossref

Swinburne Research Bank

QuickCSG: Fast Arbitrary Boolean Combinations of N Solids

Author: Douze Matthijs, Franco Jean-Sébastien, Raffin Bruno
Publication date: 05/06/2017

QuickCSG computes the result for general N-polyhedron boolean expressions without an intermediate tree of solids. We propose a vertex-centric view of the problem, which simplifies the identification of final geometric contributions, and facilitates its spatial decomposition. The problem is then cast in a single KD-tree exploration, geared toward the result by early pruning of any region of space not contributing to the final surface. We assume strong regularity properties on the input meshes and that they are in general position. This simplifying assumption, in combination with our vertex-centric approach, improves the speed of the approach. Complemented with a task-stealing parallelization, the algorithm achieves breakthrough performance, one to two orders of magnitude speedups with respect to state-of-the-art CPU algorithms, on boolean operations over two to dozens of polyhedra. The algorithm also outperforms GPU implementations with approximate discretizations, while producing an output without redundant facets. Despite the restrictive assumptions on the input, we show the usefulness of QuickCSG for applications with large CSG problems and strong temporal constraints, e.g. modeling for 3D printers, reconstruction from visual hulls and collision detection.

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

A Geometry Model for Logarithmic-time Rendering

Author: Szécsi László
Publication date: 21/02/2014

Repository of the Academy's Library

Empirical Evaluation of the Parallel Distribution Sweeping Framework on Multicore Architectures

Author: A. Aggarwal, D. Ajwani, J. Singler, J.L. Bentley, K. Mehlhorn, S. Kang
Publication date: 01/01/2013

In this paper, we perform an empirical evaluation of the Parallel External Memory (PEM) model in the context of geometric problems. In particular, we implement the parallel distribution sweeping framework of Ajwani, Sitchinava and Zeh to solve the batched 1-dimensional stabbing max problem. While modern processors consist of sophisticated memory systems (multiple levels of caches, set associativity, TLB, prefetching), we empirically show that algorithms designed in simple models, that focus on minimizing the I/O transfers between shared memory and single level cache, can lead to efficient software on current multicore architectures. Our implementation exhibits significantly fewer accesses to slow DRAM and, therefore, outperforms traditional approaches based on plane sweep and two-way divide and conquer.

Comment: Longer version of ESA'13 paper

arXiv.org e-Print Archive

Crossref

The localized Delaunay triangulation and ad-hoc routing in heterogeneous environments

Author: Watson Mark Duncan
Publication venue: 'University of Saskatchewan Library'

Ad-Hoc Wireless routing has become an important area of research in the last few years due to the massive increase in wireless devices. Computational Geometry is relevant in attempts to build stable, low power routing schemes. It is only recently, however, that models have been expanded to consider devices with a non-uniform broadcast range, and few properties are known. In particular, we find, via both theoretical and experimental methods, extremal properties for the Localized Delaunay Triangulation over the Mutual Inclusion Graph. We also provide a distributed, sub-quadratic algorithm for the generation of the structure.

eCommons@USASK

University of Saskatchewan Research Archive