76 research outputs found

    Astrophysical Supercomputing with GPUs: Critical Decisions for Early Adopters

    General purpose computing on graphics processing units (GPGPU) is dramatically changing the landscape of high performance computing in astronomy. In this paper, we identify and investigate several key decision areas, with a goal of simplifying the early adoption of GPGPU in astronomy. We consider the merits of OpenCL as an open standard in order to reduce risks associated with coding in a native, vendor-specific programming environment, and present a GPU programming philosophy based on using brute force solutions. We assert that effective use of new GPU-based supercomputing facilities will require a change in approach from astronomers. This will likely include improved programming training, an increased need for software development best-practice through the use of profiling and related optimisation tools, and a greater reliance on third-party code libraries. As with any new technology, those willing to take the risks, and make the investment of time and effort to become early adopters of GPGPU in astronomy, stand to reap great benefits. Comment: 13 pages, 5 figures, accepted for publication in PAS

    Advanced Architectures for Astrophysical Supercomputing

    Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation -- performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures. Comment: 4 pages, 1 figure, to appear in the proceedings of ADASS XIX, Oct 4-8 2009, Sapporo, Japan (ASP Conf. Series

    Accelerating incoherent dedispersion

    Incoherent dedispersion is a computationally intensive problem that appears frequently in pulsar and transient astronomy. For current and future transient pipelines, dedispersion can dominate the total execution time, meaning its computational speed acts as a constraint on the quality and quantity of science results. It is thus critical that the algorithm be able to take advantage of trends in commodity computing hardware. With this goal in mind, we present analysis of the 'direct', 'tree' and 'sub-band' dedispersion algorithms with respect to their potential for efficient execution on modern graphics processing units (GPUs). We find all three to be excellent candidates, and proceed to describe implementations in C for CUDA using insight gained from the analysis. Using recent CPU and GPU hardware, the transition to the GPU provides a speed-up of 9x for the direct algorithm when compared to an optimised quad-core CPU code. For realistic recent survey parameters, these speeds are high enough that further optimisation is unnecessary to achieve real-time processing. Where further speed-ups are desirable, we find that the tree and sub-band algorithms are able to provide 3-7x better performance at the cost of certain smearing, memory consumption and development time trade-offs. We finish with a discussion of the implications of these results for future transient surveys. Our GPU dedispersion code is publicly available as a C library at: http://dedisp.googlecode.com/ Comment: 15 pages, 4 figures, 2 tables, accepted for publication in MNRA
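
The 'direct' algorithm named in the abstract above can be illustrated with a minimal CPU sketch in plain Python. This is not the authors' C for CUDA implementation: the function names `delay_samples` and `dedisperse_direct`, the `data[channel][time]` layout, and the choice of the highest frequency as the reference are assumptions made for illustration. The sketch uses the standard cold-plasma dispersion delay, shifts each frequency channel by its delay for one trial dispersion measure (DM), and sums across channels.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard value in pulsar astronomy).
K_DM = 4.148808e3


def delay_samples(dm, freq_mhz, ref_freq_mhz, dt_s):
    """Dispersion delay of freq_mhz relative to ref_freq_mhz, in whole samples.

    dm is in pc cm^-3, frequencies are in MHz, dt_s is the sampling time in s.
    """
    delay_s = K_DM * dm * (freq_mhz ** -2 - ref_freq_mhz ** -2)
    return int(round(delay_s / dt_s))


def dedisperse_direct(data, freqs_mhz, dm, dt_s):
    """Direct dedispersion for a single trial DM.

    data is a list of channels, each a list of time samples (data[chan][t]).
    Each channel is shifted by its delay relative to the highest frequency,
    then all channels are summed. Only output samples to which every channel
    contributes are returned.
    """
    ref = max(freqs_mhz)  # assumption: reference is the top of the band
    delays = [delay_samples(dm, f, ref, dt_s) for f in freqs_mhz]
    n_out = len(data[0]) - max(delays)
    return [
        sum(chan[t + d] for chan, d in zip(data, delays))
        for t in range(n_out)
    ]
```

A real search repeats this for many trial DMs; every output sample is independent of the others, which is the kind of parallelism that maps naturally onto a GPU.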

    Three-dimensional shapelets and an automated classification scheme for dark matter haloes

    We extend the two-dimensional Cartesian shapelet formalism to d-dimensions. Concentrating on the three-dimensional case, we derive shapelet-based equations for the mass, centroid, root-mean-square radius, and components of the quadrupole moment and moment of inertia tensors. Using cosmological N-body simulations as an application domain, we show that three-dimensional shapelets can be used to replicate the complex sub-structure of dark matter halos and demonstrate the basis of an automated classification scheme for halo shapes. We investigate the shapelet decomposition process from an algorithmic viewpoint, and consider opportunities for accelerating the computation of shapelet-based representations using graphics processing units (GPUs). Comment: 19 pages, 11 figures, accepted for publication in MNRA

    Spotting Radio Transients with the help of GPUs

    Exploration of the time-domain radio sky has huge potential for advancing our knowledge of the dynamic universe. Past surveys have discovered large numbers of pulsars, rotating radio transients and other transient radio phenomena; however, they have typically relied upon off-line processing to cope with the high data and processing rate. This paradigm rules out the possibility of obtaining high-resolution base-band dumps of significant events or of performing immediate follow-up observations, limiting analysis power to what can be gleaned from detection data alone. To overcome this limitation, real-time processing and detection of transient radio events is required. By exploiting the significant computing power of modern graphics processing units (GPUs), we are developing a transient-detection pipeline that runs in real-time on data from the Parkes radio telescope. In this paper we discuss the algorithms used in our pipeline, the details of their implementation on the GPU and the challenges posed by the presence of radio frequency interference. Comment: 4 Pages. To appear in the proceedings of ADASS XXI, ed. P.Ballester and D.Egret, ASP Conf. Serie

    Real-Time RFI Mitigation for the Apertif Radio Transient System

    Current and upcoming radio telescopes are being designed with increasing sensitivity to detect new and mysterious radio sources of astrophysical origin. While this increased sensitivity improves the likelihood of discoveries, it also makes these instruments more susceptible to the deleterious effects of Radio Frequency Interference (RFI). The challenge posed by RFI is exacerbated by the high data-rates achieved by modern radio telescopes, which require real-time processing to keep up with the data. Furthermore, the high data-rates do not allow for permanent storage of observations at high resolution. Offline RFI mitigation is therefore no longer possible. The real-time requirement makes RFI mitigation even more challenging because, on the one hand, the techniques used for mitigation need to be fast and simple, and on the other hand they also need to be robust enough to cope with just a partial view of the data. The Apertif Radio Transient System (ARTS) is the real-time, time-domain, transient detection instrument of the Westerbork Synthesis Radio Telescope (WSRT), processing 73 Gb of data per second. Even with a deep learning classifier, the ARTS pipeline requires state-of-the-art real-time RFI mitigation to reduce the number of false-positive detections. Our solution to this challenge is RFIm, a high-performance, open-source, tuned, and extensible RFI mitigation library. The goal of this library is to provide users with RFI mitigation routines that are designed to run in real-time on many-core accelerators, such as Graphics Processing Units, and that can be highly-tuned to achieve code and performance portability to different hardware platforms and scientific use-cases. Results on the ARTS show that we can achieve real-time RFI mitigation, with a minimal impact on the total execution time of the search pipeline, and considerably reduce the number of false-positives. Comment: 6 pages, 10 figures. To appear in Proceedings from the 2019 Radio Frequency Interference workshop (RFI 2019), Toulouse, France (23-26 September