11 research outputs found

    The UTMOST: A hybrid digital signal processor transforms the MOST

    Get PDF
    The Molonglo Observatory Synthesis Telescope (MOST) is an 18,000 square meter radio telescope situated some 40 km from the city of Canberra, Australia. Its operating band (820-850 MHz) is now partly allocated to mobile phone communications, making radio astronomy challenging. We describe how the deployment of new digital receivers (RX boxes), Field Programmable Gate Array (FPGA) based filterbanks and server-class computers equipped with 43 GPUs (Graphics Processing Units) has transformed MOST into a versatile new instrument (the UTMOST) for studying the dynamic radio sky on millisecond timescales, ideal for work on pulsars and Fast Radio Bursts (FRBs). The filterbanks, servers and their high-speed, low-latency network form part of a hybrid solution to the observatory's signal processing requirements. The emphasis on software and commodity off-the-shelf hardware has enabled rapid deployment through the re-use of proven 'software backends' for its signal processing. The new receivers have ten times the bandwidth of the original MOST and double the sampling of the line feed, which doubles the field of view. The UTMOST can simultaneously excise interference, make maps, coherently dedisperse pulsars, and perform real-time searches of coherent fan beams for dispersed single pulses. Although system performance is still sub-optimal, a pulsar timing and FRB search programme has commenced and the first UTMOST maps have been made. The telescope operates as a robotic facility, deciding how to efficiently target pulsars and how long to stay on source, via feedback from real-time pulsar folding. The regular timing of over 300 pulsars has resulted in the discovery of 7 pulsar glitches and 3 FRBs. The UTMOST demonstrates that if sufficient signal processing can be applied to the voltage streams it is possible to perform innovative radio science in hostile radio frequency environments.Comment: 12 pages, 6 figure

    The UTMOST: A Hybrid Digital Signal Processor Transforms the Molonglo Observatory Synthesis Telescope

    Get PDF
    The Molonglo Observatory Synthesis Telescope (MOST) is an 18000 m^2 radio telescope located 40 km from Canberra, Australia. Its operating band (820–851 MHz) is partly allocated to telecommunications, making radio astronomy challenging. We describe how the deployment of new digital receivers, Field Programmable Gate Array-based filterbanks, and server-class computers equipped with 43 Graphics Processing Units, has transformed the telescope into a versatile new instrument (UTMOST) for studying the radio sky on millisecond timescales. UTMOST has 10 times the bandwidth and double the field of view compared to the MOST, and voltage record and playback capability has facilitated rapid implementaton of many new observing modes, most of which operate commensally. UTMOST can simultaneously excise interference, make maps, coherently dedisperse pulsars, and perform real-time searches of coherent fan-beams for dispersed single pulses. UTMOST operates as a robotic facility, deciding how to efficiently target pulsars and how long to stay on source via real-time pulsar folding, while searching for single pulse events. Regular timing of over 300 pulsars has yielded seven pulsar glitches and three Fast Radio Bursts during commissioning. UTMOST demonstrates that if sufficient signal processing is applied to voltage streams, innovative science remains possible even in hostile radio frequency environments

    Energy reconstruction on the LHC ATLAS TileCal upgraded front end: feasibility study for a sROD co-processing unit

    Get PDF
    Dissertation presented in ful lment of the requirements for the degree of: Master of Science in Physics 2016The Phase-II upgrade of the Large Hadron Collider at CERN in the early 2020s will enable an order of magnitude increase in the data produced, unlocking the potential for new physics discoveries. In the ATLAS detector, the upgraded Hadronic Tile Calorimeter (TileCal) Phase-II front end read out system is currently being prototyped to handle a total data throughput of 5.1 TB/s, from the current 20.4 GB/s. The FPGA based Super Read Out Driver (sROD) prototype must perform an energy reconstruction algorithm on 2.88 GB/s raw data, or 275 million events per second. Due to the very high level of pro ciency required and time consuming nature of FPGA rmware development, it may be more e ective to implement certain complex energy reconstruction and monitoring algorithms on a general purpose, CPU based sROD co-processor. Hence, the feasibility of a general purpose ARM System on Chip based co-processing unit (PU) for the sROD is determined in this work. A PCI-Express test platform was designed and constructed to link two ARM Cortex-A9 SoCs via their PCI-Express Gen-2 x1 interfaces. Test results indicate that the latency of the PCI-Express interface is su ciently low and the data throughput is superior to that of alternative interfaces such as Ethernet, for use as an interconnect for the SoCs to the sROD. CPU performance benchmarks were performed on ve ARM development platforms to determine the CPU integer, oating point and memory system performance as well as energy e ciency. To complement the benchmarks, Fast Fourier Transform and Optimal Filtering (OF) applications were also tested. Based on the test results, in order for the PU to process 275 million events per second with OF, within the 6 s timing budget of the ATLAS triggering system, a cluster of three Tegra-K1, Cortex-A15 SoCs connected to the sROD via a Gen-2 x8 PCI-Express interface would be suitable. A high level design for the PU is proposed which surpasses the requirements for the sROD co-processor and can also be used in a general purpose, high data throughput system, with 80 Gb/s Ethernet and 15 GB/s PCI-Express throughput, using four X-Gene SoCs

    Radio-Astronomical Imaging on Accelerators

    Get PDF
    Imaging is considered the most compute-intensive and therefore most challenging part of a radio-astronomical data-processing pipeline. To reach the high dynamic ranges imposed by the high sensitivity and large field of view of the new generation of radio telescopes such as the Square Kilometre Array (SKA), we need to be able to correct for direction-independent effects (DIEs) such as the curvature of the earth as well as for direction-dependent time-varying effects (DDEs) such as those caused by the ionosphere during imaging. The novel Image-Domain gridding (IDG) algorithm was designed to avoid the performance bottlenecks of traditional imaging algorithms. We implement, optimize, and analyze the performance and energy efficiency of IDG on a variety of hardware platforms to find which platform is the best for IDG. We analyze traditional CPUs, as well as several accelerators architectures. IDG alleviates the limitations of traditional imaging algorithms while it enables the advantages of GPU acceleration: better performance at lower power consumption. The hardware-software co-design has resulted in a highly efficient imager. This makes IDG on GPUs an ideal candidate for meeting the computational and energy efficiency constraints of the SKA. IDG has been integrated with a widely-used astronomical imager (WSClean) and is now being used in production by a variety of different radio observatories such as LOFAR and the MWA. It is not only faster and more energy-efficient than its competitors, but it also produces better quality images

    Fast algorithm for real-time rings reconstruction

    Get PDF
    The GAP project is dedicated to study the application of GPU in several contexts in which real-time response is important to take decisions. The definition of real-time depends on the application under study, ranging from answer time of ÎĽs up to several hours in case of very computing intensive task. During this conference we presented our work in low level triggers [1] [2] and high level triggers [3] in high energy physics experiments, and specific application for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6]. Apart from the study of dedicated solution to decrease the latency due to data transport and preparation, the computing algorithms play an essential role in any GPU application. In this contribution, we show an original algorithm developed for triggers application, to accelerate the ring reconstruction in RICH detector when it is not possible to have seeds for reconstruction from external trackers

    Scientific Programming and Computer Architecture

    Get PDF
    A variety of programming models relevant to scientists explained, with an emphasis on how programming constructs map to parts of the computer.What makes computer programs fast or slow? To answer this question, we have to get behind the abstractions of programming languages and look at how a computer really works. This book examines and explains a variety of scientific programming models (programming models relevant to scientists) with an emphasis on how programming constructs map to different parts of the computer's architecture. Two themes emerge: program speed and program modularity. Throughout this book, the premise is to "get under the hood," and the discussion is tied to specific programs. The book digs into linkers, compilers, operating systems, and computer architecture to understand how the different parts of the computer interact with programs. It begins with a review of C/C++ and explanations of how libraries, linkers, and Makefiles work. Programming models covered include Pthreads, OpenMP, MPI, TCP/IP, and CUDA.The emphasis on how computers work leads the reader into computer architecture and occasionally into the operating system kernel. The operating system studied is Linux, the preferred platform for scientific computing. Linux is also open source, which allows users to peer into its inner workings. A brief appendix provides a useful table of machines used to time programs. The book's website (https://github.com/divakarvi/bk-spca) has all the programs described in the book as well as a link to the html text
    corecore