The Green500 List: Escapades to Exascale
Energy efficiency is now a top priority. The first four years of the Green500 have seen the importance of energy efficiency in supercomputing grow from an afterthought to the forefront of innovation as we near a point where systems will be forced to stop drawing more power. Even so, the landscape of efficiency in supercomputing continues to shift, with new trends emerging and unexpected departures from previous predictions.
This paper offers an in-depth analysis of the new and shifting trends in the Green500. In addition, the analysis offers early indications of the track we are taking toward exascale, and what an exascale machine in 2018 is likely to look like. Lastly, we discuss the new efforts and collaborations toward designing and establishing better metrics, methodologies, and workloads for the measurement and analysis of energy-efficient supercomputing.
Taming Multi-core Parallelism with Concurrent Mixin Layers
The recent shift in computer system design to multi-core technology requires that the developer leverage explicit parallel programming techniques in order to utilize available performance. Nevertheless, developing the requisite parallel applications remains a prohibitively difficult undertaking, particularly for the general programmer. To mitigate many of the challenges in creating concurrent software, this paper introduces a new parallel programming methodology that leverages feature-oriented programming (FOP) to logically decompose a product line architecture (PLA) into concurrent execution units. In addition, our efficient implementation of this methodology, which we call concurrent mixin layers, uses a layered architecture to facilitate the development of parallel applications. To validate our methodology and accompanying implementation, we present a case study of a product line of multimedia applications deployed within a typical multi-core environment. Our performance results demonstrate that a product line can be effectively transformed into parallel applications capable of utilizing multiple cores, thus improving performance. Furthermore, concurrent mixin layers significantly reduces the complexity of parallel programming by eliminating the need for the programmer to introduce explicit low-level concurrency control. Our initial experience gives us reason to believe that concurrent mixin layers is a promising technique for taming parallelism in multi-core environments.
Accelerating Electrostatic Surface Potential Calculation with Multiscale Approximation on Graphics Processing Units
Tools that compute and visualize biomolecular electrostatic surface potential have been used extensively for studying biomolecular function. However, determining the surface potential for large biomolecules on a typical desktop computer can take days or longer using currently available tools and methods. This paper demonstrates how one can take advantage of graphics processing units (GPUs) available in today's typical desktop computer, together with a multiscale approximation method, to significantly speed up such computations. Specifically, the electrostatic potential computation, using an analytical linearized Poisson-Boltzmann (ALPB) method, is implemented on an ATI Radeon 4870 GPU in combination with the hierarchical charge partitioning (HCP) multiscale approximation. This implementation delivers a combined 1800-fold speedup for a 476,040-atom viral capsid.
GPU First -- Execution of Legacy CPU Codes on GPUs
Utilizing GPUs is critical for high performance on heterogeneous systems.
However, leveraging the full potential of GPUs for accelerating legacy CPU
applications can be a challenging task for developers. The porting process
requires identifying code regions amenable to acceleration, managing distinct
memories, synchronizing host and device execution, and handling library
functions that may not be directly executable on the device. This complexity
makes it challenging for non-experts to leverage GPUs effectively, or even to
start offloading parts of a large legacy application.
In this paper, we propose a novel compilation scheme called "GPU First" that
automatically compiles legacy CPU applications directly for GPUs without any
modification of the application source. Library calls inside the application
are either resolved through our partial libc GPU implementation or via
automatically generated remote procedure calls to the host. Our approach
simplifies the task of identifying code regions amenable to acceleration and
enables rapid testing of code modifications on actual GPU hardware in order to
guide porting efforts.
Our evaluation covers two HPC proxy applications with OpenMP CPU and GPU
parallelism, four microbenchmarks with originally GPU-only parallelism, and
three benchmarks from the SPEC OMP 2012 suite featuring hand-optimized OpenMP
CPU parallelism, showcasing the simplicity of porting host applications to the
GPU. For existing parallel loops, we often match the performance of
corresponding manually offloaded kernels, with up to a 14.36x speedup on the
GPU, validating that our GPU First methodology can effectively guide porting
efforts for large legacy applications.
Defer Mechanism for {C}
The defer mechanism can restore a previously known property or invariant that is altered during the processing of a code block. The defer mechanism is useful for paired operations, where one operation is performed at the start of a code block and the paired operation is performed before exiting the block. Because blocks can be exited using a variety of mechanisms, operations are frequently paired incorrectly. The defer mechanism in C is intended to help ensure the proper pairing of these operations. This pattern is common in resource management, synchronization, and outputting balanced strings (e.g., parentheses or HTML). A separable feature of the defer mechanism is a panic/recover mechanism that allows error handling at a distance.