A Compiler-based Framework For Automatic Extraction Of Program Skeletons For Exascale Hardware/software Co-design

Author: Dakshinamurthy Amruth Rudraiah
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2013

The design of high-performance computing architectures requires performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. The process of analyzing and benchmarking an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is to do a coarse-grained study of large-scale parallel applications through the use of program skeletons. The concept of a “program skeleton” that we discuss in this paper is an abstracted program derived from a larger program, in which source code determined to be irrelevant for the purposes of the skeleton is removed. In this work, we develop a semi-automatic approach for extracting program skeletons based on compiler program analysis. We demonstrate the correctness of our skeleton extraction process by comparing details from communication traces, and we show the performance speedup of using skeletons by running simulations in the SST/macro simulator. Extracting such a program skeleton from a large-scale parallel program requires a substantial amount of manual effort and often introduces human errors. We outline a semi-automatic approach for extracting program skeletons from large-scale parallel applications that reduces cost and eliminates errors inherent in manual approaches. Our skeleton generation approach is based on the extensible and open-source ROSE compiler infrastructure, which allows us to perform flow and dependency analysis on larger programs in order to determine what code can be removed from the program to generate a skeleton.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Rudder roll stabilization for ships

Author: Amerongen J. van
Klugt P.G.M. van der
Nauta Lemke H.R. van
Publication venue: Pergamon Press
Publication date: 01/01/1990

This paper describes the design of an autopilot for rudder roll stabilization for ships. This autopilot uses the rudder not only for course keeping but also for reduction of the roll. The system has a series of properties which make the controller design far from straightforward: the process has only one input (the rudder angle) and two outputs (the heading and the roll angle); the transfer from rudder to roll is non-minimum-phase; because large and high-frequency rudder motions are necessary, the non-linearities of the steering machine cannot be disregarded; the disturbances caused by the waves vary considerably in amplitude and frequency spectrum. In order to solve these problems a new approach to the LQG method has been developed. The control algorithms were tested by means of computer simulations, scale-model experiments and full-scale trials at sea. The results indicate that a rudder roll stabilization system is able to reduce the roll as well as a conventional fin stabilization system, while it requires less investment. Based on the results obtained in this project the Royal Netherlands Navy has decided to implement rudder roll stabilization on a series of ships under construction at this moment.

University of Twente Research Information

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

Author: Blazewicz Marek
Brandt Steven R.
Ciznicki Milosz
Hinder Ian
Kierzynka Michal
Koppelman David M.
Löffler Frank
Schnetter Erik
Tao Jian
Publication venue: 'IOS Press'
Publication date: 01/01/2013

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms. Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Louisiana State University

MPG.PuRe

Application of temporal streamflow descriptors in hydrologic model parameter estimation

Author: Gupta HV
Imam B
Shamir E
Sorooshian S
Publication venue: eScholarship, University of California
Publication date: 01/06/2005

This paper presents a parameter estimation approach based on hydrograph descriptors that capture dominant streamflow characteristics at three timescales (monthly, yearly, and record extent). The scheme, entitled hydrograph descriptors multitemporal sensitivity analyses (HYDMUS), yields an ensemble of model simulations generated from a reduced parameter space, based on a set of streamflow descriptors that emphasize the timescale dynamics of the streamflow record. In this procedure the posterior distributions of model parameters derived at coarser timescales are used to sample model parameters for the next finer timescale. The procedure was used to estimate the parameters of the Sacramento soil moisture accounting model (SAC-SMA) for the Leaf River, Mississippi. The results indicated that in addition to a significant reduction in the range of parameter uncertainty, HYDMUS improved parameter identifiability for all 13 of the model parameters. The performance of the procedure was compared to four previous calibration studies on the same watershed. Although our application of HYDMUS did not explicitly consider the error at each simulation time step during the calibration process, the model performance was, in some important respects, found to be better than in previous deterministic studies. Copyright 2005 by the American Geophysical Union.

eScholarship - University of California

Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

Author: A Arnold
A Faradjian
B Hess
C Schütte
G Wilson
JA Anderson
JC Phillips
KJ Bowers
L Verlet
M Eleftheriou
M Shirts
MJ Abraham
P Eastman
R Yokota
S Pronk
S Páll
U Essmann
W Humphrey
WM Brown
Y Andoh
Y Sugita
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighborsearching, and we discuss the present and future challenges we see for exascale simulation, in particular a very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity. Comment: EASC 2014 conference proceedin

arXiv.org e-Print Archive

Publikationer från KTH

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

MPG.PuRe