5 research outputs found

    Performance Variability of Highly Parallel Architectures

    The design and evaluation of high-performance computers has concentrated on increasing computational speed for applications. This performance is often measured on a well-configured, dedicated system to show the best case. In real environments, resources are not always dedicated to a single task, and systems run tasks that may influence each other, so run times vary, sometimes to an unreasonably large extent. This paper systematically explores the amount of variation seen across four large distributed-memory systems. It then analyzes the causes of the observed variation and discusses what can be done to decrease it without impacting performance.
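
    As a rough illustration of the kind of measurement the paper describes (not taken from the paper; run_benchmark is a hypothetical stand-in workload), the sketch below repeats a job and reports the coefficient of variation of its wall-clock times, a common way to summarize run-to-run variability.

        # Minimal sketch: quantify run-to-run variation of a benchmark by
        # repeating it and reporting the coefficient of variation
        # (stddev / mean) of wall-clock times.
        import statistics
        import time

        def run_benchmark():
            # Placeholder workload; a real study would launch the target application.
            total = 0
            for i in range(1_000_000):
                total += i * i
            return total

        def measure_variability(repetitions=30):
            times = []
            for _ in range(repetitions):
                start = time.perf_counter()
                run_benchmark()
                times.append(time.perf_counter() - start)
            mean = statistics.mean(times)
            cov = statistics.stdev(times) / mean
            return mean, cov

        if __name__ == "__main__":
            mean, cov = measure_variability()
            print(f"mean run time: {mean:.4f} s, coefficient of variation: {cov:.2%}")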

    A high-performance computing framework for Monte Carlo ocean color simulations

    This paper presents a high-performance computing (HPC) framework for Monte Carlo (MC) simulations in the ocean color (OC) application domain. The objective is to optimize a parallel MC radiative transfer code named MOX, developed by the authors to create a virtual marine environment for investigating the quality of OC data products derived from in situ measurements of in-water radiometric quantities. A consolidated set of solutions for performance modeling, prediction, and optimization is implemented to enhance the efficiency of MC OC simulations on HPC run-time infrastructures. HPC, machine learning, and adaptive computing techniques are applied, taking into account a clear separation and systematic treatment of accuracy and precision requirements for large-scale MC OC simulations. The added value of the work is the integration of computational methods and tools for MC OC simulations in the form of an HPC-oriented problem-solving environment specifically tailored to investigate data acquisition and reduction methods for OC field measurements. Study results highlight the benefit of close collaboration between HPC and application domain researchers to improve the efficiency and flexibility of computer simulations in the marine optics application domain. (C) 2016 The Authors. Concurrency and Computation: Practice and Experience, published by John Wiley & Sons Ltd.
    Funding: Portuguese Foundation for Science and Technology (FCT/MEC) [PEst-OE/EEI/UI0527/2011]; ESA [22576/09/I-OL, ARG/003-025/1406/CIMA]; NOVA LINCS [UID/CEC/04516/2013].
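
    The abstract's separation of accuracy and precision rests on the usual Monte Carlo property that the standard error of an estimate shrinks as 1/sqrt(N). The sketch below is a toy estimator of that scaling (my own assumption-laden example, not the MOX code): it sizes a run from a pilot sample so that a target relative precision is reached.

        # Toy sketch: estimate how many Monte Carlo samples are needed to
        # reach a target relative precision, using 1/sqrt(N) error scaling.
        import math
        import random

        def mc_estimate(n_samples):
            # Stand-in for a radiative transfer quantity: mean of a toy random variable.
            samples = [random.random() ** 2 for _ in range(n_samples)]
            mean = sum(samples) / n_samples
            var = sum((s - mean) ** 2 for s in samples) / (n_samples - 1)
            std_err = math.sqrt(var / n_samples)
            return mean, std_err

        def samples_for_precision(target_rel_err, pilot=10_000):
            mean, std_err = mc_estimate(pilot)
            current_rel_err = std_err / mean
            # Error shrinks as 1/sqrt(N), so scale the pilot run accordingly.
            return int(pilot * (current_rel_err / target_rel_err) ** 2)

        if __name__ == "__main__":
            print("samples needed for 0.1% relative error:",
                  samples_for_precision(0.001))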

    Analyzing the Combined Effects of Measurement Error and Perturbation Error on Performance Measurement

    Dynamic performance analysis of executing programs commonly relies on statistical profiling techniques to provide performance measurement results. When a program execution is sampled, we learn something about the examined program, but we also change, to some extent, the program's interaction with the underlying system and thus its behavior. The amount we learn diminishes (statistically) with each sample taken, while the change we effect with the intrusive sampling risks growing larger. Effectively sampling programs is challenging largely because of these opposing effects: decreasing sampling error and increasing perturbation error. Achieving the highest overall level of confidence in measurement results requires striking an appropriate balance between the tensions inherent in these two types of errors. Despite the popularity of statistical profiling, published material typically explains only in general, qualitative terms the motivation for the sampling rates used. Given the importance of sampling, we argue for the general principle of deliberate sample size selection and have developed and tested a technique for doing so. We present a sample rate selection approach based on abstract and mathematical performance measurement models we developed that incorporate the effect of sampling on both measurement accuracy and perturbation. Our mathematical model predicts the sample size at which the combination of the residual measurement error and the accumulating perturbation error is minimized. Our evaluation of the model with simulation, calibration programs, and selected programs from the SPEC CPU 2006 and SPEC OMP 2001 benchmark suites indicates that this idea has promise. Our results show that the predicted sample size is generally close to the best sampling rate and effectively avoids bad choices. Most importantly, adaptive sample rate selection is shown to perform better than a single selected rate in most cases.
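
    The trade-off the abstract describes can be made concrete with a toy model (my own simplification with made-up coefficients, not the paper's model): a sampling error that falls as 1/sqrt(n) combined with a perturbation error that grows roughly linearly in n has an interior minimum, and that minimizing n plays the role of the predicted sample size.

        # Toy sketch of the sampling-vs-perturbation trade-off.
        import math

        def combined_error(n, stat_coeff=1.0, perturb_coeff=1e-4):
            sampling_error = stat_coeff / math.sqrt(n)   # shrinks with more samples
            perturbation_error = perturb_coeff * n       # grows with intrusion
            return sampling_error + perturbation_error

        def best_sample_count(max_n=100_000):
            # Brute-force search for the sample count minimizing the combined error.
            return min(range(1, max_n), key=combined_error)

        if __name__ == "__main__":
            n = best_sample_count()
            print(f"predicted sample count: {n}, combined error: {combined_error(n):.4f}")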

    Summarizing multiprocessor program execution with versatile, microarchitecture-independent snapshots

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 131-137). By Kenneth C. Barr.
    Computer architects rely heavily on software simulation to evaluate, refine, and validate new designs before they are implemented. However, simulation time continues to increase as computers become more complex and multicore designs become more common. This thesis investigates software structures and algorithms for quickly simulating modern cache-coherent multiprocessors by amortizing the time spent to simulate the memory system and branch predictors. The Memory Timestamp Record (MTR) summarizes the directory and cache state of a multiprocessor system in a compact data structure. A single MTR snapshot is versatile enough to reconstruct the microarchitectural state resulting from various coherence protocols and cache organizations. The MTR may be quickly updated by each simulated processor during a fast-forwarding phase and optionally stored off-line for reuse. To fill large branch prediction tables, we introduce Branch Predictor-based Compression (BPC), which compactly stores a branch trace so that it may be used to fill in any branch predictor structure. An entire BPC trace requires less space than single discrete predictor snapshots, and it may be decompressed 3-6x faster than performing functional simulation.
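
    To make the idea of a timestamp-based snapshot concrete, here is a toy illustration (not the thesis's MTR implementation, and ignoring cache capacity, evictions, and real coherence protocols): per-block write and access timestamps are enough to later ask which processors could still hold a valid copy of a block.

        # Toy timestamp record sketch: track last write and per-CPU last access
        # for each memory block, then reconstruct possible sharers on demand.
        from dataclasses import dataclass, field

        @dataclass
        class BlockRecord:
            last_write_ts: int = -1
            last_access_ts: dict = field(default_factory=dict)  # cpu -> timestamp

        class MemoryTimestampRecord:
            def __init__(self):
                self.blocks = {}   # block address -> BlockRecord
                self.clock = 0

            def access(self, cpu, addr, is_write):
                rec = self.blocks.setdefault(addr, BlockRecord())
                rec.last_access_ts[cpu] = self.clock
                if is_write:
                    rec.last_write_ts = self.clock
                self.clock += 1

            def holders(self, addr):
                # CPUs whose last access is no older than the last write may
                # still hold a valid copy (ignoring capacity/conflict evictions).
                rec = self.blocks.get(addr)
                if rec is None:
                    return []
                return [cpu for cpu, ts in rec.last_access_ts.items()
                        if ts >= rec.last_write_ts]

        if __name__ == "__main__":
            mtr = MemoryTimestampRecord()
            mtr.access(cpu=0, addr=0x100, is_write=True)
            mtr.access(cpu=1, addr=0x100, is_write=False)
            print("possible sharers of 0x100:", mtr.holders(0x100))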