50 research outputs found

    Iso-energy-efficiency: An approach to power-constrained parallel computation

    Future large-scale high-performance supercomputer systems require high energy efficiency to achieve exaflops computational power and beyond. Despite the need to understand energy efficiency in high-performance systems, there are few techniques to evaluate energy efficiency at scale. In this paper, we propose a system-level iso-energy-efficiency model to analyze, evaluate and predict the energy-performance of data-intensive parallel applications with various execution patterns running on large-scale power-aware clusters. Our analytical model helps users explore the effects of machine- and application-dependent characteristics on system energy efficiency and isolate efficient ways to scale system parameters (e.g., processor count, CPU power/frequency, workload size and network bandwidth) to balance energy use and performance. We derive our iso-energy-efficiency model and apply it to the NAS Parallel Benchmarks on two power-aware clusters. Our results indicate that the model accurately predicts total system energy consumption within 5% error on average for parallel applications with various execution and communication patterns. We demonstrate effective use of the model for various application contexts and in scalability decision-making.
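
    The abstract describes the iso-energy-efficiency model only at a high level, so the sketch below is not the paper's formulation. It is a minimal Python illustration, under assumed constants and functional forms, of how one might scan processor count and CPU frequency against a simple analytical runtime/energy model to see where energy efficiency is roughly preserved.

```python
# Toy illustration (not the paper's model): a rough analytical sketch of how
# energy efficiency might be explored as processor count and CPU frequency scale.
# All constants and functional forms below are hypothetical assumptions.

def runtime(p, f, work=1e12, comm_per_proc=0.05):
    """Estimated runtime: compute scales with 1/(p*f); communication grows with p."""
    compute = work / (p * f * 1e9)        # seconds of on-core work at f GHz
    comm = comm_per_proc * (p ** 0.5)     # assumed sqrt(p) communication growth
    return compute + comm

def energy(p, f, p_static=40.0, c_dyn=0.6):
    """Estimated system energy: per-processor static power plus a cubic dynamic term."""
    power_per_proc = p_static + c_dyn * f ** 3   # watts (assumed)
    return p * power_per_proc * runtime(p, f)

def energy_efficiency(p, f, work=1e12):
    """Useful operations per joule."""
    return work / energy(p, f)

if __name__ == "__main__":
    # Scan processor counts and frequencies to see where efficiency holds up.
    for p in (64, 256, 1024):
        for f in (1.5, 2.0, 2.5):          # GHz
            print(f"p={p:5d} f={f:.1f} GHz  EE={energy_efficiency(p, f):.3e} ops/J")
```

    The cubic frequency term and the sqrt(p) communication growth are placeholders; a real analysis would substitute measured, application-specific parameters.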

    Intelligent support of the operator of an onboard intelligent system using simulation modeling

    The task of providing intelligent support to the operator of an onboard intelligent system is discussed. Particular attention is given to building the system using simulation-modeling methods. The system operates in real time on a multiprocessor computing complex.

    Understanding communication patterns in HPCG

    Conjugate Gradient (CG) algorithms form a large part of many HPC applications; examples include bioinformatics and weather applications. These algorithms provide numerical solutions to complex linear systems. Understanding how distributed implementations of these algorithms use a network interconnect will allow system designers to gain deeper insight into their exacting requirements for existing and future applications. This short paper documents our initial investigation into the communication patterns present in the High Performance Conjugate Gradient (HPCG) benchmark. Through our analysis, we identify patterns and features which may warrant further investigation to improve the performance of CG algorithms and the applications which make extensive use of them. In this paper, we capture communication traces from runs of the HPCG benchmark at a variety of processor counts and then examine this data to identify potential performance bottlenecks. Initial results show that network throughput falls as more processes communicate with one another, owing to network contention.
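
    The paper's tracing toolchain is not described here, so the following is only a minimal sketch of the kind of post-processing one might apply to a communication trace. It assumes a hypothetical CSV dump (hpcg_trace.csv with columns rank, peer, bytes, seconds) and computes effective per-rank throughput, which is where contention-related drops would show up.

```python
# Minimal sketch (not the authors' tooling): aggregate a hypothetical MPI trace,
# dumped as CSV with columns rank, peer, bytes, seconds, to estimate effective
# per-rank throughput. The file name and column layout are assumptions.
import csv
from collections import defaultdict

def throughput_by_rank(trace_path="hpcg_trace.csv"):
    sent = defaultdict(int)      # bytes sent per rank
    busy = defaultdict(float)    # time spent communicating per rank
    with open(trace_path, newline="") as fh:
        for row in csv.DictReader(fh):
            rank = int(row["rank"])
            sent[rank] += int(row["bytes"])
            busy[rank] += float(row["seconds"])
    # Effective throughput (MB/s) per rank; network contention appears as a drop here.
    return {r: sent[r] / busy[r] / 1e6 for r in sent if busy[r] > 0}

if __name__ == "__main__":
    for rank, mbps in sorted(throughput_by_rank().items()):
        print(f"rank {rank:4d}: {mbps:8.1f} MB/s")
```

    Comparing these per-rank figures across runs at different process counts would surface the throughput fall-off the abstract reports.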

    Exploitation of Dynamic Communication Patterns through Static Analysis

    Abstract not provided.

    In-depth Analysis On Parallel Processing Patterns for High-Performance Dataframes

    The Data Science domain has expanded monumentally in both research and industry communities during the past decade, predominantly owing to the Big Data revolution. Artificial Intelligence (AI) and Machine Learning (ML) are bringing more complexities to data engineering applications, which are now integrated into data processing pipelines to process terabytes of data. Typically, a significant amount of time is spent on data preprocessing in these pipelines, and hence improving its efficiency directly impacts overall pipeline performance. The community has recently embraced the concept of Dataframes as the de facto data structure for data representation and manipulation. However, the most widely used serial Dataframes today (R, pandas) experience performance limitations while working on even moderately large data sets. We believe that there is plenty of room for improvement by looking at this problem from a high-performance computing point of view. In a prior publication, we presented a set of parallel processing patterns for distributed dataframe operators and the reference runtime implementation, Cylon [1]. In this paper, we expand on the initial concept by introducing a cost model for evaluating these patterns. Furthermore, we evaluate the performance of Cylon on the ORNL Summit supercomputer.
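
    Cylon's published cost model is not reproduced in the abstract, so the sketch below is only an illustrative toy: it estimates the cost of a shuffle-based distributed dataframe operator as a local-compute term plus an all-to-all communication term, with all constants assumed for demonstration.

```python
# Illustrative sketch only (not Cylon's published cost model): a toy cost estimate
# for a shuffle-based distributed dataframe operator, split into local compute and
# all-to-all communication terms. All constants and functional forms are assumptions.

def shuffle_cost(rows, row_bytes, procs,
                 compute_rate=5e7,     # rows processed per second per worker (assumed)
                 bandwidth=1.25e9,     # bytes per second per link (assumed)
                 latency=5e-6):        # per-message latency in seconds (assumed)
    local_rows = rows / procs
    compute = local_rows / compute_rate
    # Each worker exchanges roughly (procs - 1)/procs of its partition with peers.
    comm_bytes = local_rows * row_bytes * (procs - 1) / procs
    communication = latency * (procs - 1) + comm_bytes / bandwidth
    return compute + communication

if __name__ == "__main__":
    # Compare the estimated cost of a shuffle-style operator at several worker counts.
    for p in (1, 8, 64, 512):
        t = shuffle_cost(rows=1e9, row_bytes=64, procs=p)
        print(f"{p:4d} workers: ~{t:.2f} s")
```

    In practice such a model would be calibrated against measured per-operator throughput and network parameters on the target system rather than the placeholder values used here.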