Iso-energy-efficiency: An approach to power-constrained parallel computation

Author: Ge Rong
Kirk Cameron
Song Shuaiwen
Su Chunyi
Vishnu Abhinav
Publication date: 01/01/2010

Future large-scale, high-performance supercomputer systems require high energy efficiency to achieve exaflops computational power and beyond. Despite the need to understand energy efficiency in high-performance systems, there are few techniques to evaluate energy efficiency at scale. In this paper, we propose a system-level iso-energy-efficiency model to analyze, evaluate and predict the energy-performance of data-intensive parallel applications with various execution patterns running on large-scale power-aware clusters. Our analytical model can help users explore the effects of machine- and application-dependent characteristics on system energy efficiency and isolate efficient ways to scale system parameters (e.g. processor count, CPU power/frequency, workload size and network bandwidth) to balance energy use and performance. We derive our iso-energy-efficiency model and apply it to the NAS Parallel Benchmarks on two power-aware clusters. Our results indicate that the model predicts total system energy consumption to within 5% error on average for parallel applications with various execution and communication patterns. We demonstrate effective use of the model in various application contexts and in scalability decision-making.

Computer Science Technical Reports @Virginia Tech

A Benchmark for Image Retrieval using Distributed Systems over the Internet: BIRDS-I

Author: Beretta Giordano B.
Gunther Neil J.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 22/12/2000

The performance of content-based image retrieval (CBIR) algorithms is usually measured on an isolated workstation. In a real-world environment, the algorithms would constitute only a minor component among many interacting components. The Internet dramatically changes many of the usual assumptions about measuring CBIR performance. Any CBIR benchmark should be designed from a networked systems standpoint. These benchmarks typically introduce communication overhead because the real systems they model are distributed applications. We present our implementation of a client/server benchmark called BIRDS-I to measure image retrieval performance over the Internet. It has been designed with the trend toward the use of small personalized wireless systems in mind. Web-based CBIR implies the use of heterogeneous image sets, imposing certain constraints on how the images are organized and on the type of performance metrics applicable. BIRDS-I requires only controlled human intervention for the compilation of the image collection and none for the generation of ground truth in the measurement of retrieval accuracy. Benchmark image collections need to be evolved incrementally toward the storage of millions of images, and that scale-up can only be achieved through the use of computer-aided compilation. Finally, our scoring metric introduces a tightly optimized image-ranking window.
Comment: 24 pages, To appear in the Proc. SPIE Internet Imaging Conference 200

arXiv.org e-Print Archive

CiteSeerX

Crossref

FooPar: A Functional Object Oriented Parallel Framework in Scala

Author: A Grama
E Dekel
F Darema
GL Taboada
K Hwang
M Odersky
MJ Quinn
R Loogen
SJ Thompson
V Kumar
Publication date: 13/06/2013

We present FooPar, an extension for highly efficient parallel computing in the multi-paradigm programming language Scala. Scala offers concise and clean syntax and integrates functional programming features. Our framework FooPar combines these features with parallel computing techniques. FooPar has a modular design and supports easy access to different communication backends for distributed memory architectures as well as high-performance math libraries. In this article we use it to parallelize matrix-matrix multiplication and show its scalability by an isoefficiency analysis. In addition, results based on an empirical analysis on two supercomputers are given. We achieve close-to-optimal performance with respect to theoretical peak performance. Based on this result, we conclude that FooPar allows full access to Scala's design features without suffering from performance drops when compared to implementations purely based on C and MPI.
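
For readers unfamiliar with the isoefficiency analysis mentioned in this abstract, the following is a minimal Python sketch of the standard textbook quantities behind it: speedup, efficiency and total parallel overhead. It is not taken from the FooPar paper; the function names and the sample timings are hypothetical and chosen only for illustration.

```python
import math


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_1 / T_p."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Efficiency E = S / p = T_1 / (p * T_p)."""
    return speedup(t_serial, t_parallel) / p


def total_overhead(t_serial: float, t_parallel: float, p: int) -> float:
    """Total overhead T_o = p * T_p - T_1 (time not spent on useful work)."""
    return p * t_parallel - t_serial


def isoefficiency_constant(e: float) -> float:
    """K = E / (1 - E); holding E fixed requires T_1 = K * T_o."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return e / (1.0 - e)


if __name__ == "__main__":
    # Hypothetical timings in seconds: serial run, then a run on 4 processes.
    t1, tp, p = 100.0, 30.0, 4
    e = efficiency(t1, tp, p)
    k = isoefficiency_constant(e)
    t_o = total_overhead(t1, tp, p)
    print(f"E = {e:.3f}, T_o = {t_o:.1f}, K * T_o = {k * t_o:.1f}")
    # The isoefficiency relation recovers the serial work from K and T_o.
    assert math.isclose(k * t_o, t1)
```

The relation T_1 = K * T_o follows directly from E = T_1 / (T_1 + T_o). An isoefficiency analysis asks how fast the problem size must grow with the process count p so that this relation keeps holding at a fixed E.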

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Southern Denmark Research Output

Scalability study of virtual parallel processing systems

Author: Rocha Fábio Soares
Publication venue: [s.n.]
Publication date: 16/08/2018

Advisor: Marco Aurélio Amaral Henriques
Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
Abstract: The use of virtual parallel processing systems has increased in several application areas, from strictly mathematical ones to medical and biological ones. Because of this growth, mechanisms for consistent performance evaluation of such systems are increasingly necessary. A more precise understanding of the concept of scalability, one of the main performance indicators, is also needed, both in what it means and in how to measure it. This work presents a comparative study of scalability metrics for homogeneous and heterogeneous parallel processing systems, in which two homogeneous metrics were selected to evaluate the upper limit of scalability of two virtual parallel processing platforms (JoiN and JPVM). Based on this study, we propose a scalability metric built on speedup, a well-known concept in parallel processing. Validation tests of the proposed metric highlight its practical character and its suitability for heterogeneous virtual parallel processing systems.
Master's degree; Computer Engineering; Master in Electrical Engineering

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

ACACES 2011 poster abstracts: July 13, 2011: Fiuggi, Italy

Author: De Bosschere Koen
Publication venue: Academia Press
Publication date: 01/01/2011

Ghent University Academic Bibliography

Performance evaluation of multi-core multi-cluster architecture

Author: Hamid Norhazlina
Walters Robert John
Wills Gary Brian
Publication date: 03/04/2014

A multi-core cluster is a cluster composed of a number of nodes, where each node has a number of processors, each with more than one core on a single chip. Cluster nodes are connected via an interconnection network. Multi-core processors are able to achieve higher performance without driving up power consumption and heat, which is the main concern in a single-core processor. A general problem in the network arises from the fact that multiple messages can be in transit at the same time on the same network links. This paper investigates the communication latencies of a multi-core multi-cluster architecture using simulation experiments and measurements under various working conditions.

Southampton (e-Prints Soton)