1,489 research outputs found

Overlay networks for smart grids

Author: De Turck Filip
Develder Chris
Wauters Tim
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems

Author: Bergman K
Chandramowlishwaran A
Hamada T
Lorena A Barba
Rahimian A
Rio Yokota
Warren M
Yokota R
Publication venue: 'SAGE Publications'
Publication date: 16/10/2011

Among the algorithms that are likely to play a major role in future exascale computing, the fast multipole method (FMM) appears as a rising star. Our recent work showed scaling of an FMM on GPU clusters, with problem sizes on the order of billions of unknowns. That work led to an extremely parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This paper reports on a campaign of performance tuning and scalability studies using multi-core CPUs on the Kraken supercomputer. All kernels in the FMM were parallelized using OpenMP, and a test using 10^7 particles randomly distributed in a cube showed 78% efficiency on 8 threads. Tuning of the particle-to-particle kernel using SIMD instructions resulted in a 4x speed-up of the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel scalability was studied in both strong and weak scaling. The strong scaling test used 10^8 particles and resulted in 93% parallel efficiency on 2048 processes for the non-SIMD code and 54% for the SIMD-optimized code (which was still 2x faster). The weak scaling test used 10^6 particles per process, and resulted in 72% efficiency on 32,768 processes, with the largest calculation taking about 40 seconds to evaluate more than 32 billion unknowns. This work builds up evidence for our view that FMM is poised to play a leading role in exascale computing, and we end the paper with a discussion of the features that make it a particularly favorable algorithm for the emerging heterogeneous and massively parallel architectural landscape.
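
The scaling figures quoted in the abstract above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the efficiency formulas are the conventional textbook definitions, which the abstract itself does not spell out.

```python
def total_unknowns(particles_per_process: int, processes: int) -> int:
    """Problem size of a weak-scaling run (work per process is held fixed)."""
    return particles_per_process * processes


def strong_scaling_efficiency(t_base: float, p_base: int, t_p: float, p: int) -> float:
    """Assumed definition: (t_base * p_base) / (t_p * p) for a fixed total problem size."""
    return (t_base * p_base) / (t_p * p)


def weak_scaling_efficiency(t_base: float, t_p: float) -> float:
    """Assumed definition: t_base / t_p when the problem size grows with process count."""
    return t_base / t_p


# Figures taken from the abstract: 10^6 particles per process on 32,768 processes.
n = total_unknowns(10**6, 32_768)
print(n)  # 32768000000, i.e. "more than 32 billion unknowns"
```

Under the assumed weak-scaling definition, 72% efficiency with a run of about 40 seconds would correspond to a baseline of roughly 29 seconds, but the abstract does not report that baseline, so this is only an inference.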

arXiv.org e-Print Archive

Crossref

DECENTRALIZED RESOURCE ORCHESTRATION FOR HETEROGENEOUS GRIDS

Author: Lee Jaehwan
Publication date: 01/01/2012

Modern desktop machines now use multi-core CPUs to enable improved performance. However, achieving high performance on multi-core machines without optimized software support is still difficult even in a single machine, because contention for shared resources can make it hard to exploit multiple computing resources efficiently. Moreover, more diverse and heterogeneous hardware platforms (e.g., general-purpose GPUs and Cell processors) have emerged and begun to impact grid computing. Given that heterogeneity and diversity are now a major trend, grid computing must support these environmental changes. In this dissertation, I design and evaluate a decentralized resource management scheme to exploit multiple heterogeneous computing resources effectively. I suggest resource management algorithms that can efficiently utilize a diverse computational environment, including multiple symmetric computing entities and heterogeneous multi-computing entities, and achieve good load balancing and high total system throughput. Moreover, I propose expressive resource description techniques to accommodate more heterogeneous environments, allowing incoming jobs with complex requirements to be matched to available resources. First, I develop decentralized resource management frameworks and job scheduling schemes to exploit multi-core nodes in peer-to-peer grids. I present two new load-balancing schemes that explicitly account for resource sharing and contention across multiple cores within a single machine, and propose a simple performance prediction model that can represent a continuum of resource sharing among cores of a CPU. Second, I provide scalable resource discovery and load-balancing techniques to accommodate nodes with many types of computing elements, such as multi-core CPUs and GPUs, in a peer-to-peer grid architecture. My scheme takes into account diverse aspects of heterogeneous nodes to maximize overall system throughput and minimize messaging costs without sacrificing the failure resilience provided by an underlying peer-to-peer overlay network. Finally, I propose an expressive resource discovery method to support multi-attribute, range-based job constraints. The common approach of using simple attribute indexes does not suffice, as range-based constraints may be satisfied by more than a single value. I design a compact ID-based representation for resource characteristics, and integrate this representation into the decentralized resource discovery framework. Through extensive simulation experiments, I show that my schemes can match heterogeneous jobs to heterogeneous resources both effectively (good matches are found, load is balanced) and efficiently (the new functionality imposes little overhead).
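
To make "multi-attribute, range-based job constraints" concrete, here is a minimal sketch of what such a match could look like. It is a brute-force, centralized filter written purely for illustration; it is not the compact ID-based, decentralized scheme the dissertation describes, and every name in it (node names, attribute keys, functions) is hypothetical.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, float]                    # attribute name -> value offered by a node
Constraints = Dict[str, Tuple[float, float]]   # attribute name -> inclusive (low, high) range


def satisfies(resource: Resource, constraints: Constraints) -> bool:
    """True if every constrained attribute exists and lies within its range."""
    for attr, (low, high) in constraints.items():
        value = resource.get(attr)
        if value is None or not (low <= value <= high):
            return False
    return True


def matching_nodes(nodes: Dict[str, Resource], constraints: Constraints) -> List[str]:
    """Names of all nodes that satisfy the job's constraints, in sorted order."""
    return sorted(name for name, res in nodes.items() if satisfies(res, constraints))


# Hypothetical grid nodes and one job with range constraints on three attributes.
nodes = {
    "node-a": {"cpu_cores": 4, "memory_gb": 8, "gpus": 0},
    "node-b": {"cpu_cores": 8, "memory_gb": 32, "gpus": 1},
    "node-c": {"cpu_cores": 16, "memory_gb": 64, "gpus": 2},
}
job = {"cpu_cores": (8, float("inf")), "memory_gb": (16, 64), "gpus": (1, 1)}
print(matching_nodes(nodes, job))  # ['node-b']
```

The example also shows why a single-value attribute index falls short: the `cpu_cores` constraint is satisfied by both 8 and 16, so a lookup keyed on one exact value would miss valid candidates.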

Digital Repository at the University of Maryland

High performance subgraph mining in molecular compounds

Author: M.J. Zaki
O. Weislow
R. Finkel
T. Washio
Y. Chung
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load-balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.
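
The general idea of receiver-initiated load balancing is that an idle worker, not an overloaded one, starts the transfer. The sketch below is a single-threaded toy simulation of that idea and not the paper's algorithm: the donor-selection rule (ask the most loaded peer) and the split rule (take half) are assumptions chosen for simplicity, and all names are hypothetical.

```python
from collections import deque
from typing import Deque, List


def rebalance(queues: List[Deque[int]]) -> int:
    """One round: each idle worker requests half the tasks of the most loaded peer.

    Returns the number of tasks moved.
    """
    transfers = 0
    for receiver in queues:
        if receiver:
            continue  # only idle workers (receivers) initiate a transfer
        donor = max(queues, key=len)  # assumption: ask the most loaded peer
        if len(donor) < 2:
            continue  # nothing worth splitting
        for _ in range(len(donor) // 2):
            receiver.append(donor.pop())  # move tasks from the donor's tail
            transfers += 1
    return transfers


# Worker 0 starts with all the work; workers 1 and 2 are idle.
queues = [deque(range(8)), deque(), deque()]
moved = rebalance(queues)
print([len(q) for q in queues], moved)  # [2, 4, 2] 6
```

Because no worker needs to predict its future load, this style of balancing suits irregular search trees such as the one the abstract describes, where reliable workload estimates are unavailable.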

KOPS - The Institutional Repository of the University of Konstanz

Central Archive at the University of Reading

Crossref

Peer to Peer Information Retrieval: An Overview

Author: Hiemstra Djoerd
Tigelaar Almer S.
Trieschnigg Dolf
Publication venue: ACM
Publication date: 01/01/2012

Peer-to-peer technology is widely used for file sharing. In the past decade, a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real-world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom.

Radboud Repository

University of Twente Research Information

Coordinated Self-Adaptation in Large-Scale Peer-to-Peer Overlays

Author: Napper J.M.
Pierre G.E.O.
Sacha J.
Stratan C.
Publication venue: Amsterdam, the Netherlands
Publication date: 01/01/2010

Self-adaptive systems typically rely on a closed control loop which detects when the current behavior deviates too much from the optimal one, determines new optimal values for system parameters, and applies changes to the system configuration. In decentralized systems, implementing each of these steps is challenging, especially when nodes need to coordinate their local configurations. In this paper, we propose a decentralized method to automatically tune global system parameters in a coordinated manner. We use gossip-based protocols to continuously monitor system properties and to disseminate parameter updates. We show that this method, applied to a decentralized resource selection service, allows the system to quickly adapt to changes in workload types and node properties, and incurs only negligible communication overhead.
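
As an illustration of how gossip can monitor a global property without a coordinator, the sketch below simulates push-pull averaging, a standard gossip aggregation technique. The abstract does not say which gossip protocol the authors use, so this choice is an assumption, as are the function name and the sample values.

```python
import random
from typing import List, Sequence


def gossip_average(values: Sequence[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate push-pull averaging: in each round every node picks a random
    other node and both replace their estimates with the pair's mean."""
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    if n < 2:
        return state  # nothing to exchange with
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # shift so that the chosen peer is never node i itself
            mean = (state[i] + state[j]) / 2.0
            state[i] = state[j] = mean
    return state


# Hypothetical per-node load measurements; the true average is 20.
loads = [0.0, 10.0, 20.0, 30.0, 40.0]
estimates = gossip_average(loads, rounds=30)
print(estimates)
```

Each exchange preserves the sum of the two values involved, so the global mean never changes while the spread between nodes shrinks. Every node therefore ends up with a local estimate of the system-wide average, which it could compare against a target when deciding whether a parameter needs retuning.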

VU Research Portal