
1,123 research outputs found

Scheduling of Divisible Loads on Heterogeneous Distributed Systems

Author: Abhay Ghatpande
Hidenori Nakazato
Olivier Beaumont
Publication venue: 'IntechOpen'
Publication date: 01/01/2010

IntechOpen

Scheduling of Divisible Tasks in Heterogeneous Distributed Systems (Ishu bunsan shisutemu ni okeru kabun tasuku no sukejulingu)

Author: Ghatpande Abhay
Publication venue
Publication date: 01/01/2008

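System: new ; Report number: Kō No. 2691 ; Degree type: Doctorate (Global Information and Telecommunication Studies) ; Date conferred: 2008/7/30 ; Waseda University diploma number: Shin 486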

Waseda University Repository

Divisible load scheduling of image processing applications on the heterogeneous star and tree networks using a new genetic algorithm

Author: Bagherzadeh N
Nikbakht Aali S
Publication venue: eScholarship, University of California
Publication date: 25/05/2020

This paper addresses the divisible load scheduling of image processing applications on heterogeneous star and multi-level tree networks. In our platforms, processors and network links have different speeds. In addition, computation and communication overheads are considered. A new genetic algorithm for minimizing the processing time of low-level image applications using divisible load theory is introduced. The closed-form solution for the processing time, the image fractions that should be allocated to each processor, the optimum number of participating processors, and the optimal sequence for load distribution are derived. The new concept of an equivalent processor in a tree network is introduced, and the effect of different image and kernel sizes on processing time and speedup is investigated. Finally, to indicate the efficiency of our algorithm, several numerical experiments are presented.

Crossref

eScholarship - University of California

Agent-Based Load Balancing on Homogeneous Minigrids: Macroscopic Modeling and Characterization

Author: Jiming Liu
Xiaolong Jin
Yuanshi Wang
Publication venue
Publication date: 01/01/2004

In this paper, we present a macroscopic characterization of agent-based load balancing in homogeneous minigrid environments. The agent-based load balancing is regarded as agent distribution from a macroscopic point of view. We study two quantities on minigrids: the number and size of teams where agents (tasks) queue. In macroscopic modeling, the load balancing mechanism is characterized using differential equations. We show that the load balancing mechanism under study always converges to a steady state. Furthermore, we show that load balancing with different initial distributions gradually converges to the same steady state. We also prove that the steady state becomes an even distribution if and only if agents have complete knowledge about agent teams on minigrids. Utility gains and efficiency are introduced to measure the quality of load balancing. Through numerical simulations, we discuss the utility gains and efficiency of load balancing in different cases and provide a series of analyses. In order to maximize the utility gain and the efficiency, we theoretically discuss the optimization of agents' strategies. Finally, in order to validate our proposed agent-based load balancing mechanism, we develop a computing platform called Simulation System for Grid Task Distribution (SSGTD). Through experimentation, we note that our experimental results in general confirm our theoretical proofs and numerical simulation results on the proposed equation system. In addition, we find a very interesting phenomenon: our agent-based load balancing mechanism is topology-independent.

CiteSeerX

Recommended from our members

High performance latent dirichlet allocation for text mining

Author: Liu Zelong
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Latent Dirichlet Allocation (LDA), a total probability generative model, is a three-tier Bayesian model. LDA computes the latent topic structure of the data and obtains the significant information of documents. However, traditional LDA has several limitations in practical applications. LDA cannot be used directly in classification because it is an unsupervised learning model; it needs to be embedded into appropriate classification algorithms. As a generative model, LDA normally generates latent topics in categories to which the target documents do not belong, which introduces deviation in computation and reduces classification accuracy. The number of topics in LDA greatly influences the learning process of model parameters. Noise samples in the training data also affect the final text classification result, and the quality of LDA-based classifiers depends to a great extent on the quality of the training samples. Although parallel LDA algorithms have been proposed to deal with huge amounts of data, balancing computing loads in a computer cluster poses another challenge. This thesis presents a text classification method which combines the LDA model and the Support Vector Machine (SVM) classification algorithm for improved classification accuracy when reducing the dimension of datasets. Based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the algorithm automatically optimizes the number of topics to be selected, which reduces the number of iterations in computation. Furthermore, this thesis presents a noise data reduction scheme to process noise data. When the noise ratio is large in the training data set, the noise reduction scheme can always produce a high level of accuracy in classification.
Finally, the thesis parallelizes LDA using the MapReduce model, which is the de facto computing standard in supporting data-intensive applications. A genetic-algorithm-based load balancing algorithm is designed to balance the workloads among computers in a heterogeneous MapReduce cluster where the computers have a variety of computing resources in terms of CPU speed, memory space and hard disk space.

Brunel University Research Archive

The effect of start-up delays in scheduling divisible loads on bus networks: An alternate approach

Author: Mani V.
Omkar S.N.
Suresh S.
Publication venue: Published by Elsevier Ltd.
Publication date: 31/12/2003

In this paper, scheduling of divisible loads in a bus network is considered. The objective is to minimize the processing time by including the overhead component due to start-up time that could degrade the performance of the system, in addition to the inherent communication and computation delays. These overheads are considered to be constant additive factors to the communication and computation components. A closed-form expression for optimal processing time is derived. Using this closed-form expression, this paper analytically proves significant results regarding the optimal sequence of load distribution and optimal number of processors. Numerical examples are presented to illustrate the analysis.

Elsevier - Publisher Connector

Adaptive structured parallelism

Author: González Vélez Horacio
Publication venue: The University of Edinburgh
Publication date: 01/01/2008

Algorithmic skeletons abstract commonly-used patterns of parallel computation, communication, and interaction. Parallel programs are expressed by interweaving parameterised skeletons analogously to the way in which structured sequential programs are developed, using well-defined constructs. Skeletons provide top-down design composition and control inheritance throughout the program structure. Based on the algorithmic skeleton concept, structured parallelism provides a high-level parallel programming technique which allows the conceptual description of parallel programs whilst fostering platform independence and algorithm abstraction. By decoupling the algorithm specification from machine-dependent structural considerations, structured parallelism allows programmers to code programs regardless of how the computation and communications will be executed in the system platform. Meanwhile, large non-dedicated multiprocessing systems have long posed a challenge to known distributed systems programming techniques as a result of the inherent heterogeneity and dynamism of their resources. Scant research has been devoted to the use of structural information provided by skeletons in adaptively improving program performance, based on resource utilisation. This thesis presents a methodology to improve skeletal parallel programming in heterogeneous distributed systems by introducing adaptivity through resource awareness. As we hypothesise that a skeletal program should be able to adapt to the dynamic resource conditions over time using its structural forecasting information, we have developed ASPara: Adaptive Structured Parallelism. ASPara is a generic methodology to incorporate structural information at compilation into a parallel program, which will help it to adapt at execution.

Edinburgh Research Archive

Revisiting Matrix Product on Master-Worker Platforms

Author: Dongarra Jack
Laboratoire de l'informatique du parallélisme
Pineau Jean-François
Robert Yves
Shi Zhiao
Vivien Frédéric
Publication venue
Publication date: 01/01/2006

This paper is aimed at designing efficient parallel matrix-product algorithms for heterogeneous master-worker platforms. While matrix-product is well-understood for homogeneous 2D-arrays of processors (e.g., Cannon algorithm and ScaLAPACK outer product algorithm), there are three key hypotheses that render our work original and innovative:
- Centralized data. We assume that all matrix files originate from, and must be returned to, the master.
- Heterogeneous star-shaped platforms. We target fully heterogeneous platforms, where computational resources have different computing powers.
- Limited memory. Because we investigate the parallelization of large problems, we cannot assume that full matrix panels can be stored in the worker memories and re-used for subsequent updates (as in ScaLAPACK).
We have devised efficient algorithms for resource selection (deciding which workers to enroll) and communication ordering (both for input and result messages), and we report a set of numerical experiments on various platforms at Ecole Normale Superieure de Lyon and the University of Tennessee. However, we point out that in this first version of the report, experiments are limited to homogeneous platforms.

arXiv.org e-Print Archive

HAL-ENS-LYON

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Libre Acces aux Rapports Scientifiques et Techniques

The University of Manchester - Institutional Repository

Hal-Diderot

Parallel and Distributed Computing

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021

The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. In particular, the topics addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peer-to-peer networks, large-scale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing.

Directory of Open Access Books (DOAB)

Extending OmpSs for OpenCL kernel co-execution in heterogeneous systems

Author: Ayguadé Parra Eduard
Beivide Palacio Ramon
Bosque Jose L.
Martorell Bofill Xavier
Mateo Sergi
Pérez Borja
Stafford Esteban
Teruel Xavier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Heterogeneous systems have very high potential performance but are difficult to program. OmpSs is a well-known framework for task-based parallel applications and an interesting tool for simplifying the programming of these systems. However, it does not support the co-execution of a single OpenCL kernel instance on several compute devices. To overcome this limitation, this paper presents an extension of the OmpSs framework that addresses two main objectives: the automatic division of datasets among several devices and the management of their memory address spaces. To adapt to different kinds of applications, the data division can be performed by the novel HGuided load balancing algorithm or by the well-known Static and Dynamic algorithms. All this is accomplished with negligible impact on the programming. Experimental results reveal that there is always one load balancing algorithm that improves the performance and energy consumption of the system.
This work has been supported by the University of Cantabria with grant CVE-2014-18166; the Generalitat de Catalunya under grant 2014-SGR-1051; the Spanish Ministry of Economy, Industry and Competitiveness under contracts TIN2016-76635-C2-2-R (AEI/FEDER, UE) and TIN2015-65316-P; the Spanish Government through the Programa Severo Ochoa (SEV-2015-0493); the European Research Council under grant agreement No 321253, the European Community's Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc Projects (grant agreements No 288777, 610402 and 671697); and the European HiPEAC Network.
Peer Reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC