
    The work/exchange model: a generalized approach to dynamic load balancing

    A crucial concern in software development is reducing program execution time. Parallel processing is often used to meet this goal. However, parallel processing efforts can lead to many pitfalls and problems. One such problem is distributing the workload among processors so that minimum execution time is obtained. The common approach is to use a load balancer to place equal or nearly equal quantities of workload on each processor. Unfortunately, this approach relies on a naive definition of load imbalance and often fails to achieve the desired goal. A more sophisticated definition should account for the effects of additional factors, including communication delay costs, network contention, and architectural issues. Consideration of these factors led us to the realization that optimal load distribution does not always result from equal load distribution. In this dissertation, we tackle the difficult problem of defining load imbalance. This is accomplished through the development of a parallel program model called the Generalized Work/Exchange Model. Associated with the model are equations, for a restricted set of deterministically balanced programs, that characterize idle time, elapsed time, and potential speedup. With the aid of the model, several common myths about load imbalance are exposed. A useful application called a load balancer enhancer is also presented, which is applicable to the more general class of quasi-static, load-unbalanced programs.
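
    The claim that equal distribution need not be optimal is easy to illustrate numerically. The following toy sketch is not the Generalized Work/Exchange Model itself; all rates and costs are invented. It shows an unequal split beating the equal one once a per-unit communication delay is charged to one of two processors.

        # Toy sketch (invented parameters): two processors, one of which pays
        # a communication delay proportional to the work shipped to it.
        WORK = 100                   # total work units
        RATE_A, RATE_B = 1.0, 1.0    # processing time per unit of work
        COMM_B = 0.5                 # extra delay per unit shipped to processor B

        def elapsed(split):
            """Elapsed time is set by whichever processor finishes last."""
            t_a = split * RATE_A
            t_b = (WORK - split) * (RATE_B + COMM_B)
            return max(t_a, t_b)

        best = min(range(WORK + 1), key=elapsed)
        print("equal split:", elapsed(WORK // 2))   # -> 75.0
        print("best split :", best, elapsed(best))  # -> 60 60.0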

    Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems

    Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, and overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight of lower level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem that precludes efficient online, iterative data exploration and causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies which autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning, showing improved performance and reduced resource utilization that enable a more productive semi-autonomous exploration workflow. We focus on the specific genres of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.

    Communication Patterns for Randomized Algorithms

    Examples of large scale networks include the Internet, peer-to-peer networks, parallel computing systems, cloud computing systems, sensor networks, and social networks. Efficient dissemination of information in large networks such as these is a fundamental problem. In many scenarios the gathering of information by a centralised controller can be impractical. When designing and analysing distributed algorithms we must consider the limitations imposed by the heterogeneity of devices in the networks. Devices may have limited computational ability or space. This makes randomised algorithms attractive solutions. Randomised algorithms can often be simpler and easier to implement than their deterministic counterparts. This thesis analyses the effect of communication patterns on the performance of distributed randomised algorithms. We study randomised algorithms with application to three different areas. Firstly, we study a generalization of the balls-into-bins game. Balls-into-bins games have been used to analyse randomised load balancing. Under the Greedy[d] allocation scheme each ball queries the load of d random bins and is then allocated to the least loaded of them. We consider an infinite, parallel setting where an expected λn balls are allocated in parallel according to the Greedy[d] allocation scheme into n bins, and subsequently each non-empty bin removes a ball. Our results show that for d = 1, 2, the Greedy[d] allocation scheme is self-stabilizing and that, in any round, the maximum system load for high arrival rates is exponentially smaller for d = 2 than for d = 1 (w.h.p.). Secondly, we introduce protocols that solve the plurality consensus problem on arbitrary graphs for arbitrarily small bias. Typically, protocols depend heavily on the employed communication mechanism. Our protocols are based on an interesting relationship between plurality consensus and distributed load balancing. This relationship allows us to design protocols that are both time and space efficient and that generalize the state of the art for a large range of problem parameters. Finally, we investigate the effect of restricting the communication of the classical PULL algorithm for randomised rumour spreading. Rumour spreading (broadcast) is a fundamental task in distributed computing. Under the classical PULL algorithm, a node with the rumour that receives multiple requests is able to respond to all of them in a given round. Our model restricts nodes so that they can respond to at most one request per round. Our results show that the restricted PULL algorithm is optimal for several graph classes such as complete graphs, expanders, random graphs, and several Cayley graphs.
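
    A minimal simulation sketch may make the parallel Greedy[d] process concrete. This is not the thesis's analysis, only the process it studies: the parameters (n, λ, round count) are illustrative, and arrivals are made deterministic rather than random for brevity.

        import random

        def simulate(n=1000, lam=0.9, d=2, rounds=2000, seed=1):
            rng = random.Random(seed)
            load = [0] * n
            arrivals = int(lam * n)  # deterministic stand-in for ~lam*n expected balls
            for _ in range(rounds):
                # each ball probes d random bins and joins the least loaded one
                for _ in range(arrivals):
                    probes = [rng.randrange(n) for _ in range(d)]
                    load[min(probes, key=load.__getitem__)] += 1
                # each non-empty bin removes one ball at the end of the round
                load = [x - 1 if x else x for x in load]
            return max(load)

        print("max load, d=1:", simulate(d=1))
        print("max load, d=2:", simulate(d=2))  # markedly smaller than d=1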

    ProperCAD II: A Run-Time Library for Portable, Parallel, Object-Oriented Programming with Applications to VLSI CAD

    Coordinated Science Laboratory was formerly known as Control Systems Laboratory. Semiconductor Research Corporation / grant 93-DP-10.

    Designing peer-to-peer overlays: a small-world perspective

    The Small-World phenomenon, well known under the phrase "six degrees of separation", has long been under the spotlight of investigation. The fact that our social network is closely knit and that any two people are linked by a short chain of acquaintances was confirmed by the experimental psychologist Stanley Milgram in the sixties. However, it was only after the seminal work of Jon Kleinberg in 2000 that it was understood not only why such networks exist, but also why it is possible to navigate them efficiently. This proved to be a highly relevant discovery for peer-to-peer systems, since they share many fundamental similarities with social networks; in particular, peer-to-peer routing relies solely on local decisions, without the possibility of invoking global knowledge. In this thesis we show how peer-to-peer system designs inspired by Small-World principles can address and solve many important problems, such as balancing the peer load, reducing high maintenance cost, and efficiently disseminating data in large-scale systems. We present three peer-to-peer approaches, namely Oscar, Gravity, and Fuzzynet, whose concepts stem from the design of navigable Small-World networks. Firstly, we introduce a novel theoretical model for building peer-to-peer systems which supports skewed node distributions and still preserves all desired properties of Kleinberg's Small-World networks. With such a model we set a reference base for the design of data-oriented peer-to-peer systems which are characterized by non-uniform distribution of keys as well as skewed query or access patterns. Based on this theoretical model we introduce Oscar, an overlay which uses a novel scalable network sampling technique for network construction, for which we provide a rigorous theoretical analysis. Simulations of our system validate the developed theory and evaluate Oscar's performance under typical conditions encountered in real-life large-scale networked systems, including participant heterogeneity, faults, and skewed, dynamic load distributions. Furthermore, we show how, by utilizing Small-World properties, it is possible to reduce the maintenance cost of most structured overlays by discarding a core network connectivity element: the ring invariant. We argue that reliance on the ring structure is a serious impediment to real-life deployment and scalability of structured overlays. We propose an overlay called Fuzzynet which does not rely on the ring invariant, yet has all the functionalities of structured overlays. Fuzzynet takes the idea of lazy overlay maintenance further by eliminating the need for any explicit connectivity and data maintenance operations, relying merely on the actions performed when new Fuzzynet peers join the network. We show that with a sufficient number of neighbors, even under high churn, data can be retrieved in Fuzzynet with high probability. Finally, we show how peer-to-peer systems based on the Small-World design and capable of supporting non-uniform key distributions can be successfully employed for large-scale data dissemination tasks. We introduce Gravity, a publish/subscribe system capable of building efficient dissemination structures while inducing only minimal dissemination relay overhead. This is achieved through Gravity's ability to permit non-uniform peer key distributions, which allows subscribers to be clustered close to each other in the key space, where data dissemination is cheap. An extensive experimental study confirms the effectiveness of our system under realistic subscription patterns and shows that Gravity surpasses existing approaches in efficiency by a large margin. With the peer-to-peer systems presented in this thesis we fill an important gap in the family of structured overlays, bringing to life practical systems which can play a crucial role in enabling data-oriented applications distributed over wide-area networks.
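
    The navigability result the abstract builds on, Kleinberg's 2000 construction, can be sketched in miniature. The toy below (all parameters invented; Oscar, Gravity, and Fuzzynet build on the principle, not this code) puts nodes on a ring, gives each its two ring neighbours plus one long-range contact drawn with probability proportional to 1/distance, and routes greedily toward the target.

        import random

        def ring_dist(a, b, n):
            return min((a - b) % n, (b - a) % n)

        def build_links(n, rng):
            """Two ring neighbours per node, plus one long-range contact drawn
            from Kleinberg's harmonic (1/distance) distribution on the ring."""
            weights = [1.0 / d for d in range(1, n)]
            total = sum(weights)
            links = []
            for u in range(n):
                r, acc, dist = rng.random() * total, 0.0, 1
                for i, w in enumerate(weights, start=1):
                    acc += w
                    if acc >= r:
                        dist = i
                        break
                far = (u + rng.choice((-dist, dist))) % n
                links.append({(u - 1) % n, (u + 1) % n, far})
            return links

        def greedy_route(src, dst, links, n):
            """Forward to whichever neighbour is closest to the target."""
            hops, cur = 0, src
            while cur != dst:
                cur = min(links[cur], key=lambda v: ring_dist(v, dst, n))
                hops += 1
            return hops

        rng = random.Random(0)
        n = 1024
        links = build_links(n, rng)
        trials = [greedy_route(rng.randrange(n), rng.randrange(n), links, n)
                  for _ in range(200)]
        print("average greedy hops:", sum(trials) / len(trials))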

    Parallelization of Goal-Driven Production Systems on Hypercube Machines in a C Environment

    Production systems are widely used in artificial intelligence to capture the notion of expertise in modeling expert systems. Production systems are computationally intensive programs that spend most of their execution time in the MATCH, or recognize, phase. This dissertation seeks to minimize a production system's execution time by optimizing the MATCH phase. Goal-oriented deterministic production systems are commonly used for robotics applications and formed the main class of production systems studied in this dissertation. The main motivation for the research was to provide a better MATCH algorithm and to exploit the multiprocessing capabilities of existing parallel computer hardware. The dissertation realizes these goals by transforming a traditional production system's scalar equivalence operations into a C arithmetic hashing function that generates an indexing variable for the switch-case construct of the C language. Partitioning the working memory into homogeneous blocks and distributing production memory over the multiprocessors enhanced the MIMD operation of the production system. A scheme is formulated and implemented to identify a few key condition elements that may be used as an indexing variable and to reduce the number of condition elements used in the MATCH phase. The complete translation from OPS5 code to C and the implementation scheme are presented in this dissertation. Various issues regarding the distribution of the inference engine over the multiprocessor environment, along with related synchronization topics for distributed systems, are also covered. A detailed description of the parallel computer's simulator is provided as well. The dissertation identifies other research topics and problems related to the parallelization of production systems, the most significant being the ability to incorporate LEARNING in production systems by using one or all of the idle processors that are waiting for the active processor to complete its activities.
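
    The core MATCH optimization, replacing a chain of scalar equality tests with an arithmetic hash that indexes a switch-case, can be sketched in miniature. The snippet below is a hypothetical Python stand-in for the dissertation's OPS5-to-C translation; the attribute codes and rule names are invented for illustration.

        # Toy illustration (not the dissertation's OPS5-to-C scheme): combine
        # key condition-element attributes into one arithmetic index, then use
        # a dispatch table in place of the C switch-case construct.
        GOAL, COLOR = 0, 1                    # hypothetical key attributes

        def match_index(wme):
            """Arithmetic hash over the key attributes of a working-memory element."""
            return wme[GOAL] * 3 + wme[COLOR]

        RULES = {                              # plays the role of switch-case labels
            0 * 3 + 1: "rule_pickup_red",
            1 * 3 + 2: "rule_stack_blue",
        }

        wme = (1, 2)                           # a working-memory element
        fired = RULES.get(match_index(wme))    # O(1) lookup, no per-rule equality tests
        print(fired)                           # -> rule_stack_blue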

    Spatial reaction systems on parallel supercomputers


    Computing resources sensitive parallelization of neural networks for large scale diabetes data modelling, diagnosis and prediction

    Diabetes has become one of the most severe diseases due to an increasing number of diabetes patients globally. A large amount of digital data on diabetes has been collected through various channels. How to utilize these data sets to help doctors make decisions on the diagnosis, treatment and prediction of diabetic patients poses many challenges to the research community. The thesis investigates mathematical models, with a focus on neural networks, for large scale diabetes data modelling and analysis by utilizing modern computing technologies such as grid computing and cloud computing. These computing technologies provide users with an inexpensive way to access extensive computing resources over the Internet for solving data and computationally intensive problems. The thesis evaluates the performance of seven representative machine learning techniques in the classification of diabetes data, and the results show that the neural network produces the best accuracy in classification but incurs a high overhead in data training. As a result, the thesis develops MRNN, a parallel neural network model based on the MapReduce programming model, which has become an enabling technology in support of data intensive applications in the clouds. By partitioning the diabetic data set into a number of equally sized data blocks, the workload in training is distributed among a number of computing nodes for speedup in data training. MRNN is first evaluated in small scale experimental environments using 12 mappers and subsequently evaluated in large scale simulated environments using up to 1000 mappers. Both the experimental and simulation results have shown the effectiveness of MRNN in classification and its high scalability in data training. MapReduce does not have a sophisticated job scheduling scheme for heterogeneous computing environments in which the computing nodes may have varied computing capabilities. For this purpose, the thesis develops a load balancing scheme based on genetic algorithms with the aim of balancing the training workload among heterogeneous computing nodes: nodes with more computing capacity receive more MapReduce jobs for execution. Divisible load theory is employed to guide the evolutionary process of the genetic algorithm with the aim of achieving fast convergence. The proposed load balancing scheme is evaluated in large scale simulated MapReduce environments with varied levels of heterogeneity using different sizes of data sets. All the results show that the genetic algorithm based load balancing scheme significantly reduces the makespan of job execution in comparison with the time consumed without load balancing.
    EThOS - Electronic Theses Online Service; EPSRC; China Market Association; GB; United Kingdom.
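
    The divisible-load intuition behind the genetic-algorithm balancer can be shown with a small numeric sketch: shares proportional to node speed equalize finish times, which is the target the GA converges toward. The speeds and block count below are invented; the thesis's GA handles the general heterogeneous case rather than this closed-form split.

        # Hedged sketch of the divisible-load target: assign equally sized
        # training blocks to heterogeneous nodes in proportion to their speed
        # so that all nodes finish together (invented parameters).
        speeds = [4.0, 2.0, 1.0]          # blocks processed per unit time, per node
        blocks = 700                      # equally sized training data blocks

        def makespan(assignment):
            """Time until the slowest node finishes its share."""
            return max(b / s for b, s in zip(assignment, speeds))

        equal = [blocks // len(speeds)] * len(speeds)
        proportional = [round(blocks * s / sum(speeds)) for s in speeds]

        print("equal split makespan       :", makespan(equal))         # -> 233.0
        print("proportional split makespan:", makespan(proportional))  # -> 100.0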