148 research outputs found
The cost of conservative synchronization in parallel discrete event simulations
The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approach the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor
Product-form solutions for integrated services packet networks and cloud computing systems
We iteratively derive the product-form solutions of stationary distributions
of priority multiclass queueing networks with multi-sever stations. The
networks are Markovian with exponential interarrival and service time
distributions. These solutions can be used to conduct performance analysis or
as comparison criteria for approximation and simulation studies of large scale
networks with multi-processor shared-memory switches and cloud computing
systems with parallel-server stations. Numerical comparisons with existing
Brownian approximating model are provided to indicate the effectiveness of our
algorithm.Comment: 26 pages, 3 figures, short conference version is reported at MICAI
200
Decentralized load balancing in heterogeneous computational grids
With the rapid development of high-speed wide-area networks and powerful yet low-cost computational resources, grid computing has emerged as an attractive computing paradigm. The space limitations of conventional distributed systems can thus be overcome, to fully exploit the resources of under-utilised computing resources in every region around the world for distributed jobs. Workload and resource management are key grid services at the service level of grid software infrastructure, where issues of load balancing represent a common concern for most grid infrastructure developers. Although these are established research areas in parallel and distributed computing, grid computing environments present a number of new challenges, including large-scale computing resources, heterogeneous computing power, the autonomy of organisations hosting the resources, uneven job-arrival pattern among grid sites, considerable job transfer costs, and considerable communication overhead involved in capturing the load information of sites. This dissertation focuses on designing solutions for load balancing in computational grids that can cater for the unique characteristics of grid computing environments. To explore the solution space, we conducted a survey for load balancing solutions, which enabled discussion and comparison of existing approaches, and the delimiting and exploration of the apportion of solution space. A system model was developed to study the load-balancing problems in computational grid environments. In particular, we developed three decentralised algorithms for job dispatching and load balancing—using only partial information: the desirability-aware load balancing algorithm (DA), the performance-driven desirability-aware load-balancing algorithm (P-DA), and the performance-driven region-based load-balancing algorithm (P-RB). All three are scalable, dynamic, decentralised and sender-initiated. We conducted extensive simulation studies to analyse the performance of our load-balancing algorithms. Simulation results showed that the algorithms significantly outperform preexisting decentralised algorithms that are relevant to this research
Distributed and Multiprocessor Scheduling
This chapter discusses CPU scheduling in parallel and distributed systems. CPU scheduling is part of a broader class of resource allocation problems, and is probably the most carefully studied such problem. The main motivation for multiprocessor scheduling is the desire for increased speed in the execution of a workload. Parts of the workload, called tasks, can be spread across several processors and thus be executed more quickly than on a single processor. In this chapter, we will examine techniques for providing this facility. The scheduling problem for multiprocessor systems can be generally stated as \How can we execute a set of tasks T on a set of processors P subject to some set of optimizing criteria C? The most common goal of scheduling is to minimize the expected runtime of a task set. Examples of other scheduling criteria include minimizing the cost, minimizing communication delay, giving priority to certain users\u27 processes, or needs for specialized hardware devices. The scheduling policy for a multiprocessor system usually embodies a mixture of several of these criteria. Section 2 outlines general issues in multiprocessor scheduling and gives background material, including issues specific to either parallel or distributed scheduling. Section 3 describes the best practices from prior work in the area, including a broad survey of existing scheduling algorithms and mechanisms. Section 4 outlines research issues and gives a summary. Section 5 lists the terms defined in this chapter, while sections 6 and 7 give references to important research publications in the area
Recommended from our members
Piece-wise scheduling of composite task graphs onto distributed memory parallel computers
Heuristics for static scheduling of task graphs using list scheduling techniques have continued to improve by adding real-world factors such as processor speed, network transmission speed, interconnection topology, and link contention considerations to the basic task graph model. Yet, the resulting schedules do not fully model program loops and branches, startup costs for both process creation and message initiation, and a number of interesting parallel processing patterns such as meshes, tress, and supervisor/workers. In fact, improvements in the schedule may be obtained when the task graph is regular as when it contains repeated or replicated tasks, divide-and-conquer patterns of communication, or a mesh-structured pattern of computation. In this paper we describe a limited approach to scheduling composite task graphs that considers process and message startup costs, and three regular patterns : replicated, tree, and mesh. The approach is to model programs with such regular patterns as a composite task graph, where each regular structure is a decomposable sub-task node in the task graph. Then, we compute an optimal schedule for each sub-task. graph, piece the sub-tasks together, and perform an ordinary static scheduling heuristic on the pieces, to produce an overall schedule. We define a composite task graph as a hierarchical task graph containing regular-structured sub-task graphs as components. At the top level of this hierarchy, each graph node represents either a simple task or a hierarchically decomposable sub-task graph. We propose a piece-wise scheduling algorithm that simply allocates processors to sub-task graphs according to closed-form expressions which give determine the optimal number of processors, and then uses a list scheduling algorithm to schedule the flattened graph onto these processors. We do not address the pressing problem of loops and branches in the task graph representation, but we speculate that the technique of piece-wise scheduling introduced here can be adapted to a hybrid form of scheduling that may accommodate branches and loops. Piece-wise scheduling is not guaranteed to yield the best global schedule. Rather, it pieces together locally optimum sub-schedules. Finding globally optimum schedules for composite task graphs remains an open problem. We present an heuristic approach that has been experimentally used to schedule small parallel programs with encouraging results. More empirical evidence is needed to determine the usefulness of this technique, but early indications are encouraging
Multi-processor task scheduling with maximum tardiness criteria.
by Wong Tin-Lam.Thesis (M.Phil.)--Chinese University of Hong Kong, 1998.Includes bibliographical references (leaves 70-73).Abstract --- p.iiAcknowledgement --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Scheduling Problems --- p.1Chapter 1.2 --- Literature Review --- p.4Chapter 1.2.1 --- Sized Multiprocessor Task Scheduling --- p.5Chapter 1.2.2 --- Fixed Multiprocessor Task Scheduling --- p.6Chapter 1.2.3 --- Set Multiprocessor Task Scheduling --- p.8Chapter 1.3 --- Organization of Thesis --- p.10Chapter 2 --- Overview --- p.11Chapter 2.1 --- Machine Environment --- p.11Chapter 2.2 --- The Jobs and Their Requirements --- p.12Chapter 2.3 --- Assumptions --- p.13Chapter 2.4 --- Constraints --- p.14Chapter 2.5 --- Objective --- p.15Chapter 2.6 --- An Illustrative Example --- p.17Chapter 2.7 --- NP-Hardness --- p.20Chapter 3 --- Methodology --- p.22Chapter 3.1 --- Dynamic Programming --- p.22Chapter 3.1.1 --- Problem Analysis --- p.24Chapter 3.2 --- Key Idea to solve the problem --- p.27Chapter 3.3 --- Algorithm --- p.28Chapter 3.3.1 --- Phase 1 --- p.28Chapter 3.3.2 --- Phase 2 --- p.37Chapter 4 --- Extensions --- p.46Chapter 4.1 --- "Polynomially Solvable Cases P2 --- p.46Chapter 4.1.1 --- Dynamic Programming --- p.47Chapter 4.2 --- "Set Problem P2/setj,prmp/TmaX" --- p.55Chapter 4.2.1 --- Processing times for set jobs --- p.56Chapter 4.2.2 --- Algorithm --- p.58Chapter 4.3 --- k´ؤMachine Problem with only two types of jobs --- p.64Chapter 5 --- Conclusion and Future Work --- p.67Chapter 5.1 --- Conclusion --- p.67Chapter 5.2 --- Some Future Work --- p.68Bibliography --- p.7
- …