Search CORE

278 research outputs found

Simulation techniques in an artificial society model

Author: Lawson Barry Glenn
Publication venue: W&M ScholarWorks
Publication date: 01/01/2002
Field of study

Artificial society refers to a generic class of agent-based simulation models used to discover global social structures and collective behavior produced by simple local rules and interaction mechanisms. Artificial society models are applicable in a variety of disciplines, including the modeling of chemical and biological processes, natural phenomena, and complex adaptive systems. We focus on the underlying simulation techniques used in artificial society discrete-event simulation models, including model time evolution and computational performance.;Although for some applications synchronous time evolution is the correct modeling approach, many other applications are better represented using asynchronous time evolution. We claim that asynchronous time evolution can eliminate potential simulation artifacts produced using synchronous time evolution. Using an adaptation of a popular artificial society model, we show that very different output can result based solely on the choice of asynchronous or synchronous time evolution. Based on the event list implementation chosen, the use of discrete-event simulation to incorporate asynchronous time evolution can incur a substantial loss in computational performance. Accordingly, we evaluate select event list implementations within the artificial society simulation model and demonstrate that acceptable performance can be achieved.;In addition to the artificial society model, we show that transforming from a synchronous to an asynchronous system proves beneficial for scheduling resources in a parallel system. We focus on non-FCFS job scheduling policies that permit jobs to backfill, i.e., to move ahead in the queue, given that they do not delay certain previously submitted jobs. Instead of using a single queue of jobs, we propose a simple yet effective backfilling scheduling policy that effectively separates short from long jobs by incorporating multiple queues. By monitoring system performance, our policy adapts its configuration parameters in response to severe changes in the job arrival pattern and/or resource demands. Detailed performance comparisons via simulation using actual parallel workload traces indicate that our proposed policy consistently outperforms traditional backfilling in a variety of contexts

College of William & Mary: W&M Publish

ANALYZING SUPERCOMPUTER UTILIZATION UNDER QUEUING WITH A PRIORITY FORMULA AND A STRICT BACKFILL POLICY

Author: Vanderlan Michael David
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/05/2011
Field of study

Supercomputers have become increasingly important in recent years due to the growing amount of data available and the increasing demand for quicker results in the scientific community. Since supercomputers carry a high cost to build and maintain, efficiency becomes more important to the owners, administrators, and users of these supercomputers. One important factor in determining the efficiency of a supercomputer is the scheduling of jobs that are submitted by users of the system. Previous work has dealt with optimizing the schedule on the system’s end while the users are blinded from the process. The work presented in this thesis investigates a scheduling system that is implemented at the Oak Ridge National Laboratory (ORNL) supercomputer Kraken with a backfilling policy and attempts to outline the optimal methods from the user’s point of view in the scheduling system, along with using a simulation approach to optimize the priority formula. Normally the user has no idea which scheduling algorithms are used, but the users at ORNL not only know how the scheduling works but they can also view the current activity of the system. This gives an advantage to the users who are willing to benefit from this knowledge by utilizing some elementary game theory to optimize their strategies. The results will show a benefit to both the users, since they will be able to process their jobs sooner, and the system, since it will better utilized with little expense to the administrators, through competition. Queuing models and simulation have been well studied in almost all relevant aspects of the modern world. Higher efficiency is the goal of many researchers in several different fields; the supercomputer queues are no different. Efficient use of the resources makes the system administrator pleased while benefiting the users with more timely results. Studying these queuing models through simulation should help all parties involved by increasing utilization. The simulation will be validated and the utilization improvement will be measured and reported. User defined formulas will be developed for future users to help maximize utilization and minimize wait times

University of Tennessee, Knoxville: Trace

Adaptive space-time sharing with SCOJO.

Author: Huang Xuemin
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2004
Field of study

Coscheduling is a technique used to improve the performance of parallel computer applications under time sharing, i.e., to provide better response times than standard time sharing or space sharing. Dynamic coscheduling and gang scheduling are two main forms of coscheduling. In SCOJO (Share-based Job Coscheduling), we have introduced our own original framework to employ loosely coordinated dynamic coscheduling and a dynamic directory service in support of scheduling cross-site jobs in grid scheduling. SCOJO guarantees effective CPU shares by taking coscheduling effects into consideration and supports both time and CPU share reservation for cross-site job. However, coscheduling leads to high memory pressure and still involves problems like fragmentation and context-switch overhead, especially when applying higher multiprogramming levels. As main part of this thesis, we employ gang scheduling as more directly suitable approach for combined space-time sharing and extend SCOJO for clusters to incorporate adaptive space sharing into gang scheduling. We focus on taking advantage of moldable and malleable characteristics of realistic job mixes to dynamically adapt to varying system workloads and flexibly reduce fragmentation. In addition, our adaptive scheduling approach applies standard job-scheduling techniques like a priority and aging system, backfilling or easy backfilling. We demonstrate by the results of a discrete-event simulation that this dynamic adaptive space-time sharing approach can deliver better response times and bounded relative response times even with a lower multiprogramming level than traditional gang scheduling.Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .H825. Source: Masters Abstracts International, Volume: 43-01, page: 0237. Adviser: A. Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004

Scholarship at UWindsor

Recommended from our members

Scheduling, Characterization and Prediction of HPC Workloads for Distributed Computing Environments

Author: Naghshnejad Mina
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

As High Performance Computing (HPC) has grown considerably and is expected to grow even more, effective resource management for distributed computing sys- tems is motivated more than ever. As the computational workloads grow in quantity, it is becoming more crucial to apply efficient resource management and workload scheduling to use resources efficiently while keeping the computational performance reasonably good. The problem of efficiently scheduling workloads on resources while meeting performance standards is hard. Additionally, non-clairvoyance of job dimen- sions makes resource management even harder in real-world scenarios. Our research methodology investigates the scheduling problem compliant for HPC and researches the challenges for deploying the scheduling in real world-scenarios using state of the art machine learning and data science techniques.To this end, this Ph.D. dissertation makes the following core contributions: a) We perform a theoretical analysis of space-sharing, non-preemptive scheduling: we studied this scheduling problem and proposed scheduling algorithms with polyno- mial computation time. We also proved constant upper-bounds for the performance of these algorithms. b) We studied the sensitivity of scheduling algorithms to the accuracy of runtime and devised a meta-learning approach to estimate prediction accuracy for newly submitted jobs to the HPC system. c) We studied the runtime prediction problem for HPC applications. For this purpose, we studied the distri- bution of available public workloads and proposed two different solutions that can predict multi-modal distributions: switching state-space models and Mixture Density Networks. d) We studied the effectiveness of recent recurrent neural network models for CPU usage trace prediction for individual VM traces as well as aggregate CPU usage traces. In this dissertation, we explore solutions to improve the performance of scheduling workloads on distributed systems.We begin by looking at the problem from the theoretical perspective. Modeling the problem mathematically, we first propose a scheduling algorithm that finds a constant approximation of the optimal solution for the problem in polynomial time. We prove that the performance of the algorithm (average completion time is the constant approximation of the performance of the optimal scheduling. We next look at the problem in real-world scenarios. Considering High-Performance Computing (HPC) workload computing environments as the most similar real-world equivalent of our mathematical model, we explore the problem of predicting application runtime. We propose an algorithm to handle the existing uncertainties in the real world and show-case our algorithm with demonstrative effectiveness in terms of response time and resource utilization. After looking at the uncertainty problem, we focus on trying to improve the accuracy of existing prediction approaches for HPC application runtime. We propose two solutions, one based on Kalman filters and one based on deep density mixture networks. We showcase the effectiveness of our prediction approaches by comparing with previous prediction approaches in terms of prediction accuracy and impact on improving scheduling performance. In the end, we focus on predicting resource usage for individual applications during their execution. We explore the application of recurrent neural networks for predicting resource usage of applications deployed on individual virtual machines. To validate our proposed models and solutions, we performed extensive trace-driven simulation and measured the effectiveness of our approaches

eScholarship - University of California

Backfilling with fairness and slack for parallel job scheduling

Author: Jin Wei
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2009
Field of study

Parallel jobs have different runtimes and numbers of threads/processes. Thus, scheduling parallel jobs involves a packing problem. If jobs are packed as tightly as possible, utilization will be improved. Otherwise, some resources have to stay idle. The common solution to deal with idle resources is backfilling, which schedule smaller jobs submitted later to execute earlier as long as they do not postpone the first job or all the previous jobs in the waiting queue. Traditionally, backfilling uses first fit for idle resources, according to the submission order. However, in this case, better packing of jobs could be missed. Hence, we propose an algorithm which looks further ahead if significantly improving utilization. However at the same time, this could be unfair to some jobs ahead in the queue. So we use a delay factor as a constraint to limit unfairness. We propose a branch and bound algorithm which selects jobs for backfilling which keep utilization high, while trying to stay close to First-Come-First-Served (FCFS). We evaluate relative response time and utilization and compare to other backfilling approaches. The selection of jobs for backfilling to optimize for high utilization and low delay is implemented as an extension of the existing Scojo-PECT preemptive scheduler

Scholarship at UWindsor

Explorations in Grid Workflow Scheduling

Author: Zheng Wei
Publication venue
Publication date: 01/08/2010
Field of study

The University of Manchester - Institutional Repository

Supercomputer Emulation For Evaluating Scheduling Algorithms

Author: Barberato Claudio
Publication venue
Publication date: 01/01/2017
Field of study

Scheduling algorithms have a significant impact on the optimal utilization of HPC facilities, yet the vast majority of the research in this area is done using simulations. In working with simulations, a great deal of factors that affect a real scheduler, such as its scheduling processing time, communication latencies and the scheduler intrinsic implementation complexity are not considered. As a result, despite theoretical improvements reported in several articles, practically no new algorithms proposed have been implemented in real schedulers, with HPC facilities still using the basic first-come-first-served (FCFS) with Backfill policy scheduling algorithm. A better approach could be, therefore, the use of real schedulers in an emulation environment to evaluate new algorithms. This thesis investigates two related challenges in emulations: computational cost and faithfulness of the results to real scheduling environments. It finds that the sampling, shrinking and shuffling of a trace must be done carefully to keep the classical metrics invariant or linear variant in relation to size and times of the original workload. This is accomplished by the careful control of the submission period and the consideration of drifts in the submission period and trace duration. This methodology can help researchers to better evaluate their scheduling algorithms and help HPC administrators to optimize the parameters of production schedulers. In order to assess the proposed methodology, we evaluated both the FCFS with Backfill and Suspend/Resume scheduling algorithms. The results strongly suggest that Suspend/Resume leads to a better utilization of a supercomputer when high priorities are given to big jobs

The Australian National University

Metascheduling of HPC Jobs in Day-Ahead Electricity Markets

Author: Murali Prakash
Vadhiyar Sathish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/12/2017
Field of study

High performance grid computing is a key enabler of large scale collaborative computational science. With the promise of exascale computing, high performance grid systems are expected to incur electricity bills that grow super-linearly over time. In order to achieve cost effectiveness in these systems, it is essential for the scheduling algorithms to exploit electricity price variations, both in space and time, that are prevalent in the dynamic electricity price markets. In this paper, we present a metascheduling algorithm to optimize the placement of jobs in a compute grid which consumes electricity from the day-ahead wholesale market. We formulate the scheduling problem as a Minimum Cost Maximum Flow problem and leverage queue waiting time and electricity price predictions to accurately estimate the cost of job execution at a system. Using trace based simulation with real and synthetic workload traces, and real electricity price data sets, we demonstrate our approach on two currently operational grids, XSEDE and NorduGrid. Our experimental setup collectively constitute more than 433K processors spread across 58 compute systems in 17 geographically distributed locations. Experiments show that our approach simultaneously optimizes the total electricity cost and the average response time of the grid, without being unfair to users of the local batch systems.Comment: Appears in IEEE Transactions on Parallel and Distributed System

arXiv.org e-Print Archive

A new priority rule cloud scheduling technique that utilizes gaps to increase the efficiency of jobs distribution

Author: Saydul Akbar Murad
Publication venue
Publication date: 01/07/2023
Field of study

In recent years, the concept of cloud computing has been gaining traction to provide dynamically increasing access to shared computing resources (software and hardware) via the internet. It’s no secret that cloud computing’s ability to supply mission-critical services has made job scheduling a hot subject in the industry right now. However, the efficient utilization of these cloud resources has been a challenge, often resulting in wastage or degraded service performance due to poor scheduling. To solve this issue, existing research has been focused on queue-based job scheduling techniques, where jobs are scheduled based on specific deadlines or job lengths. To overcome this challenge, numerous researchers have focused on improving existing Priority Rule (PR) cloud schedulers by developing dynamic scheduling algorithms, but they have fallen short of meeting user satisfaction, such as flowtime, makespan, and total tardiness. These are the limitations of the current implementation of existing Priority Rule (PR) schedulers, mainly caused by blocking made by jobs at the head of the queue. These limitations lead to the poor performance of cloud-based mobile applications and other cloud services. To address this issue, the main objective of this research is to improve the existing PR cloud schedulers by developing a new dynamic scheduling algorithm by manipulating the gaps in the cloud job schedule. In this thesis, first a Priority-Based Fair Scheduling (PBFS) algorithm has been introduced to schedule jobs so that jobs get access to the required resources at optimal times. Then, a backfilling strategy called Shortest Gap Priority-Based Fair Scheduling (SG-PBFS) is proposed that attempts to manipulate the gaps in the schedule of cloud jobs. Finally, the performance evaluation demonstrates that the proposed SG-PBFS algorithm outperforms SG-SJF, SG-LJF, SG-FCFS, SG-EDF, and SG-(MAX-MIN) in terms of flow time, makespan time, and total tardiness, which conclusively demonstrates its effectiveness. The experiment result shows that for 500 jobs, SG-PBFS flow time, makespan time, and tardiness time are 9%, 4%, and 7% less than PBFS gradually

UMP Institutional Repository

Group-Based Parallel Multi-scheduling Methods for Grid Computing

Author: Abraham Goodhead Tomvie
Publication venue
Publication date: 01/01/2016
Field of study

Coventry University Pure Portal