Search CORE

17 research outputs found

Multiple-queue backfilling scheduling with priorities and reservations for parallel systems

Author: Barry G. Lawson
Bode B.
Evgenia Smirni
Feitelson D.G.
Lawson B. G.
Perkovic D.
Talby D.
Publication venue: 'Association for Computing Machinery (ACM)'

Dynamic Routing Algorithms and Methods for Controlling Traffic Flows of Cloud Applications and Services

Publication venue: 'FSAEIHE South Ural State University (National Research University)'
Publication date: 08/06/2017

The use of cloud computing in modern business is growing steadily. It reduces the cost of owning and operating IT infrastructure; however, there are some issues related to the management of data processing centers. One of these issues is the effective use of companies’ computing and network resources. The goal of optimization is to manage the traffic in cloud applications and services within data centers. Given the multitier architecture of modern data centers, this task requires special attention. The advantage of modern infrastructure virtualization is the possibility of using software-defined networks and software-defined data storage. However, existing algorithmic optimization solutions do not take into account the specific features of network traffic formation with multiple application types. The task of optimizing traffic distribution for cloud applications and services can be solved by using the software-defined infrastructure of virtual data centers. We have developed a simulation model for the traffic in software-defined network segments of data centers involved in processing user requests to cloud applications and services within a network environment. Our model makes it possible to implement the traffic management algorithm of cloud applications and to optimize access to storage systems through the effective use of data transmission channels. In experimental studies, we found that our algorithm decreases the response time of cloud applications and services and, therefore, increases the throughput of user request processing and reduces the number of refusals

Вестник Южно-Уральского государственного университета

Self-Adaptive Scheduler Parameterization

Author: Lawson Barry
Smirni Evgenia
Publication venue: UR Scholarship Repository
Publication date: 01/01/2005

High-end parallel systems present a tremendous research challenge on how to best allocate their resources to match dynamic workload characteristics and user habits that are often unique to each system. Although thoroughly investigated, job scheduling for production systems remains an inexact science, requiring significant experience and intuition from system administrators to properly configure batch schedulers. State-of-the-art schedulers provide many parameters for their configuration, but tuning these to optimize performance and to appropriately respond to the continuously varying characteristics of the workloads can be very difficult — the effects of different parameters and their interactions are often unintuitive. In this paper, we introduce a new and general methodology for automating the difficult process of job scheduler parameterization. Our proposed methodology is based on online simulations of a model of the actual system to provide on-the-fly suggestions to the scheduler for automated parameter adjustment. Detailed performance comparisons via simulation using actual supercomputing traces from the Parallel Workloads Archive indicate that this self-adaptive parameterization via online simulation consistently outperforms other workload-aware methods for scheduler parameterization. This methodology is unique, flexible, and practical in that it requires no a priori knowledge of the workload, it works well even in the presence of poor user runtime estimates, and it can be used to address any system statistic of interest

CiteSeerX

University of Richmond

Adapting Batch Scheduling to Workload Characteristics: What can we expect From Online Learning?

Author: Legrand Arnaud
Trystram Denis
Zrigui Salah
Publication venue: HAL CCSD
Publication date: 15/10/2018

Despite the impressive growth and size of super-computers, the computational power they provide still cannot match the demand. Efficient and fair resource allocation is a critical task. Super-computers use Resource and Job Management Systems to schedule applications, which is generally done by relying on generic index policies such as First Come First Served and Shortest Processing Time First in combination with Backfilling strategies. Unfortunately, such generic policies often fail to exploit specific characteristics of real workloads. In this work, we focus on improving the performance of online schedulers. We study mixed policies, which are created by combining multiple job characteristics in a weighted linear expression, as opposed to classical pure policies which use only a single characteristic. This larger class of scheduling policies aims at providing more flexibility and adaptability. We use space coverage and black-box optimization techniques to explore this new space of mixed policies and we study how they can adapt to changes in the workload. We perform an extensive experimental campaign through which we show that (1) even the best pure policy is far from optimal and (2) a carefully tuned mixed policy would significantly improve the performance of the system. (3) We also provide empirical evidence that there is no one-size-fits-all policy, by showing that the rapid workload evolution seems to prevent classical online learning algorithms from being effective
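The distinction between pure and mixed policies in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the `Job` fields, the three chosen characteristics, and the weight values are assumptions made for illustration. A pure policy is the special case in which exactly one weight is non-zero.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative assumptions.
    name: str
    submit_time: float   # when the job entered the queue
    est_runtime: float   # user-provided processing time estimate
    cores: int           # requested number of cores

def mixed_score(job, weights):
    """Weighted linear expression over job characteristics (lower runs first)."""
    w_submit, w_runtime, w_cores = weights
    return (w_submit * job.submit_time
            + w_runtime * job.est_runtime
            + w_cores * job.cores)

def order_queue(jobs, weights):
    """Return the waiting queue sorted by the mixed score."""
    return sorted(jobs, key=lambda j: mixed_score(j, weights))

# Pure policies use a single characteristic.
FCFS = (1.0, 0.0, 0.0)   # First Come First Served
SPF = (0.0, 1.0, 0.0)    # Shortest Processing time First
# An arbitrary mixed policy; in the paper's setting such weights would be tuned.
MIXED = (1.0, 0.3, 0.0)

queue = [
    Job("a", submit_time=0, est_runtime=100, cores=4),
    Job("b", submit_time=10, est_runtime=5, cores=1),
    Job("c", submit_time=20, est_runtime=50, cores=2),
]

for label, w in (("FCFS", FCFS), ("SPF", SPF), ("MIXED", MIXED)):
    print(label, [j.name for j in order_queue(queue, w)])
```

With these sample jobs the three weight vectors produce three different orderings, which is the extra flexibility the abstract attributes to the larger policy space.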

INRIA a CCSD electronic archive server

LOMARC: Look ahead matchmaking for multi-resource coscheduling.

Author: Lan Lei
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2004

Hyper-Threading (HT) provides a new possibility for job coscheduling without context switches and without the cost of coordinating the processes of one parallel job. However, HT achieves high processor throughput at the expense of reducing the performance of the individual process. Since the hardware resources are actually shared between two coscheduled jobs, resource contention harms the performance of each job. Most scheduling approaches focus only on the CPU without considering the impact on other resources. In this thesis we present LOMARC, a space-time sharing approach that takes multiple resources, including CPU, I/O, memory and network, into consideration for job coscheduling on HT processors. To improve resource utilization and reduce job response times, LOMARC matches two jobs with complementary resource requirements to coschedule. Our approach partially reorders the waiting job queue by lookahead to increase the possibility of finding a good match. LOMARC also generalizes to standard CPUs, using an adjusted matching scheme and focusing only on hiding I/O latency. In addition, LOMARC incorporates standard scheduling approaches such as priority ordering, aging and backfilling. In our simulation experiments, we use a realistic workload model to provide convincing results. Our experimental results demonstrate that LOMARC delivers better performance than the standard space sharing approach and the other two job coscheduling approaches for HT processors. The performance gain is mainly due to an increased possibility of coscheduling two complementary jobs by looking ahead on the waiting queue. Source: Masters Abstracts International, Volume: 43-01, page: 0239. Adviser: Angela Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004

Scholarship at UWindsor

Simulation techniques in an artificial society model

Author: Lawson Barry Glenn
Publication venue: W&M ScholarWorks
Publication date: 01/01/2002

Artificial society refers to a generic class of agent-based simulation models used to discover global social structures and collective behavior produced by simple local rules and interaction mechanisms. Artificial society models are applicable in a variety of disciplines, including the modeling of chemical and biological processes, natural phenomena, and complex adaptive systems. We focus on the underlying simulation techniques used in artificial society discrete-event simulation models, including model time evolution and computational performance. Although for some applications synchronous time evolution is the correct modeling approach, many other applications are better represented using asynchronous time evolution. We claim that asynchronous time evolution can eliminate potential simulation artifacts produced using synchronous time evolution. Using an adaptation of a popular artificial society model, we show that very different output can result based solely on the choice of asynchronous or synchronous time evolution. Based on the event list implementation chosen, the use of discrete-event simulation to incorporate asynchronous time evolution can incur a substantial loss in computational performance. Accordingly, we evaluate select event list implementations within the artificial society simulation model and demonstrate that acceptable performance can be achieved. In addition to the artificial society model, we show that transforming from a synchronous to an asynchronous system proves beneficial for scheduling resources in a parallel system. We focus on non-FCFS job scheduling policies that permit jobs to backfill, i.e., to move ahead in the queue, given that they do not delay certain previously submitted jobs. Instead of using a single queue of jobs, we propose a simple yet effective backfilling scheduling policy that effectively separates short from long jobs by incorporating multiple queues. By monitoring system performance, our policy adapts its configuration parameters in response to severe changes in the job arrival pattern and/or resource demands. Detailed performance comparisons via simulation using actual parallel workload traces indicate that our proposed policy consistently outperforms traditional backfilling in a variety of contexts

College of William & Mary: W&M Publish

Tuning EASY-Backfilling Queues

Author: Lelong Jérôme
Reis Valentin
Trystram Denis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/05/2017

EASY-Backfilling is a popular scheduling heuristic for allocating jobs in large scale High Performance Computing platforms. While its aggressive reservation mechanism is fast and prevents job starvation, it does not try to optimize any scheduling objective per se. We consider in this work the problem of tuning EASY using queue reordering policies. More precisely, we propose to tune the reordering using a simulation-based methodology. For a given system, we choose the policy in order to minimize the average waiting time. This methodology departs from the First-Come, First-Serve rule and introduces a risk on the maximum values of the waiting time, which we control using a queue thresholding mechanism. This new approach is evaluated through a comprehensive experimental campaign on five production logs. In particular, we show that the behavior of the systems under study is stable enough to learn a heuristic that generalizes in a train/test fashion. Indeed, the average waiting time can be reduced consistently (between 11% and 42% for the logs used) compared to EASY, with almost no increase in maximum waiting times. This work departs from previous learning-based approaches and shows that scheduling heuristics for HPC can be learned directly in a policy space
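For readers unfamiliar with the heuristic, the following is a minimal sketch of one EASY-backfilling scheduling pass combined with a reordered queue and a waiting-time threshold. It is an illustration under stated assumptions, not the authors' implementation: the `Job` fields, the `reorder` and `easy_pass` helpers, and the exact form of the threshold rule (jobs that have waited longer than `threshold` are moved to the front in submission order) are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative assumptions.
    name: str
    submit: float    # submission time
    runtime: float   # user runtime estimate
    cores: int       # requested cores

def reorder(queue, now, key, threshold):
    """Reorder the queue by `key`, but keep jobs that have waited longer
    than `threshold` at the front, in submission order (assumed rule)."""
    overdue = sorted((j for j in queue if now - j.submit > threshold),
                     key=lambda j: j.submit)
    rest = sorted((j for j in queue if now - j.submit <= threshold), key=key)
    return overdue + rest

def easy_pass(queue, running, free, now):
    """One scheduling pass. `queue` is an ordered list of Job, `running` a
    list of (end_time, cores) pairs, `free` the idle core count.
    Returns the jobs started at time `now`."""
    queue, running, started = list(queue), list(running), []
    # Start jobs from the head of the queue while they fit.
    while queue and queue[0].cores <= free:
        job = queue.pop(0)
        free -= job.cores
        running.append((now + job.runtime, job.cores))
        started.append(job)
    if not queue:
        return started
    # Reservation for the blocked head job: find the earliest time (shadow
    # time) at which enough cores are free, and the cores left over then.
    head, avail, shadow = queue[0], free, None
    for end, cores in sorted(running):
        avail += cores
        if avail >= head.cores:
            shadow = end
            break
    if shadow is None:
        return started  # head requests more cores than the machine has
    extra = avail - head.cores
    # Backfill later jobs only if they cannot delay the head's reservation.
    for job in queue[1:]:
        if job.cores > free:
            continue
        if now + job.runtime <= shadow:
            free -= job.cores            # finishes before the reservation
            started.append(job)
        elif job.cores <= extra:
            free -= job.cores            # uses only cores the head will not need
            extra -= job.cores
            started.append(job)
    return started

# 10-core machine: 6 cores busy until t=100, 4 cores idle at t=0.
waiting = [Job("H", 0, 50, 8), Job("B", 0, 80, 2),
           Job("C", 0, 200, 3), Job("D", 0, 200, 2)]
print([j.name for j in easy_pass(waiting, [(100, 6)], free=4, now=0)])
```

In this sketch the reordering policy only changes which job is treated as the head and the order in which backfill candidates are tried; the reservation logic itself is unchanged.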

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

ADEPT Runtime/Scalability Predictor in support of Adaptive Scheduling

Author: Deshmeh Gholamhossein
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2013

A job scheduler determines the order and duration of the allocation of resources, e.g. CPU, to the tasks waiting to run on a computer. Round-Robin and First-Come-First-Serve are examples of algorithms for making such resource allocation decisions. Parallel job schedulers make resource allocation decisions for applications that need multiple CPU cores, on computers consisting of many CPU cores connected by different interconnects. An adaptive parallel scheduler is a parallel scheduler that is capable of adjusting its resource allocation decisions based on the current resource usage and demand. Adaptive parallel schedulers that decide the number of CPU cores to allocate to a parallel job provide more flexibility and potentially improve performance significantly for both local and grid job scheduling compared to non-adaptive schedulers. A major reason why adaptive schedulers are not yet used in practice is the lack of knowledge of the scalability curves of applications and the high cost of existing white-box approaches for scalability prediction. We show that a runtime and scalability prediction tool can be developed that meets three requirements: accuracy comparable to white-box methods, applicability, and robustness. Applicability means depending only on knowledge that is feasible to gain in a production environment. Robustness addresses anomalous behaviour and unreliable predictions. We present ADEPT, a speedup and runtime prediction tool that satisfies all three criteria both for a single problem size and across different problem sizes of a parallel application. ADEPT is also capable of handling anomalies and judging the reliability of its predictions. We demonstrate these properties using experiments with MPI and OpenMP implementations of NAS benchmarks and seven real applications

Scholarship at UWindsor

Dagstuhl News January - December 1999

Author: Wilhelm Reinhard
Publication venue: Dagstuhl Publications. Dagstuhl News
Publication date: 01/01/1998

"Dagstuhl News" is a publication edited especially for the members of the Foundation "Informatikzentrum Schloss Dagstuhl" to thank them for their support. The News give a summary of the scientific work being done in Dagstuhl. Each Dagstuhl Seminar is presented by a small abstract describing the contents and scientific highlights of the seminar as well as the perspectives or challenges of the research topic

Dagstuhl Research Online Publication Server