2 research outputs found

Game theoretic analysis of the slurm scheduler model

Author: Uruchi Ticona Wilmer
Publication venue: Barcelona Supercomputing Center
Publication date: 01/05/2020

In the context of High Performance Computing, scheduling is a necessary tool to ensure an acceptable quality of service for the many users who share the available processing power. The scheduling process can range from a simple First Come, First Served model to a wide variety of more complex implementations that aim to satisfy the specific requirements of each group of users. Slurm is an open source, fault-tolerant, and highly scalable cluster management system for large and small Linux clusters [1]. MareNostrum 4, a High Performance Computer, uses it to manage the execution of jobs sent to it by a variety of users [2]. Previous work has taken an algorithmic approach that attempts to directly reduce queuing times, among other costs [3][4]. We consider that it is also useful to look at the problem from a Game Theoretic perspective, in order to clearly define the mechanics involved in the system as well as those that determine the influx of tasks that the scheduler manages. We model the Slurm scheduling mechanism using Game Theoretic concepts, tools, and reasonable simplifications in an attempt to formally characterize and study it. We identify variables that play a significant role in the scheduling process, and we experiment with changes to the model that could make users behave in a way that improves overall quality of service. We recognize that the complexity of the models may make them difficult to analyze theoretically, so we use data derived from real usage by BSC-CNS users to measure performance. The real usage data is extracted from Autosubmit [5], a workflow manager developed at the Earth Science Department at BSC-CNS. This is a convenient choice, given that we also attempt to measure the influence that an external agent (e.g. a workflow manager) could have on the overall quality of service if it imposes restrictions, and the nature of these restrictions.

UPCommons. Portal del coneixement obert de la UPC

The High Performance Scheduler Game: A Characterization of Slurm, Metrics, and the Viability of Cooperation

Author: Uruchi Ticona Wilmer Vidal
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/07/2020

The Slurm Scheduler is a widely used tool for scheduling in High Performance Computing platforms around the world. Several studies have been conducted to find ways to improve specific performance metrics, mainly from an algorithmic perspective. Scheduling has also been studied from the viewpoint of Game Theory, where models that attempt to capture the main characteristics of the problem are developed and analyzed. In this study, we have used the tools that Algorithmic Game Theory provides to develop and study a model that captures some of the main characteristics of the Slurm Scheduler. We developed the necessary software to test these models. We performed a thorough data analysis process to build a reliable data source based on real usage information. Then, through experimentation, we analyzed how our model and its variants behave; furthermore, we compared these results with the results from an existing Slurm Simulator, developed by Barcelona Supercomputing Center members. Using these results, we calculated an approximate value for the Price of Anarchy, and we discuss the Viability of Cooperation in the context of the Slurm Scheduler.

UPCommons. Portal del coneixement obert de la UPC