Machine learning regression to boost scheduling performance in hyper-scale cloud-computing data centres

Abstract

Data centres are growing in size and complexity due to the increasing volume and heterogeneity of the workloads and patterns they must serve. This mix of workloads with differing purposes makes it difficult to optimise resource-management systems according to temporal or application-level patterns. Data centre operators have developed multiple resource-management models to improve scheduling performance in controlled scenarios. However, because workloads evolve constantly, relying on a single resource-management model is sub-optimal in some scenarios. In this work, we propose: (a) a machine learning regression model based on gradient boosting to predict the time a resource manager needs to schedule the jobs arriving in a given period; and (b) a resource-management model, Boost, that uses this regression model to predict the scheduling time of a catalogue of resource managers so that the most performant one can be selected for each time span. The benefits of the proposed resource-management model are analysed by comparing its scheduling-performance KPIs to those of the two most popular resource-management models: two-level, used by Apache Mesos, and shared-state, employed by Google Borg. These gains are empirically evaluated by simulating a hyper-scale data centre that executes a realistic, synthetically generated workload following real-world trace patterns.

Funding: Ministerio de Ciencia e Innovación RTI2018-098062-A-I0
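To make the selection mechanism concrete, the following is a minimal sketch of the idea described above, assuming one scikit-learn GradientBoostingRegressor per resource manager and hypothetical per-period workload features; the feature set, training data, and function names are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of Boost-style selection: one gradient boosting
# regressor per resource manager predicts its scheduling time for the
# upcoming period; the manager with the lowest prediction is used.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

MANAGERS = ["two-level", "shared-state"]  # catalogue of resource managers

def train_models(history):
    """history: manager -> (X, y), where X holds per-period workload
    features (e.g. job arrival rate, mean requested CPU/RAM) and y is
    the observed scheduling time for that period under that manager."""
    models = {}
    for manager, (X, y) in history.items():
        models[manager] = GradientBoostingRegressor().fit(X, y)
    return models

def pick_manager(models, features):
    """Return the manager with the lowest predicted scheduling time
    for the next time span, given that span's workload features."""
    x = np.asarray(features).reshape(1, -1)
    predicted = {m: model.predict(x)[0] for m, model in models.items()}
    return min(predicted, key=predicted.get)

# Toy usage with synthetic data: 3 illustrative features per period.
rng = np.random.default_rng(0)
history = {m: (rng.random((200, 3)), rng.random(200)) for m in MANAGERS}
models = train_models(history)
print(pick_manager(models, [0.4, 0.7, 0.1]))
```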
