Fault-tolerant parallel scheduling of arbitrary length jobs on a shared channel

Kowalski, Dariusz R; Mirek, JD; Wong, Prudence WH

Authors: Dariusz R Kowalski, JD Mirek, Prudence WH Wong
Publication date: 10 July 2019
Publisher: Springer Nature

Abstract

We study the problem of scheduling jobs on fault-prone machines that communicate via a shared channel, also known as a multiple-access channel. We have n arbitrary-length jobs to be scheduled on m identical machines, f of which are prone to crashes caused by an adversary. A machine can inform other machines when a job is completed via the channel, which has no collision detection. Performance is measured by the total number of available machine steps during the whole execution. Our goal is to study the impact of preemption (i.e., interrupting the execution of a job and resuming it later on the same or a different machine) and of failures on the work performance of job processing. The novelty is the ability to identify the features that determine the complexity (difficulty) of the problem. We show that the problem becomes difficult when preemption is not allowed, by proving corresponding lower and upper bounds, the latter with algorithms reaching them. We also prove that randomization helps even more, but only against a non-adaptive adversary; in the presence of the more severe adaptive adversary, randomization does not help in any setting. Our work extends previous work that focused on settings including scheduling on a multiple-access channel without machine failures, with complete information about failures, or with incomplete information about failures (as in this work) but with unit-length jobs and, hence, without considering preemption.
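As a rough illustration of the performance measure named in the abstract (the total number of available machine steps during the whole execution), the following Python sketch sums, over all machines, the steps in which each machine is still operational. The function name `total_work`, the `crash_steps` representation, and the convention that a machine stops contributing from its crash step onward are assumptions made for illustration, not definitions taken from the paper.

```python
from typing import Optional, Sequence


def total_work(crash_steps: Sequence[Optional[int]], execution_length: int) -> int:
    """Count available machine steps over the whole execution.

    crash_steps[i] is the (hypothetical) step at which machine i crashes,
    or None if it never crashes. Under the assumption used here, a machine
    is available in steps 0 .. crash_step - 1, capped at execution_length.
    """
    work = 0
    for crash in crash_steps:
        # A machine that never crashes is available for the full execution.
        alive = execution_length if crash is None else min(crash, execution_length)
        work += alive
    return work


# m = 4 machines, f = 2 of them crash (at steps 3 and 7); execution lasts 10 steps.
example = total_work([None, 3, None, 7], execution_length=10)
```

In this toy setting the two surviving machines contribute 10 steps each and the crashed ones contribute 3 and 7, so `example` is 30. Which machines crash, and when, is chosen by the adversary in the model; the sketch only evaluates the measure for one fixed crash pattern.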

Available Versions

University of Liverpool Repository

oai:livrepository.liverpool.ac...

Last time updated on 22/11/2017