Federated Learning (FL) is a distributed Machine Learning (ML) technique that
can benefit from cloud environments while preserving data privacy. We propose
Multi-FedLS, a framework that manages multi-cloud resources, reducing execution
time and financial costs of Cross-Silo Federated Learning applications by using
preemptible VMs, which are cheaper than on-demand ones but can be revoked at any
time. Our framework comprises four modules: Pre-Scheduling, Initial Mapping,
Fault Tolerance, and Dynamic Scheduler. This paper extends our previous work
\cite{brum2022sbac} by formally describing the Multi-FedLS resource manager
framework and its modules. Experiments were conducted with three Cross-Silo FL
applications on CloudLab, and a proof of concept confirms that Multi-FedLS can
be executed on a multi-cloud environment composed of AWS and GCP, two commercial cloud
providers. Results show that the problem of executing Cross-Silo FL
applications in multi-cloud environments with preemptible VMs can be
efficiently solved using a mathematical formulation, fault tolerance
techniques, and a simple heuristic to choose a new VM in case of revocation.

Comment: In review by Journal of Parallel and Distributed Computing