MAS: Towards Resource-Efficient Federated Multiple-Task Learning
Federated learning (FL) is an emerging distributed machine learning method
that empowers in-situ model training on decentralized edge devices. However,
multiple simultaneous FL tasks could overload resource-constrained devices. In
this work, we propose the first FL system to effectively coordinate and train
multiple simultaneous FL tasks. We first formalize the problem of training
simultaneous FL tasks. Then, we present our new approach, MAS (Merge and
Split), to optimize the performance of training multiple simultaneous FL tasks.
MAS starts by merging FL tasks into an all-in-one FL task with a multi-task
architecture. After training for a few rounds, MAS splits the all-in-one FL
task into two or more FL tasks by using the affinities among tasks measured
during the all-in-one training. It then continues training each split of FL
tasks based on model parameters from the all-in-one training. Extensive
experiments demonstrate that MAS outperforms other methods while reducing
training time by 2x and reducing energy consumption by 40%. We hope this work
will inspire the community to further study and optimize training simultaneous
FL tasks.
Comment: ICCV'23. arXiv admin note: substantial text overlap with arXiv:2207.0420
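To make the merge-then-split procedure concrete, the sketch below shows one way the idea could look in code. The shared-encoder/per-head model, the affinity measure (the relative drop in task j's loss after one encoder step driven by task i's gradient), and the greedy grouping heuristic are illustrative assumptions, not the authors' MAS implementation.

```python
# Minimal sketch of the merge-then-split idea from the MAS abstract.
# Assumptions (not from the paper): a shared-encoder/per-head multi-task
# model, an inter-task affinity measured as the loss change on task j after
# a gradient step on task i, and a simple greedy grouping of tasks.
import copy
import torch
import torch.nn as nn


class AllInOneModel(nn.Module):
    """One shared encoder with a separate head per merged FL task."""

    def __init__(self, in_dim, hidden, task_out_dims):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, d) for d in task_out_dims)

    def forward(self, x, task_id):
        return self.heads[task_id](self.encoder(x))


def task_affinity(model, batches, lr=0.1):
    """aff[i][j]: relative drop in task j's loss after one shared-encoder
    step driven by task i's gradient (illustrative affinity measure)."""
    n = len(batches)
    loss_fn = nn.CrossEntropyLoss()
    aff = [[0.0] * n for _ in range(n)]
    for i in range(n):
        probe = copy.deepcopy(model)
        opt = torch.optim.SGD(probe.encoder.parameters(), lr=lr)
        x_i, y_i = batches[i]
        opt.zero_grad()
        loss_fn(probe(x_i, i), y_i).backward()
        opt.step()
        for j in range(n):
            x_j, y_j = batches[j]
            with torch.no_grad():
                before = loss_fn(model(x_j, j), y_j).item()
                after = loss_fn(probe(x_j, j), y_j).item()
            aff[i][j] = (before - after) / max(before, 1e-8)
    return aff


def split_tasks(aff, threshold=0.0):
    """Greedy split: a task joins a group only if its mutual affinity with
    every member exceeds the threshold; otherwise it starts a new group."""
    groups = []
    for t in range(len(aff)):
        for g in groups:
            if all(aff[t][o] > threshold and aff[o][t] > threshold for o in g):
                g.append(t)
                break
        else:
            groups.append([t])
    return groups


if __name__ == "__main__":
    torch.manual_seed(0)
    tasks_out = [3, 3, 2]
    model = AllInOneModel(in_dim=8, hidden=16, task_out_dims=tasks_out)
    # One synthetic batch per task stands in for federated client data.
    batches = [(torch.randn(32, 8), torch.randint(0, d, (32,))) for d in tasks_out]
    aff = task_affinity(model, batches)
    print("groups to continue training separately:", split_tasks(aff))
```

In the full federated setting described in the abstract, the affinities would be accumulated over the all-in-one training rounds, and each split group would continue training from the all-in-one model parameters rather than from scratch.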
FLuID: Mitigating Stragglers in Federated Learning using Invariant Dropout
Federated Learning (FL) allows machine learning models to train locally on
individual mobile devices, synchronizing model updates via a shared server.
This approach safeguards user privacy; however, it also generates a
heterogeneous training environment due to the varying performance capabilities
across devices. As a result, straggler devices with lower performance often
dictate the overall training time in FL. In this work, we aim to alleviate this
performance bottleneck due to stragglers by dynamically balancing the training
load across the system. We introduce Invariant Dropout, a method that extracts
a sub-model based on a weight-update threshold, thereby minimizing potential
impacts on accuracy. Building on this dropout technique, we develop an adaptive
training framework, Federated Learning using Invariant Dropout (FLuID). FLuID
offers a lightweight sub-model extraction to regulate computational intensity,
thereby reducing the load on straggler devices without affecting model quality.
Our method leverages neuron updates from non-straggler devices to construct a
tailored sub-model for each straggler based on client performance profiling.
Furthermore, FLuID can dynamically adapt to changes in stragglers as runtime
conditions shift. We evaluate FLuID using five real-world mobile clients. The
evaluations show that Invariant Dropout maintains baseline model efficiency
while alleviating the performance bottleneck of stragglers through a dynamic,
runtime approach.
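The core of Invariant Dropout, as described above, is deciding which neurons a straggler can skip based on how little their weights have changed. Below is a minimal single-layer sketch; the L2-norm update measure, the drop_rate parameter (standing in for client performance profiling), and the sub-layer extraction are assumptions for illustration, not the FLuID implementation.

```python
# Minimal sketch of the Invariant Dropout idea from the FLuID abstract.
# Assumptions (not from the paper): a single fully connected layer, per-neuron
# update magnitude measured as the L2 norm of each output neuron's weight
# change between rounds, and a drop rate supplied by client profiling.
import torch
import torch.nn as nn


def invariant_dropout_keep_indices(prev_weight, new_weight, drop_rate):
    """Mark the neurons whose updates are smallest (most 'invariant') for
    dropping; keep the neurons whose weights changed the most."""
    update_norm = (new_weight - prev_weight).norm(dim=1)  # one value per output neuron
    num_keep = max(1, int(round(update_norm.numel() * (1.0 - drop_rate))))
    return torch.topk(update_norm, num_keep).indices.sort().values


def extract_sub_layer(full_layer, keep_idx):
    """Build a smaller Linear layer containing only the kept output neurons."""
    sub = nn.Linear(full_layer.in_features, len(keep_idx),
                    bias=full_layer.bias is not None)
    with torch.no_grad():
        sub.weight.copy_(full_layer.weight[keep_idx])
        if full_layer.bias is not None:
            sub.bias.copy_(full_layer.bias[keep_idx])
    return sub


if __name__ == "__main__":
    torch.manual_seed(0)
    prev = nn.Linear(16, 8)
    curr = nn.Linear(16, 8)
    # Pretend `curr` is the layer after aggregating non-straggler updates.
    keep = invariant_dropout_keep_indices(prev.weight.detach(),
                                          curr.weight.detach(), drop_rate=0.25)
    straggler_layer = extract_sub_layer(curr, keep)
    print("straggler sub-layer keeps neurons:", keep.tolist())
    print(straggler_layer)
```

The straggler would then train only the extracted sub-layer, and its updates would be merged back into the corresponding rows of the full model, keeping the dropped (invariant) neurons unchanged for that round.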