
    Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients

    Federated Learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing. This approach raises several challenges due to the different statistical distributions of the local datasets and the clients' computational heterogeneity. In particular, the presence of highly non-i.i.d. data severely impairs both the performance of the trained neural network and its convergence rate, increasing the number of communication rounds required to reach a performance comparable to that of the centralized scenario. As a solution, we propose FedSeq, a novel framework that leverages the sequential training of subgroups of heterogeneous clients, i.e., superclients, to emulate the centralized paradigm in a privacy-compliant way. Given a fixed budget of communication rounds, we show that FedSeq outperforms or matches several state-of-the-art federated algorithms in terms of final performance and speed of convergence. Finally, our method can be easily integrated with other approaches available in the literature. Empirical results show that combining existing algorithms with FedSeq further improves its final performance and convergence speed. We test our method on CIFAR-10 and CIFAR-100 and prove its effectiveness in both i.i.d. and non-i.i.d. scenarios.
    Comment: Published at the 26th International Conference on Pattern Recognition (ICPR), 2022, pp. 3376-338
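    The core idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grouping, the toy least-squares local objective, and all function names (`local_step`, `fedseq_round`) are illustrative assumptions; only the overall scheme — sequential model hand-off within each superclient, then FedAvg-style averaging of the superclient models — follows the abstract.

    ```python
    import numpy as np

    def local_step(weights, client_data, lr=0.05):
        """One toy gradient step on a least-squares objective
        (a stand-in for a client's local training epoch)."""
        X, y = client_data
        grad = X.T @ (X @ weights - y) / len(y)
        return weights - lr * grad

    def fedseq_round(global_weights, superclients):
        """One communication round: the model is passed sequentially from
        client to client inside each superclient (emulating centralized
        training over their combined data), then the superclient models
        are averaged server-side, as in FedAvg."""
        models = []
        for clients in superclients:          # each superclient = list of clients
            w = global_weights.copy()
            for data in clients:              # sequential hand-off within the group
                w = local_step(w, data)
            models.append(w)
        return np.mean(models, axis=0)        # aggregation across superclients

    # Toy non-i.i.d. setup: each client's features come from a shifted distribution.
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    clients = []
    for shift in (-1.0, 0.0, 1.0, 2.0):
        X = rng.normal(shift, 1.0, size=(32, 2))
        clients.append((X, X @ true_w))

    w = np.zeros(2)
    superclients = [clients[:2], clients[2:]]  # two heterogeneous groups
    for _ in range(200):
        w = fedseq_round(w, superclients)
    print(np.round(w, 2))
    ```

    Because the local objectives share the same minimizer in this toy setup, the sequentially trained superclients converge to it; the point of the sketch is only the communication pattern, not the convergence analysis.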

    ESTSS—energy system time series suite: a declustered, application-independent, semi-artificial load profile benchmark set

    This paper introduces a univariate, application-independent set of load profiles, or time series, derived from real-world energy system data. The generation involved a two-step process: manifolding the initial dataset through signal processors to increase diversity and heterogeneity, followed by a declustering process that removes data redundancy. The study employed common feature engineering and machine learning techniques: the time series are transformed into a normalized feature space, followed by a dimensionality reduction via hierarchical clustering and optimization. The resulting dataset is uniformly distributed across multiple feature space dimensions while retaining the typical time and frequency domain characteristics inherent in energy system time series. These data serve various purposes, including algorithm testing, uncovering functional relationships between time series features and system performance, and training machine learning models. Two case studies demonstrate these claims: one focused on the suitability of hybrid energy storage systems and the other on quantifying the onsite hydrogen supply cost at green hydrogen production sites. The declustering algorithm, although a byproduct of this study, shows promise for further scientific exploration. The data and source code are openly accessible, providing a robust platform for future comparative studies. This work also offers smaller subsets for computationally intensive research. Data and source code can be found at https://github.com/s-guenther/estss and https://zenodo.org/records/10213145
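    The declustering step described above — mapping each time series into a normalized feature space, clustering, and keeping one representative per cluster to remove redundancy — can be sketched as follows. This is not the ESTSS code: the two-feature extractor (`features`) and all names are illustrative assumptions; only the normalize → hierarchically cluster → pick representatives pattern follows the abstract.

    ```python
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.stats import zscore

    def features(ts):
        """Toy feature map: mean level and dominant-frequency bin of the profile."""
        spec = np.abs(np.fft.rfft(ts - ts.mean()))
        return np.array([ts.mean(), float(np.argmax(spec))])

    def decluster(series, n_keep):
        """Normalize features across the pool, cluster hierarchically, and keep
        the member closest to each cluster centroid, so the retained set is
        spread across the feature space instead of redundant."""
        F = zscore(np.array([features(s) for s in series]), axis=0)
        labels = fcluster(linkage(F, method="ward"), t=n_keep, criterion="maxclust")
        keep = []
        for c in np.unique(labels):
            members = np.where(labels == c)[0]
            centroid = F[members].mean(axis=0)
            dists = np.linalg.norm(F[members] - centroid, axis=1)
            keep.append(int(members[np.argmin(dists)]))
        return sorted(keep)

    # Redundant pool: many noisy near-duplicates of six base load shapes.
    rng = np.random.default_rng(1)
    t = np.linspace(0, 1, 256)
    base = [np.sin(2 * np.pi * f * t) + o for f in (1, 4, 8) for o in (0.0, 2.0)]
    pool = [b + 0.05 * rng.normal(size=t.size) for b in base for _ in range(10)]
    kept = decluster(pool, n_keep=6)
    print(len(kept))
    ```

    The real benchmark uses a much richer feature set and an optimization step to flatten the feature-space distribution; the sketch only shows why declustering collapses a redundant pool to a diverse subset.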