42 research outputs found

    FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction

    Most cross-device federated learning (FL) studies focus on the model-homogeneous setting, where the global server model and local client models are identical. However, such a constraint not only excludes low-end clients who would otherwise make unique contributions to model training but also restrains clients from training large models due to on-device resource bottlenecks. In this work, we propose FedRolex, a partial training (PT)-based approach that enables model-heterogeneous FL and can train a global server model larger than the largest client model. At its core, FedRolex employs a rolling sub-model extraction scheme that allows different parts of the global server model to be evenly trained, which mitigates the client drift induced by the inconsistency between individual client models and server model architectures. We show that FedRolex outperforms state-of-the-art PT-based model-heterogeneous FL methods (e.g. Federated Dropout) and reduces the gap between model-heterogeneous and model-homogeneous FL, especially under the large-model large-dataset regime. In addition, we provide theoretical statistical analysis on its advantage over Federated Dropout and evaluate FedRolex on an emulated real-world device distribution to show that FedRolex can enhance the inclusiveness of FL and boost the performance of low-end devices that would otherwise not benefit from FL. Our code is available at: https://github.com/AIoT-MLSys-Lab/FedRolex. Comment: 20 pages, 7 Figures, Published in 36th Conference on Neural Information Processing And System
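
    The rolling idea in the abstract above can be illustrated with a minimal sketch. It assumes the sub-model is a contiguous window of hidden-unit indices that shifts by one position per round and wraps around; the function name, the `capacity` parameter, and the one-step shift are illustrative assumptions, not the authors' implementation.

```python
def rolling_indices(round_idx: int, width: int, capacity: float) -> list[int]:
    """Hidden-unit indices a client would train in this round (hypothetical helper).

    width: number of units in the server layer
    capacity: fraction of the layer the client can afford, in (0, 1]
    """
    if not 0.0 < capacity <= 1.0:
        raise ValueError("capacity must be in (0, 1]")
    keep = max(1, int(capacity * width))   # size of the client sub-layer
    start = round_idx % width              # the window start advances each round
    return [(start + offset) % width for offset in range(keep)]


# Count how often each unit is selected over `width` consecutive rounds.
counts = [0] * 8
for r in range(8):
    for i in rolling_indices(r, width=8, capacity=0.5):
        counts[i] += 1
print(counts)  # [4, 4, 4, 4, 4, 4, 4, 4]
```

    Under these assumptions every unit is selected equally often across a full cycle of rounds, which is the "evenly trained" property the abstract describes.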

    NRC-Net: Automated noise robust cardio net for detecting valvular cardiac diseases using optimum transformation method with heart sound signals

    Cardiovascular diseases (CVDs) can be effectively treated when detected early, reducing mortality rates significantly. Traditionally, phonocardiogram (PCG) signals have been utilized for detecting cardiovascular disease due to their cost-effectiveness and simplicity. Nevertheless, various environmental and physiological noises frequently affect the PCG signals, compromising their essential distinctive characteristics. The prevalence of this issue in overcrowded and resource-constrained hospitals can compromise the accuracy of medical diagnoses. Therefore, this study aims to discover the optimal transformation method for detecting CVDs using noisy heart sound signals and to propose a noise-robust network that improves CVD classification performance. To identify the optimal transformation method for noisy heart sound data, mel-frequency cepstral coefficients (MFCCs), short-time Fourier transform (STFT), constant-Q nonstationary Gabor transform (CQT) and continuous wavelet transform (CWT) have been used with VGG16. Furthermore, we propose a novel convolutional recurrent neural network (CRNN) architecture called noise robust cardio net (NRC-Net), which is a lightweight model to classify mitral regurgitation, aortic stenosis, mitral stenosis, mitral valve prolapse, and normal heart sounds using PCG signals contaminated with respiratory and random noises. An attention block is included to extract important temporal and spatial features from the noise-corrupted heart sound. The results of this study indicate that CWT is the optimal transformation method for noisy heart sound signals. When evaluated on the GitHub heart sound dataset, CWT demonstrates an accuracy of 95.69% for VGG16, which is 1.95% better than the second-best CQT transformation technique. Moreover, our proposed NRC-Net with CWT obtained an accuracy of 97.4%, which is 1.71% higher than the VGG16

    GPT-FL: Generative Pre-trained Model-Assisted Federated Learning

    In this work, we propose GPT-FL, a generative pre-trained model-assisted federated learning (FL) framework. At its core, GPT-FL leverages generative pre-trained models to generate diversified synthetic data. These generated data are used to train a downstream model on the server, which is then fine-tuned with private client data under the standard FL framework. We show that GPT-FL consistently outperforms state-of-the-art FL methods in terms of model test accuracy, communication efficiency, and client sampling efficiency. Through comprehensive ablation analysis, we discover that the downstream model generated by synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL. Also, regardless of whether the target data falls within or outside the domain of the pre-trained generative model, GPT-FL consistently achieves significant performance gains, surpassing the results obtained by models trained solely with FL or synthetic data

    A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes

    Latin has historically led the state-of-the-art in handwritten optical character recognition (OCR) research. Adapting existing systems from Latin to alpha-syllabary languages is particularly challenging due to a sharp contrast between their orthographies. The segmentation of graphical constituents corresponding to characters becomes significantly harder due to a cursive writing system and frequent use of diacritics in the alpha-syllabary family of languages. We propose a labeling scheme based on graphemes (linguistic segments of word formation) that makes segmentation inside alpha-syllabary words linear, and we present the first dataset of Bengali handwritten graphemes that are commonly used in an everyday context. The dataset contains 411k curated samples of 1295 unique commonly used Bengali graphemes. Additionally, the test set contains 900 uncommon Bengali graphemes for out-of-dictionary performance evaluation. The dataset is open-sourced as part of a public Handwritten Grapheme Classification Challenge on Kaggle to benchmark vision algorithms for multi-target grapheme classification. The unique graphemes present in this dataset are selected based on commonality in the Google Bengali ASR corpus. From competition proceedings, we see that deep-learning methods can generalize to a large span of out-of-dictionary graphemes that are absent during training. Dataset and starter codes at www.kaggle.com/c/bengaliai-cv19. Comment: 15 pages, 12 figures, 6 Tables, Submitted to CVPR-2

    Ongoing efforts to improve the management of patients with diabetes in Bangladesh and the implications

    Background: Prevalence rates of patients with diabetes are growing across countries, and Bangladesh is no exception. Associated costs are also increasing, driven by costs associated with the complications of diabetes including hypoglycaemia. Long-acting insulin analogues were developed to reduce hypoglycaemia as well as improve patient comfort and adherence. However, they have been appreciably more expensive, reducing their affordability and use. Biosimilars offer a way forward. Consequently, there is a need to document current prescribing and dispensing rates for long-acting insulin analogues across Bangladesh, including current prices and differences, as a result of affordability and other issues. Methods: Mixed method approach including surveying prescribing practices in hospitals coupled with dispensing practices and prices among community pharmacies and drug stores across Bangladesh. This method was adopted since public hospitals only dispense insulins such as soluble insulins free-of-charge until funds run out, and all long-acting insulin analogues have to be purchased from community stores. Results: There has been growing prescribing and dispensing of long-acting insulins in Bangladesh in recent years, accounting for over 80% of all insulins dispensed in a minority of stores. This has been helped by growing prescribing and dispensing of biosimilar insulin glargine at lower costs than the originator, with this trend likely to continue with envisaged growth in the number of patients. Consequently, Bangladesh can serve as an exemplar to other low- and middle-income countries struggling to fund long-acting insulins for their patients. Conclusions: It was encouraging to see continued growth in the prescribing and dispensing of long-acting insulin analogues in Bangladesh via the increasing availability of biosimilars. This is likely to continue benefitting all key stakeholder groups

    Rekurrenta neurala nätverk i prognostisering av elkonsumtion

    In this thesis, two main studies are conducted to compare the predictive capabilities of feed-forward neural networks (FFNN) and long short-term memory networks (LSTM) in electricity load forecasting. The first study compares univariate networks using past electricity load, as well as multivariate networks using past electricity load and air temperature, in day-ahead load forecasting using varying lookback periods and sparsity of past observations. The second study compares FFNNs and LSTMs of different complexities (i.e. network sizes) when practical restrictions of the real world, such as data availability, are taken into consideration. No significant differences are found between the predictive performances of the two neural network approaches. However, adding air temperature as an extra input to the LSTM is found to significantly decrease its performance. Furthermore, the predictive performance of the FFNN is found to significantly decrease as the network complexity grows, while the predictive performance of the LSTM is found to increase as the network complexity grows. All the findings considered, we do not find enough evidence in favour of the LSTM in electricity load forecasting.
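
    To make "lookback period" and "sparsity of past observations" concrete, here is a minimal sketch of how training samples for day-ahead forecasting might be built. It assumes hourly data by default (so a horizon of 24 steps is one day ahead); the function and parameter names are hypothetical and not taken from the thesis.

```python
def make_windows(load, lookback, step=1, horizon=24):
    """Pair past observations with the value `horizon` steps after the last one.

    lookback: how many time steps back the inputs reach
    step: keep every `step`-th past observation (larger means sparser inputs)
    horizon: steps between the last past time step and the target
             (24 corresponds to day-ahead only under the hourly-data assumption)
    """
    samples = []
    for t in range(lookback, len(load) - horizon + 1):
        inputs = load[t - lookback:t:step]   # sparse view of the lookback period
        target = load[t + horizon - 1]       # value to be forecast
        samples.append((inputs, target))
    return samples


# Toy series: 10 readings, look back 4 steps, keep every 2nd one, forecast 2 steps ahead.
toy = list(range(10))
for inputs, target in make_windows(toy, lookback=4, step=2, horizon=2):
    print(inputs, "->", target)
```

    The same samples could feed either an FFNN (inputs flattened into one vector) or an LSTM (inputs kept as a sequence), which is what makes such a comparison possible on equal data.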
