42 research outputs found
FedRolex: Model-Heterogeneous Federated Learning with Rolling Sub-Model Extraction
Most cross-device federated learning (FL) studies focus on the
model-homogeneous setting where the global server model and local client models
are identical. However, such a constraint not only excludes low-end clients who
would otherwise make unique contributions to model training but also restrains
clients from training large models due to on-device resource bottlenecks. In
this work, we propose FedRolex, a partial training (PT)-based approach that
enables model-heterogeneous FL and can train a global server model larger than
the largest client model. At its core, FedRolex employs a rolling sub-model
extraction scheme that allows different parts of the global server model to be
evenly trained, which mitigates the client drift induced by the inconsistency
between individual client models and server model architectures. We show that
FedRolex outperforms state-of-the-art PT-based model-heterogeneous FL methods
(e.g. Federated Dropout) and reduces the gap between model-heterogeneous and
model-homogeneous FL, especially under the large-model large-dataset regime. In
addition, we provide theoretical statistical analysis on its advantage over
Federated Dropout and evaluate FedRolex on an emulated real-world device
distribution to show that FedRolex can enhance the inclusiveness of FL and
boost the performance of low-end devices that would otherwise not benefit from
FL. Our code is available at: https://github.com/AIoT-MLSys-Lab/FedRolex
Comment: 20 pages, 7 Figures, Published in 36th Conference on Neural
Information Processing And System
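The rolling sub-model extraction idea in the abstract above can be sketched in a few lines. The abstract does not spell out the windowing rule, so this sketch assumes a contiguous window of units that advances by one position per round and wraps around; the helper name `rolling_indices` and the `capacity` ratio are hypothetical and not taken from the FedRolex code base.

```python
def rolling_indices(round_idx: int, layer_width: int, capacity: float) -> list[int]:
    """Units of one layer that a client trains in a given round.

    Assumption: the window starts at round_idx, covers a fraction `capacity`
    of the layer, and wraps around at the end of the layer.
    """
    k = max(1, int(capacity * layer_width))  # sub-model width for this client
    return [(round_idx + i) % layer_width for i in range(k)]


# Count how often each of 8 units is picked over 8 rounds
# by a client that can only hold half of the layer.
counts = [0] * 8
for r in range(8):
    for idx in rolling_indices(r, layer_width=8, capacity=0.5):
        counts[idx] += 1

print(counts)  # every unit is selected the same number of times
```

Under this assumption every unit is visited equally often once the window has rolled through the whole layer, which is the "evenly trained" property the abstract refers to.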
NRC-Net: Automated noise robust cardio net for detecting valvular cardiac diseases using optimum transformation method with heart sound signals
Cardiovascular diseases (CVDs) can be effectively treated when detected
early, reducing mortality rates significantly. Traditionally, phonocardiogram
(PCG) signals have been utilized for detecting cardiovascular disease due to
their cost-effectiveness and simplicity. Nevertheless, various environmental
and physiological noises frequently affect the PCG signals, compromising their
essential distinctive characteristics. The prevalence of this issue in
overcrowded and resource-constrained hospitals can compromise the accuracy of
medical diagnoses. Therefore, this study aims to discover the optimal
transformation method for detecting CVDs using noisy heart sound signals and
propose a noise robust network to improve CVD classification
performance. For the identification of the optimal transformation method for
noisy heart sound data, mel-frequency cepstral coefficients (MFCCs), short-time
Fourier transform (STFT), constant-Q nonstationary Gabor transform (CQT) and
continuous wavelet transform (CWT) have been used with VGG16. Furthermore, we
propose a novel convolutional recurrent neural network (CRNN) architecture
called noise robust cardio net (NRC-Net), which is a lightweight model to
classify mitral regurgitation, aortic stenosis, mitral stenosis, mitral valve
prolapse, and normal heart sounds using PCG signals contaminated with
respiratory and random noises. An attention block is included to extract
important temporal and spatial features from the noise-corrupted heart
sound. The results of this study indicate that CWT is the optimal transformation
method for noisy heart sound signals. When evaluated on the GitHub heart sound
dataset, CWT demonstrates an accuracy of 95.69% for VGG16, which is 1.95%
better than the second-best CQT transformation technique. Moreover, our
proposed NRC-Net with CWT obtained an accuracy of 97.4%, which is 1.71% higher
than the VGG16
GPT-FL: Generative Pre-trained Model-Assisted Federated Learning
In this work, we propose GPT-FL, a generative pre-trained model-assisted
federated learning (FL) framework. At its core, GPT-FL leverages generative
pre-trained models to generate diversified synthetic data. These generated data
are used to train a downstream model on the server, which is then fine-tuned
with private client data under the standard FL framework. We show that GPT-FL
consistently outperforms state-of-the-art FL methods in terms of model test
accuracy, communication efficiency, and client sampling efficiency. Through
comprehensive ablation analysis, we discover that the downstream model
generated by synthetic data plays a crucial role in controlling the direction
of gradient diversity during FL training, which enhances convergence speed and
contributes to the notable accuracy boost observed with GPT-FL. Also,
regardless of whether the target data falls within or outside the domain of the
pre-trained generative model, GPT-FL consistently achieves significant
performance gains, surpassing the results obtained by models trained solely
with FL or synthetic data
A Large Multi-Target Dataset of Common Bengali Handwritten Graphemes
Latin has historically led the state-of-the-art in handwritten optical
character recognition (OCR) research. Adapting existing systems from Latin to
alpha-syllabary languages is particularly challenging due to a sharp contrast
between their orthographies. The segmentation of graphical constituents
corresponding to characters becomes significantly hard due to a cursive writing
system and frequent use of diacritics in the alpha-syllabary family of
languages. We propose a labeling scheme based on graphemes (linguistic segments
of word formation) that makes segmentation inside alpha-syllabary words linear
and present the first dataset of Bengali handwritten graphemes that are
commonly used in an everyday context. The dataset contains 411k curated samples
of 1295 unique commonly used Bengali graphemes. Additionally, the test set
contains 900 uncommon Bengali graphemes for out of dictionary performance
evaluation. The dataset is open-sourced as a part of a public Handwritten
Grapheme Classification Challenge on Kaggle to benchmark vision algorithms for
multi-target grapheme classification. The unique graphemes present in this
dataset are selected based on commonality in the Google Bengali ASR corpus.
From competition proceedings, we see that deep-learning methods can generalize
to a large span of out of dictionary graphemes which are absent during
training. Dataset and starter code at www.kaggle.com/c/bengaliai-cv19.
Comment: 15 pages, 12 figures, 6 Tables, Submitted to CVPR-2
Ongoing efforts to improve the management of patients with diabetes in Bangladesh and the implications
Background: Prevalence rates of patients with diabetes are growing across countries, and Bangladesh is no exception. Associated costs are also increasing, driven by costs associated with the complications of diabetes including hypoglycaemia. Long-acting insulin analogues were developed to reduce hypoglycaemia as well as improve patient comfort and adherence. However, they have been appreciably more expensive, reducing their affordability and use. Biosimilars offer a way forward. Consequently, there is a need to document current prescribing and dispensing rates for long-acting insulin analogues across Bangladesh, including current prices and differences, as a result of affordability and other issues. Methods: Mixed method approach including surveying prescribing practices in hospitals coupled with dispensing practices and prices among community pharmacies and drug stores across Bangladesh. This method was adopted since public hospitals only dispense insulins such as soluble insulins free of charge until funds run out, and all long-acting insulin analogues have to be purchased from community stores. Results: There has been growing prescribing and dispensing of long-acting insulins in Bangladesh in recent years, accounting for over 80% of all insulins dispensed in a minority of stores. This has been helped by growing prescribing and dispensing of biosimilar insulin glargine at lower costs than the originator, with this trend likely to continue with envisaged growth in the number of patients. Consequently, Bangladesh can serve as an exemplar to other low- and middle-income countries struggling to fund long-acting insulins for their patients. Conclusions: It was encouraging to see continued growth in the prescribing and dispensing of long-acting insulin analogues in Bangladesh via the increasing availability of biosimilars. This is likely to continue benefitting all key stakeholder groups
Rekurrenta neurala nätverk i prognostisering av elkonsumtion
In this thesis, two main studies are conducted to compare the predictive capabilities of feed-forward neural networks (FFNN) and long short-term memory networks (LSTM) in electricity load forecasting. The first study compares univariate networks using past electricity load, as well as multivariate networks using past electricity load and air temperature, in day-ahead load forecasting using varying lookback periods and sparsity of past observations. The second study compares FFNNs and LSTMs of different complexities (i.e. network sizes) when restrictions imposed by limitations of the real world are taken into consideration. No significant differences are found between the predictive performances of the two neural network approaches. However, adding air temperature as extra input to the LSTM is found to significantly decrease its performance. Furthermore, the predictive performance of the FFNN is found to significantly decrease as the network complexity grows, while the predictive performance of the LSTM is found to increase as the network complexity grows. All the findings considered, we do not find that there is enough evidence in favour of the LSTM in electricity load forecasting.
This thesis describes two studies that compare feed-forward neural networks (FFNN) and long short-term memory neural networks (LSTM) in electricity consumption forecasting. The first study examines univariate models that use past electricity consumption, and multivariate models that use past electricity consumption and temperature measurements, to forecast electricity consumption for the next day. How far back in time past information is drawn from, and the resolution of that information, are varied. The second study examines FFNN and LSTM models with practical constraints such as data availability in mind. The size of the networks is also varied. The studies find no difference between the ability of the FFNN and LSTM models to forecast electricity consumption. However, the ability of the FFNN model to forecast electricity consumption decreases as the size of the model increases. On the other hand, the ability of the LSTM model increases as its size increases. Based on these results, we do not consider that there is sufficient evidence in favour of LSTM models in electricity consumption forecasting
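The "lookback period" and "sparsity of past observations" varied in the first study can be illustrated with a small windowing helper. This is only a sketch under assumptions not stated in the abstract: the load series is hourly, day-ahead means a 24-step target, and sparsity is modelled as keeping every `stride`-th past observation. The function name `make_windows` and its parameters are hypothetical.

```python
def make_windows(load, lookback, stride=1, horizon=24):
    """Build (inputs, target) pairs from a load series.

    lookback: number of past observations fed to the network
    stride:   keep every stride-th past observation (1 = dense history)
    horizon:  number of future steps to predict (assumed 24 for hourly data)
    """
    span = lookback * stride  # how far back in time the inputs reach
    pairs = []
    for t in range(span, len(load) - horizon + 1):
        inputs = load[t - span:t:stride]
        target = load[t:t + horizon]
        pairs.append((inputs, target))
    return pairs


# Ten days of made-up hourly data, used only to show the shapes.
hourly_load = [float(h % 24) for h in range(24 * 10)]
dense = make_windows(hourly_load, lookback=48)             # last 48 hours, every hour
sparse = make_windows(hourly_load, lookback=8, stride=6)   # same 48 hours, every 6th hour
print(len(dense), len(dense[0][0]), len(sparse[0][0]))
```

Both calls reach equally far back in time; only the number of inputs the network sees differs, which is one way to separate the effect of lookback length from the effect of sparsity.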