Ensemble deep learning: A review
Ensemble learning combines several individual models to obtain better
generalization performance. Currently, deep learning models with multilayer
processing architectures are showing better performance than shallow or
traditional classification models. Deep ensemble learning models combine the
advantages of both deep learning and ensemble learning, so that the final
model has better generalization performance. This paper reviews
state-of-the-art deep ensemble models and hence serves as an extensive summary
for researchers. The ensemble models are broadly categorised into bagging,
boosting and stacking ensembles; negative-correlation-based deep ensembles;
explicit/implicit ensembles; homogeneous/heterogeneous ensembles; decision
fusion strategies; and unsupervised, semi-supervised, reinforcement learning,
online/incremental, and multilabel-based deep ensemble models. Applications of
deep ensemble models in different domains are also briefly discussed. Finally,
we conclude this paper with some future recommendations and research
directions.
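To make the review's first category concrete, here is a minimal, hedged sketch of bagging, one of the classical ensemble strategies the paper lists. It is not from the paper: the "base learner" is a toy decision stump standing in for any deep model, and all names (`fit_stump`, `bagging_fit`, `bagging_predict`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, y):
    """Fit a one-feature threshold classifier (a 'decision stump')."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            pred = (X[:, j] >= t).astype(int)
            acc = (pred == y).mean()
            if best is None or acc > best[0]:
                best = (acc, j, t)
    _, j, t = best
    return lambda X: (X[:, j] >= t).astype(int)

def bagging_fit(X, y, n_models=11):
    """Bagging: each ensemble member is trained on a bootstrap resample."""
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), len(X))  # sample with replacement
        models.append(fit_stump(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Aggregate member predictions by majority vote (hard voting)."""
    votes = np.stack([m(X) for m in models])        # (n_models, n_samples)
    return (votes.mean(axis=0) >= 0.5).astype(int)  # majority class

# Toy data: the class is 1 exactly when the first feature exceeds 0.
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
models = bagging_fit(X, y)
acc = (bagging_predict(models, X) == y).mean()
```

Boosting and stacking differ only in how members are trained and combined: boosting reweights samples sequentially, while stacking learns a meta-model over the members' outputs.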
PMU measurements based short-term voltage stability assessment of power systems via deep transfer learning
Deep learning has emerged as an effective solution for addressing the
challenges of short-term voltage stability assessment (STVSA) in power systems.
However, existing deep learning-based STVSA approaches face limitations in
adapting to topological changes, sample labeling, and handling small datasets.
To overcome these challenges, this paper proposes a novel phasor measurement
unit (PMU) measurements-based STVSA method by using deep transfer learning. The
method leverages the real-time dynamic information captured by PMUs to create
an initial dataset. It employs temporal ensembling for sample labeling and
utilizes least squares generative adversarial networks (LSGAN) for data
augmentation, enabling effective deep learning on small-scale datasets.
Additionally, the method enhances adaptability to topological changes by
exploring connections between different faults. Experimental results on the
IEEE 39-bus test system demonstrate that the proposed method improves model
evaluation accuracy by approximately 20% through transfer learning, exhibiting
strong adaptability to topological changes. Leveraging the self-attention
mechanism of the Transformer model, this approach offers significant advantages
over shallow learning methods and other deep learning-based approaches.
Comment: Accepted by IEEE Transactions on Instrumentation & Measurement
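The temporal ensembling step used for sample labeling can be sketched in a few lines. This is an illustrative stand-in, not the paper's code: per-epoch class-probability predictions are accumulated into a bias-corrected exponential moving average, and the smoothed value becomes the pseudo-label. The helper name `temporal_ensemble` and the toy data are assumptions.

```python
import numpy as np

def temporal_ensemble(pred_history, alpha=0.6):
    """Temporal ensembling: keep an exponential moving average of each
    epoch's predictions and use the debiased average as a pseudo-label."""
    Z = np.zeros_like(pred_history[0])
    for t, p in enumerate(pred_history, start=1):
        Z = alpha * Z + (1 - alpha) * p  # running EMA of predictions
        z_hat = Z / (1 - alpha ** t)     # startup bias correction
    return z_hat

# Noisy per-epoch class probabilities for 4 samples over 30 epochs.
rng = np.random.default_rng(1)
true_p = np.array([0.9, 0.8, 0.2, 0.1])
history = [np.clip(true_p + rng.normal(0, 0.1, 4), 0, 1) for _ in range(30)]
labels = (temporal_ensemble(history) >= 0.5).astype(int)
```

Averaging over epochs damps the per-epoch noise, which is what makes the resulting labels usable for training on an initially unlabeled PMU dataset.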
Building lighting energy consumption modelling with hybrid neural-statistic approaches
In the proposed work we aim at modelling building lighting energy consumption. We compared several classical methods to the latest artificial-intelligence modelling technique: Artificial Neural Networks Ensembling (ANNE). In this study we show how we built the ANNE and a new hybrid model based on the statistical-ANNE combination. Experimentation was carried out over a three-month dataset coming from a real office building located in the ENEA ‘Casaccia’ Research Centre. Experimental results show that the proposed hybrid statistical-ANNE approach achieves a remarkable improvement with respect to the best classical method (the statistical one).
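One plausible reading of a hybrid statistical-ANNE combination, sketched here purely as an assumption since the abstract gives no details: fit a statistical (least-squares) baseline, then let a small bootstrap ensemble of random-feature networks model its residuals. Every function name and the toy target are illustrative, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_linear(X, y):
    """Statistical baseline: ordinary least squares with intercept."""
    w, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
    return lambda X: np.c_[np.ones(len(X)), X] @ w

def fit_net(X, y, hidden=20, ridge=1e-2):
    """One random-feature net (ELM-style): random tanh hidden layer,
    closed-form ridge readout."""
    W = rng.normal(size=(X.shape[1], hidden))
    H = np.tanh(X @ W)
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(hidden), H.T @ y)
    return lambda X: np.tanh(X @ W) @ beta

def fit_ensemble(X, y, n_members=10):
    """Tiny 'ANNE' stand-in: bootstrap ensemble of nets, averaged."""
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), len(X))
        members.append(fit_net(X[idx], y[idx]))
    return lambda X: np.mean([m(X) for m in members], axis=0)

# Toy consumption curve: linear trend plus a nonlinear component.
X = rng.uniform(-2, 2, size=(300, 1))
y = 2 * X[:, 0] + np.sin(3 * X[:, 0])
lin = fit_linear(X, y)
resid = fit_ensemble(X, y - lin(X))      # ensemble corrects the residuals
mse_lin = np.mean((lin(X) - y) ** 2)
mse_hyb = np.mean((lin(X) + resid(X) - y) ** 2)
```

The hybrid wins whenever the ensemble captures structure the statistical model leaves in its residuals, which is consistent with the improvement the abstract reports.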
Pushing The Limit of LLM Capacity for Text Classification
The future research value of text classification has been called into question
by the extraordinary efficacy demonstrated by large language models (LLMs)
across numerous downstream NLP tasks. In this era of open-ended language
modeling, where task boundaries are gradually fading, an urgent question
emerges: have we made significant advances in text
classification under the full benefit of LLMs? To answer this question, we
propose RGPT, an adaptive boosting framework tailored to produce a specialized
text classification LLM by recurrently ensembling a pool of strong base
learners. The base learners are constructed by adaptively adjusting the
distribution of training samples and iteratively fine-tuning LLMs with them.
Such base learners are then ensembled to be a specialized text classification
LLM, by recurrently incorporating the historical predictions from the previous
learners. Through a comprehensive empirical comparison, we show that RGPT
significantly outperforms 8 SOTA PLMs and 7 SOTA LLMs on four benchmarks by
1.36% on average. Further evaluation experiments show that RGPT clearly
surpasses human classification.
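The adaptive-boosting loop behind RGPT follows the classic AdaBoost recipe: reweight training samples toward the errors of earlier learners, then combine learners with accuracy-dependent weights. The sketch below uses weighted decision stumps as cheap stand-ins for fine-tuned LLMs; it illustrates the boosting mechanics only, not RGPT itself.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_weighted_stump(X, y, w):
    """Weak learner: threshold rule minimizing weighted error, y in {-1,+1}."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = s * np.where(X[:, j] >= t, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    err, j, t, s = best
    return err, (lambda X, j=j, t=t, s=s: s * np.where(X[:, j] >= t, 1, -1))

def adaboost(X, y, rounds=40):
    """AdaBoost: upweight misclassified samples, weight learners by accuracy."""
    w = np.full(len(X), 1 / len(X))
    ensemble = []
    for _ in range(rounds):
        err, h = fit_weighted_stump(X, y, w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
        w *= np.exp(-alpha * y * h(X))   # boost weight of current mistakes
        w /= w.sum()
        ensemble.append((alpha, h))
    return lambda X: np.sign(sum(a * h(X) for a, h in ensemble))

# Toy task: class is the sign of x0 + x1 (no single stump can solve it).
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
_, stump = fit_weighted_stump(X, y, np.full(200, 1 / 200))
stump_acc = (stump(X) == y).mean()
boost_acc = (adaboost(X, y)(X) == y).mean()
```

RGPT's twist on this template, per the abstract, is that each "weak learner" is an iteratively fine-tuned LLM and later learners condition on the historical predictions of earlier ones.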
Can input reconstruction be used to directly estimate uncertainty of a regression U-Net model? -- Application to proton therapy dose prediction for head and neck cancer patients
Estimating the uncertainty of deep learning models in a reliable and
efficient way has remained an open problem, where many different solutions have
been proposed in the literature. Most common methods are based on Bayesian
approximations, like Monte Carlo dropout (MCDO) or Deep ensembling (DE), but
they have a high inference time (i.e. require multiple inference passes) and
might not work for out-of-distribution (OOD) data detection (i.e. they yield
similar uncertainty for in-distribution (ID) and OOD data). In safety-critical
environments,
like medical applications, accurate and fast uncertainty estimation methods,
able to detect OOD data, are crucial, since wrong predictions can jeopardize
patients' safety. In this study, we present an alternative direct uncertainty
estimation method and apply it to a regression U-Net architecture. The method
consists of adding a branch from the bottleneck that reconstructs the input.
The input reconstruction error can then be used as a surrogate for the model
uncertainty. For the proof-of-concept, our method is applied to proton therapy
dose prediction in head and neck cancer patients. Accuracy, time-gain, and OOD
detection are analyzed for our method in this particular application and
compared with the popular MCDO and DE. The input reconstruction method showed a
higher Pearson correlation coefficient with the prediction error (0.620) than
DE and MCDO (between 0.447 and 0.612). Moreover, our method allows easier
identification of OOD data (Z-score of 34.05). It estimates the uncertainty
simultaneously with the regression task and therefore requires less time and
fewer computational resources.
Comment: 11 pages, 3 figures and 3 tables
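The core idea, reconstruction error from a bottleneck as an uncertainty/OOD score, can be demonstrated with a linear autoencoder (PCA) instead of a U-Net. This is a deliberately simplified analogue of the paper's method, with all names and toy data assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def fit_linear_autoencoder(X, k=2):
    """PCA as a linear autoencoder: the per-sample reconstruction error
    from the k-dimensional bottleneck serves as the uncertainty score."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:k].T                                 # encoder/decoder weights
    def score(Xq):
        Z = (Xq - mu) @ V                        # encode to the bottleneck
        R = Z @ V.T + mu                         # decode (reconstruct input)
        return np.linalg.norm(Xq - R, axis=1)    # reconstruction error
    return score

# ID data lies near a 2-D plane inside 10-D; OOD data is isotropic noise.
A = rng.normal(size=(2, 10))
X_id = rng.normal(size=(500, 2)) @ A + 0.05 * rng.normal(size=(500, 10))
X_ood = rng.normal(size=(100, 10))
score = fit_linear_autoencoder(X_id, k=2)
```

Because the bottleneck is fitted to ID structure only, OOD inputs reconstruct poorly and receive much higher scores, which is the behavior the paper quantifies with its Z-score of 34.05; and since the score falls out of the same forward pass as the main task, no extra inference passes are needed, unlike MCDO or DE.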