634 research outputs found

    Active Learning through Adaptive Heterogeneous Ensembling


    Ensemble deep learning: A review

    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures are showing better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of both deep learning and ensemble learning, so that the final model has better generalization performance. This paper reviews state-of-the-art deep ensemble models and hence serves as an extensive summary for researchers. The ensemble models are broadly categorised into bagging, boosting and stacking; negative-correlation-based deep ensemble models; explicit/implicit ensembles; homogeneous/heterogeneous ensembles; decision fusion strategies; and unsupervised, semi-supervised, reinforcement learning, online/incremental, and multilabel-based deep ensemble models. The application of deep ensemble models in different domains is also briefly discussed. Finally, we conclude the paper with some future recommendations and research directions.
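To make the bagging/stacking categorisation above concrete, here is a minimal, self-contained sketch of a bagging-style deep ensemble with output averaging as the decision-fusion strategy. The architecture, synthetic data, and hyperparameters are illustrative assumptions, not taken from the review.

```python
# Minimal sketch: bagging-style deep ensemble with averaging as decision fusion.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_base_learner(in_dim=20, n_classes=3):
    # A small MLP stands in for any deep base model.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

# Synthetic data purely for illustration.
X = torch.randn(512, 20)
y = (X[:, 0] + X[:, 1] > 0).long() + (X[:, 2] > 1).long()  # 3 pseudo-classes

ensemble = []
for m in range(5):                                   # 5 bootstrap replicas
    idx = torch.randint(0, X.size(0), (X.size(0),))  # sample with replacement
    model = make_base_learner()
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(50):                              # short training loop
        opt.zero_grad()
        loss = loss_fn(model(X[idx]), y[idx])
        loss.backward()
        opt.step()
    ensemble.append(model)

# Decision fusion: average the softmax outputs of all base learners.
with torch.no_grad():
    probs = torch.stack([torch.softmax(m(X), dim=1) for m in ensemble]).mean(0)
    acc = (probs.argmax(1) == y).float().mean()
print(f"bagged-ensemble training accuracy: {acc.item():.3f}")
```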

    PMU measurements based short-term voltage stability assessment of power systems via deep transfer learning

    Deep learning has emerged as an effective solution for addressing the challenges of short-term voltage stability assessment (STVSA) in power systems. However, existing deep learning-based STVSA approaches face limitations in adapting to topological changes, sample labeling, and handling small datasets. To overcome these challenges, this paper proposes a novel phasor measurement unit (PMU) measurements-based STVSA method using deep transfer learning. The method leverages the real-time dynamic information captured by PMUs to create an initial dataset. It employs temporal ensembling for sample labeling and utilizes least squares generative adversarial networks (LSGAN) for data augmentation, enabling effective deep learning on small-scale datasets. Additionally, the method enhances adaptability to topological changes by exploring connections between different faults. Experimental results on the IEEE 39-bus test system demonstrate that the proposed method improves model evaluation accuracy by approximately 20% through transfer learning, exhibiting strong adaptability to topological changes. Leveraging the self-attention mechanism of the Transformer model, this approach offers significant advantages over shallow learning methods and other deep learning-based approaches. Comment: Accepted by IEEE Transactions on Instrumentation & Measurement.
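As a hedged illustration of the transfer-learning step described above (and only that step), the sketch below pre-trains a small classifier on a plentiful "source" dataset and then fine-tunes only its head on a small "target" dataset after a topological change. The STVSAClassifier class, feature sizes, and synthetic data are assumptions; the paper's Transformer, temporal ensembling, and LSGAN components are not reproduced.

```python
# Hedged sketch of the transfer-learning step only.
import torch
import torch.nn as nn

torch.manual_seed(0)

class STVSAClassifier(nn.Module):
    """Toy stand-in: encoder over PMU-derived features + binary stability head."""
    def __init__(self, n_features=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                     nn.Linear(64, 64), nn.ReLU())
        self.head = nn.Linear(64, 2)   # stable / unstable

    def forward(self, x):
        return self.head(self.encoder(x))

def train(model, X, y, epochs=100, lr=1e-3):
    opt = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

# 1) Pre-train on the (synthetic) source topology with plenty of samples.
Xs, ys = torch.randn(2000, 32), torch.randint(0, 2, (2000,))
model = STVSAClassifier()
train(model, Xs, ys)

# 2) Transfer to a changed topology: freeze the encoder and fine-tune
#    the head on a small target dataset.
Xt, yt = torch.randn(100, 32), torch.randint(0, 2, (100,))
for p in model.encoder.parameters():
    p.requires_grad = False
train(model, Xt, yt, epochs=50, lr=1e-4)
```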

    Building lighting energy consumption modelling with hybrid neural-statistic approaches

    "In the proposed work we aim at modelling building lighting energy consumption. We compared several classical methods to the latest Artificial Intelligence. modelling technique: Artificial Neural Networks Ensembling (ANNE). Therefore, in this study we show how we built the ANNE and a new hybrid model based on the. statistical-ANNE combination. Experimentation has been carried out over a three. months data set coming from a real office building located in the ENEA ‘Casaccia’. Research Centre. Experimental results show that the proposed hybrid statistical-ANNE approach can get a remarkable improvement with respect to the best classical method(the statistical one).

    Pushing The Limit of LLM Capacity for Text Classification

    The value of future research on text classification has been thrown into question by the extraordinary efficacy demonstrated by large language models (LLMs) across numerous downstream NLP tasks. In this era of open-ended language modeling, where task boundaries are gradually fading, an urgent question emerges: have we made significant advances in text classification under the full benefit of LLMs? To answer this question, we propose RGPT, an adaptive boosting framework tailored to produce a specialized text classification LLM by recurrently ensembling a pool of strong base learners. The base learners are constructed by adaptively adjusting the distribution of training samples and iteratively fine-tuning LLMs on them. These base learners are then ensembled into a specialized text classification LLM by recurrently incorporating the historical predictions of the previous learners. Through a comprehensive empirical comparison, we show that RGPT significantly outperforms 8 SOTA PLMs and 7 SOTA LLMs on four benchmarks by 1.36% on average. Further evaluation experiments show that RGPT clearly surpasses human classification.
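The adaptive sample-reweighting and ensembling loop that underlies this kind of boosting can be sketched as follows, with a small decision tree standing in for each fine-tuned LLM base learner. The weight-update rule is plain discrete AdaBoost on synthetic data, so this is only an assumption-laden illustration, not RGPT itself.

```python
# Hedged sketch of adaptive boosting with stand-in base learners.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_classes=2, random_state=0)
w = np.full(len(X), 1 / len(X))                    # uniform sample weights
learners, alphas = [], []

for t in range(5):                                 # 5 boosting rounds
    # "Fine-tune" a base learner on the current sample distribution.
    clf = DecisionTreeClassifier(max_depth=2, random_state=t).fit(X, y, sample_weight=w)
    pred = clf.predict(X)
    err = np.average(pred != y, weights=w)
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))
    # Up-weight misclassified samples so the next learner focuses on them.
    w *= np.exp(alpha * (pred != y))
    w /= w.sum()
    learners.append(clf)
    alphas.append(alpha)

# Ensemble the learners' signed votes, weighted by their reliability.
votes = sum(a * np.where(c.predict(X) == 1, 1, -1) for a, c in zip(alphas, learners))
print("boosted training accuracy:", np.mean((votes > 0).astype(int) == y))
```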

    Can input reconstruction be used to directly estimate uncertainty of a regression U-Net model? -- Application to proton therapy dose prediction for head and neck cancer patients

    Estimating the uncertainty of deep learning models in a reliable and efficient way has remained an open problem, and many different solutions have been proposed in the literature. The most common methods are based on Bayesian approximations, like Monte Carlo dropout (MCDO) or deep ensembling (DE), but they have a high inference time (i.e. they require multiple inference passes) and might not work for detecting out-of-distribution (OOD) data (i.e. they yield similar uncertainty for in-distribution (ID) and OOD data). In safety-critical environments, like medical applications, accurate and fast uncertainty estimation methods that are able to detect OOD data are crucial, since wrong predictions can jeopardize patient safety. In this study, we present an alternative direct uncertainty estimation method and apply it to a regression U-Net architecture. The method consists in adding a branch from the bottleneck which reconstructs the input; the input reconstruction error can be used as a surrogate for the model uncertainty. As a proof of concept, our method is applied to proton therapy dose prediction in head and neck cancer patients. Accuracy, time gain, and OOD detection are analyzed for our method in this particular application and compared with the popular MCDO and DE. The input reconstruction method showed a higher Pearson correlation coefficient with the prediction error (0.620) than DE and MCDO (between 0.447 and 0.612). Moreover, our method allows an easier identification of OOD data (Z-score of 34.05). It estimates the uncertainty simultaneously with the regression task and therefore requires less time and fewer computational resources. Comment: 11 pages, 3 figures and 3 tables.
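A minimal sketch of the input-reconstruction idea, assuming a toy fully connected encoder in place of the paper's regression U-Net: a second decoder branch from the bottleneck reconstructs the input, and the per-sample reconstruction error serves as the uncertainty surrogate. Class names, sizes, and the synthetic data are assumptions.

```python
# Sketch: regression model with an input-reconstruction branch for uncertainty.
import torch
import torch.nn as nn

torch.manual_seed(0)

class RegressorWithReconstruction(nn.Module):
    def __init__(self, in_dim=128, bottleneck=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, bottleneck), nn.ReLU())
        self.regressor = nn.Linear(bottleneck, 1)                 # main task (e.g. dose)
        self.reconstructor = nn.Sequential(nn.Linear(bottleneck, 64), nn.ReLU(),
                                           nn.Linear(64, in_dim)) # input reconstruction

    def forward(self, x):
        z = self.encoder(x)
        return self.regressor(z), self.reconstructor(z)

model = RegressorWithReconstruction()
x = torch.randn(256, 128)
y = x.mean(dim=1, keepdim=True)                    # synthetic regression target

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    y_hat, x_hat = model(x)
    # Joint loss: regression task + input reconstruction.
    loss = nn.functional.mse_loss(y_hat, y) + nn.functional.mse_loss(x_hat, x)
    loss.backward()
    opt.step()

# A single forward pass yields both the prediction and its uncertainty surrogate.
with torch.no_grad():
    y_hat, x_hat = model(x)
    uncertainty = ((x_hat - x) ** 2).mean(dim=1)   # per-sample reconstruction error
print(uncertainty[:5])
```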
    • …