381 research outputs found
Scalable Recollections for Continual Lifelong Learning
Given the recent success of Deep Learning applied to a variety of single
tasks, it is natural to consider more human-realistic settings. Perhaps the
most difficult of these settings is that of continual lifelong learning, where
the model must learn online over a continuous stream of non-stationary data. A
successful continual lifelong learning system must have three key capabilities:
it must learn and adapt over time, it must not forget what it has learned, and
it must be efficient in both training time and memory. Recent techniques have
focused their efforts primarily on the first two capabilities while questions
of efficiency remain largely unexplored. In this paper, we consider the problem
of efficient and effective storage of experiences over very large time-frames.
In particular, we consider the case where typical experiences are O(n) bits and
memories are limited to O(k) bits for k << n. We present a novel scalable
architecture and training algorithm in this challenging domain and provide an
extensive evaluation of its performance. Our results show that we can achieve
considerable gains on top of state-of-the-art methods such as GEM.
Comment: AAAI 201
Uncertainty Estimation, Explanation and Reduction with Insufficient Data
Human beings constantly make decisions under uncertainty, trading off swift action against collecting sufficient evidence. A generalized artificial intelligence (GAI) is naturally expected to navigate uncertainty while still predicting precisely. In this thesis, we propose strategies that underpin machine learning under uncertainty from three perspectives: uncertainty estimation, explanation and reduction. Estimation quantifies the variability in the model inputs and outputs, which enables us to evaluate the model's predictive confidence. Explanation provides a tool to interpret the mechanism of uncertainties and to pinpoint the potential for uncertainty reduction, which focuses on stabilizing model training, especially when the data is insufficient. We hope that this thesis can motivate related studies on quantifying predictive uncertainties in deep learning. It also aims to raise awareness among other stakeholders in the fields of smart transportation and automated medical diagnosis, where data insufficiency induces high uncertainty.
The thesis is organized into the following sections: Introduction. We justify the necessity of investigating AI uncertainties and clarify the challenges that exist in the latest studies, followed by our research objective. Literature review. We break down the review of the state-of-the-art methods into uncertainty estimation, explanation and reduction. We also make comparisons with related fields, including meta learning, anomaly detection and continual learning. Uncertainty estimation. We introduce a variational framework, the neural process, which approximates Gaussian processes to handle uncertainty estimation. Two variants from the neural process family are proposed to enhance neural processes with scalability and continual learning. Uncertainty explanation. We inspect the functional distribution of neural processes to discover the global and local factors that affect the degree of predictive uncertainty. Uncertainty reduction. We validate the proposed uncertainty framework on two scenarios: urban irregular behaviour detection and neurological disorder diagnosis, where the intrinsic data insufficiency undermines the performance of existing deep learning models. Conclusion. We provide promising directions for future work and conclude the thesis.
Federated Unlearning: A Survey on Methods, Design Guidelines, and Evaluation Metrics
Federated Learning (FL) enables collaborative training of a Machine Learning
(ML) model across multiple parties, facilitating the preservation of users' and
institutions' privacy by keeping data stored locally. Instead of centralizing
raw data, FL exchanges locally refined model parameters to build a global model
incrementally. While FL is more compliant with emerging regulations such as the
European General Data Protection Regulation (GDPR), how to ensure the right to be
forgotten in this context, that is, allowing FL participants to remove their data
contributions from the learned model, remains unclear. In addition, it is
recognized that malicious clients may inject backdoors into the global model
through updates, e.g., to generate mispredictions on specially crafted data
examples. Consequently, there is a need for mechanisms that can guarantee
individuals the possibility to remove their data and erase malicious
contributions even after aggregation, without compromising the already acquired
"good" knowledge. This highlights the necessity for novel Federated Unlearning
(FU) algorithms, which can efficiently remove specific clients' contributions
without full model retraining. This survey provides background concepts,
empirical evidence, and practical guidelines to design/implement efficient FU
schemes. Our study includes a detailed analysis of the metrics for evaluating
unlearning in FL and presents an in-depth literature review categorizing
state-of-the-art FU contributions under a novel taxonomy. Finally, we outline
the most relevant and still open technical challenges, by identifying the most
promising research directions in the field.
Comment: 23 pages, 8 figures, and 6 tables