ADVERSARY AWARE CONTINUAL LEARNING

Abstract

Continual learning approaches are useful because they help a model learn new information (classes) sequentially while retaining previously acquired information (classes). However, these approaches are adversary agnostic, i.e., they do not consider the possibility of malicious attacks. In this dissertation, we demonstrate that continual learning approaches are extremely vulnerable to adversarial backdoor attacks, in which an intelligent adversary introduces a small amount of misinformation to the model during training, in the form of an imperceptible backdoor pattern, to cause deliberate forgetting of a specific class at test time. We then propose a novel defensive framework to counter such an insidious attack: we turn the attacker’s primary strength (hiding the backdoor pattern by making it imperceptible to humans) against it, and propose to learn a perceptible (stronger) pattern, also during training, that can overpower the attacker’s imperceptible (weaker) pattern. We demonstrate the effectiveness of the proposed defensive mechanism on various commonly used replay-based continual learning algorithms (both generative and exact replay-based) using the CIFAR-10, CIFAR-100, and MNIST benchmark datasets. Most notably, we show that our proposed defensive framework considerably improves the robustness of continual learning algorithms with ZERO knowledge of the attacker’s target task, the attacker’s target class, or the shape, size, and location of the attacker’s pattern. The proposed defensive framework also does not depend on the underlying continual learning algorithm. We term our proposed defensive framework Adversary Aware Continual Learning (AACL).
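
The contrast between an imperceptible (weaker) and a perceptible (stronger) pattern can be illustrated with a toy sketch. This is not the dissertation's implementation: the helper names `stamp_pattern` and `max_pixel_change`, the 8x8 grayscale image, the patch positions, and the strengths 0.02 and 0.5 are all hypothetical values chosen for illustration, and the defensive pattern here is fixed by hand rather than learned during training.

```python
def stamp_pattern(image, top, left, size, strength):
    """Return a copy of `image` with `strength` added to a size x size patch.

    `image` is a 2D list of floats in [0, 1]; results are clipped to [0, 1].
    """
    stamped = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            stamped[r][c] = min(1.0, max(0.0, stamped[r][c] + strength))
    return stamped


def max_pixel_change(original, modified):
    """Largest absolute per-pixel difference between two same-sized images."""
    return max(
        abs(a - b)
        for row_o, row_m in zip(original, modified)
        for a, b in zip(row_o, row_m)
    )


# Toy 8x8 mid-gray image standing in for a training sample (assumption).
clean = [[0.5] * 8 for _ in range(8)]

# Attacker: a faint patch (hypothetical strength 0.02) that a human would
# be unlikely to notice, stamped in the bottom-right corner.
poisoned = stamp_pattern(clean, top=6, left=6, size=2, strength=0.02)

# Defender: a clearly visible patch (hypothetical strength 0.5) stamped
# elsewhere, with no knowledge of where the attacker's patch is.
defended = stamp_pattern(poisoned, top=0, left=0, size=3, strength=0.5)

attacker_change = max_pixel_change(clean, poisoned)
defender_change = max_pixel_change(poisoned, defended)
print(f"attacker max change: {attacker_change:.3f}")
print(f"defender max change: {defender_change:.3f}")
```

In this sketch the defender's patch changes pixel values far more than the attacker's, which is the sense in which a perceptible pattern is "stronger". Whether a model trained on such samples actually learns to favor the defensive pattern is an empirical question that the dissertation addresses; the sketch only shows the difference in perturbation magnitude.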
