
Online Training and Sanitization of AD Systems

Abstract

In this paper, we introduce novel techniques that enhance the training phase of Anomaly Detection (AD) sensors. Our aim is both to improve detection performance and to protect against attacks that target the training dataset. Our approach is two-pronged: first, we employ a novel sanitization method for large training datasets that removes attacks and traffic artifacts by measuring their frequency and position within the dataset; second, we extend the training phase in the spatial dimension to include model information from other collaborating systems. We demonstrate that doing so protects all participating systems against targeted training attacks. Another aspect of our system is its ability to adapt and update the normality model when a shift in the inspected traffic reflects actual changes in the back-end servers. Such "on-line" training appears to be the "Achilles' heel" of AD sensors because they fail to adapt when there is a legitimate deviation in traffic behavior, thereby flooding the operator with false positives. To counter this, we discuss the integration of what we call a shadow sensor with the AD system. This sensor complements our techniques by acting as an oracle that analyzes and classifies the "suspect data" identified by the AD sensor. We show that our techniques can be applied to a wide range of unmodified AD sensors without incurring significant additional computational cost beyond the initial training phase.
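To make the frequency- and position-based sanitization idea concrete, the following is a minimal sketch of one way such a filter could be realized. It assumes a simplified scheme in which the training corpus is split into consecutive windows ("micro-models") and items are kept only if they recur across enough windows; the names `sanitize`, `window_size`, and `vote_threshold` are hypothetical and do not reflect the paper's exact algorithm or parameters.

from collections import defaultdict

def sanitize(training_items, window_size=1000, vote_threshold=0.5):
    """Split the corpus into consecutive windows ("micro-models"),
    count in how many windows each item appears, and keep only items
    seen in a sufficiently large fraction of windows. Rare items that
    cluster in a few positions (e.g. a short-lived attack burst) are
    dropped from the sanitized training set."""
    windows = [training_items[i:i + window_size]
               for i in range(0, len(training_items), window_size)]
    presence = defaultdict(int)
    for window in windows:
        for item in set(window):
            presence[item] += 1

    n_windows = len(windows)
    keep = {item for item, count in presence.items()
            if count / n_windows >= vote_threshold}
    return [item for item in training_items if item in keep]

if __name__ == "__main__":
    # Toy example: a recurring benign request vs. a short-lived "attack" payload.
    corpus = ["GET /index"] * 5000 + ["EXPLOIT-PAYLOAD"] * 10 + ["GET /index"] * 5000
    clean = sanitize(corpus)
    print(len(corpus), len(clean))  # the rare payload is filtered out of the training set

In this sketch, benign traffic that appears consistently throughout the dataset survives the vote, while content confined to a small region of the corpus (such as an attack burst) is excluded before the AD model is trained.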
