BOTection: bot detection by building Markov Chain models of bots network behavior

Authors: Beigi Elaheh Biglar; Dokas Paul; Gu Guofei; Khattak Sheharbano; Nazario J; Onwuzurike Lucky; Pendlebury Feargus; Perdisci Roberto; Rossow Christian; Stinson Elizabeth; Taylor Vincent F; Vormayr Gernot; Zander Sebastian; Zubair Rafique M

Publication date: 1 January 2020
Publisher: Association for Computing Machinery (ACM)
DOI: 10.1145/3320269.3372202

Abstract

This paper was presented at the 15th ACM ASIA Conference on Computer and Communications Security (ACM ASIACCS 2020), 5-9 October 2020, Taipei, Taiwan. This is the accepted manuscript version of the paper. The final version is available online from the Association for Computing Machinery at: https://doi.org/10.1145/3320269.3372202.

Botnets continue to be a threat to organizations, and various machine learning-based botnet detectors have been proposed in response. The capability of such systems to detect new or unseen botnets is crucial to ensuring their robustness against the rapid evolution of botnets. It also prolongs the effectiveness of a system in detecting bots, avoiding frequent and time-consuming classifier retraining. We present BOTection, a privacy-preserving bot detection system that models bot network flow behavior as a Markov Chain. The Markov Chain state transitions capture the bots' network behavior using high-level flow features as states, producing content-agnostic and encryption-resilient behavioral features. These features are used to train a classifier that first detects flows produced by bots and then identifies their bot families. We evaluate our system on a dataset of over 7M malicious flows from 12 botnet families, showing that it detects bots' network traffic with a 99.78% F-measure and classifies that traffic to a malware family with a 99.09% F-measure. Notably, because the Markov Chains model general bot network behavior, BOTection can detect traffic belonging to unseen bot families with an F-measure of 93.03%, making it robust against malware evolution.
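
To make the feature-construction idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each flow in a window has already been mapped to a discrete state, estimates first-order Markov transition probabilities between those states, and flattens the matrix into a vector a classifier could consume. The state labels, the `ALPHABET` list, and the function name `transition_features` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def transition_features(states, alphabet):
    """Flatten a first-order Markov transition matrix into a feature vector.

    states   -- ordered sequence of per-flow state labels (hypothetical)
    alphabet -- fixed list of all states, so every window yields a vector
                of the same length (len(alphabet) ** 2)
    """
    # Count each observed transition between consecutive flows.
    counts = Counter(zip(states, states[1:]))
    features = []
    for src in alphabet:
        # Total transitions leaving `src` into states of the alphabet.
        row_total = sum(counts[(src, dst)] for dst in alphabet)
        for dst in alphabet:
            # Row-normalise counts into probabilities; unseen rows stay 0.
            p = counts[(src, dst)] / row_total if row_total else 0.0
            features.append(p)
    return features


# Hypothetical states: a protocol paired with a connection outcome.
ALPHABET = ["tcp|S0", "tcp|SF", "udp|SF"]
window = ["tcp|S0", "tcp|S0", "tcp|SF", "udp|SF", "tcp|S0"]

features = transition_features(window, ALPHABET)
print(features)
```

Because the vector depends only on how flow-level states follow one another, it never touches packet payloads, which is consistent with the content-agnostic property the abstract describes. Transitions involving labels outside `ALPHABET` are simply ignored in this sketch.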