Holmes: An Efficient and Lightweight Semantic Based Anomalous Email Detector

Guo, Hui; Wu, Peilun; Yan, Fan

Authors: Hui Guo, Peilun Wu, Fan Yan
Publication date: 2 December 2021

Abstract

Email threats are a serious issue for enterprise security and span various malicious scenarios, such as phishing, fraud, blackmail, and malvertising. Traditional anti-spam gateways commonly maintain a greylist to filter out unexpected emails based on suspicious vocabulary in the mail subject and content. However, this signature-based approach cannot effectively discover novel and unknown suspicious emails that exploit current hot topics, such as COVID-19 and the US election. To address this problem, we present Holmes, an efficient and lightweight semantic-based engine for anomalous email detection. Holmes converts each email event log into a sentence through word embedding, then extracts the interesting items among them by novelty detection. Based on our observations, we claim that in an enterprise environment there is a stable relation between senders and receivers, whereas suspicious emails commonly come from unusual sources, which can be detected through rareness selection. We evaluate Holmes in a real-world enterprise environment that sends and receives around 5,000 emails each day. Holmes achieves a high detection rate (outputting around 200 suspicious emails per day) while maintaining a low false alarm rate for anomaly detection.
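The rareness idea in the abstract can be illustrated with a minimal sketch. This is not the Holmes implementation: it replaces word embedding and novelty detection with plain frequency counting over sender-receiver pairs, and the field names (`sender`, `receiver`, `subject`), the function names, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def log_to_sentence(event):
    """Flatten one email event log (a dict) into a list of tokens.

    Hypothetical stand-in for the "event log to sentence" step; the field
    names are assumptions, not taken from the paper.
    """
    return [f"{field}={event[field]}" for field in ("sender", "receiver", "subject")]


def rare_emails(history, incoming, min_count=2):
    """Return incoming events whose sender->receiver pair is rare in history.

    A pair is treated as rare when it appears fewer than `min_count` times
    in the historical logs (an assumed, illustrative threshold).
    """
    pair_counts = Counter((e["sender"], e["receiver"]) for e in history)
    return [
        e for e in incoming
        if pair_counts[(e["sender"], e["receiver"])] < min_count
    ]


# Usage: a stable internal relation versus an unseen external sender.
history = [
    {"sender": "alice@corp.example", "receiver": "bob@corp.example", "subject": "weekly report"}
] * 5
incoming = [
    {"sender": "alice@corp.example", "receiver": "bob@corp.example", "subject": "weekly report"},
    {"sender": "promo@unknown.example", "receiver": "bob@corp.example", "subject": "covid relief"},
]
for event in rare_emails(history, incoming):
    print(log_to_sentence(event))
```

Counting pairs only captures the stable sender-receiver relation described above; it ignores the semantic content that an embedding-based approach would use.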

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2104.08044

Last time updated on 03/06/2021