2 research outputs found

Modelling email traffic workloads with RNN and LSTM models

Authors: Boukoros S., Dixon M., Koutsakis P., McGill T., Nugaliyadde A., Om K., Wong K.W.
Publication venue: SpringerOpen
Publication date: 01/01/2020

Analysis of time series data has been a challenging research subject for decades. Email traffic has recently been modelled as a time series function using a Recurrent Neural Network (RNN), and RNNs were shown to provide higher prediction accuracy than previous probabilistic models from the literature. Given the exponential rise of email workloads which need to be handled by email servers, in this paper we first present and discuss the literature on modelling email traffic. We then explain the advantages and limitations of different approaches as well as their points of agreement and disagreement. Finally, we present a comprehensive comparison between the performance of RNN and Long Short-Term Memory (LSTM) models. Our experimental results demonstrate that both approaches can achieve high accuracy over four large datasets acquired from different universities’ servers, outperforming existing work, and show that the use of LSTM and RNN is very promising for modelling email traffic.

Research Repository

A new highly accurate workload model for campus email traffic

Authors: Boukoros S., Kalampogia A., Koutsakis P.
Publication date: 01/01/2016

E-mail has become a de facto means of communication. Mail servers try to manage the explosive growth of e-mail usage and offer users good quality of service, while spam e-mails are expected to account for 90% of the e-mail traffic. The exceedingly heavy workload can lead to the replacement of existing e-mail servers due to their inability to cope with performance standards and storage capacity. In this study, we focus on modeling the workload of the email servers of a medium-sized Greek university, for all types of traffic (user and system e-mails, as well as spam). We collected a vast amount of e-mail logs with high variations in terms of size and volume over time. We tested some of the most popular distributions for workload characterization and used powerful statistical tests to evaluate our findings. Interestingly, we come to different conclusions in comparison with previous works in the field. Our work indicates that, with the exception of some outliers, campus email traffic can be modeled and predicted quite accurately.

Research Repository