85 research outputs found
Countering internet packet classifiers to improve user online privacy
Internet traffic classification or packet classification is the act of classifying packets using the extracted statistical data from the transmitted packets on a computer network. Internet traffic classification is an essential tool for Internet service providers to manage network traffic, provide users with the intended quality of service (QoS), and perform surveillance. QoS measures prioritize a network\u27s traffic type over other traffic based on preset criteria; for instance, it gives higher priority or bandwidth to video traffic over website browsing traffic. Internet packet classification methods are also used for automated intrusion detection. They analyze incoming traffic patterns and identify malicious packets used for denial of service (DoS) or similar attacks. Internet traffic classification may also be used for website fingerprinting attacks in which an intruder analyzes encrypted traffic of a user to find behavior or usage patterns and infer the user\u27s online activities.
Protecting users\u27 online privacy against traffic classification attacks is the primary motivation of this work. This dissertation shows the effectiveness of machine learning algorithms in identifying user traffic by comparing 11 state-of-art classifiers and proposes three anonymization methods for masking generated user network traffic to counter the Internet packet classifiers. These methods are equalized packet length, equalized packet count, and equalized inter-arrival times of TCP packets. This work compares the results of these anonymization methods to show their effectiveness in reducing machine learning algorithms\u27 performance for traffic classification. The results are validated using newly generated user traffic.
Additionally, a novel model based on a generative adversarial network (GAN) is introduced to automate countering the adversarial traffic classifiers. This model, which is called GAN tunnel, generates pseudo traffic patterns imitating the distributions of the real traffic generated by actual applications and encapsulates the actual network packets into the generated traffic packets. The GAN tunnel\u27s performance is tested against random forest and extreme gradient boosting (XGBoost) traffic classifiers. These classifiers are shown not being able of detecting the actual source application of data exchanged in the GAN tunnel in the tested scenarios in this thesis
Neural Stochastic Differential Equations with Change Points: A Generative Adversarial Approach
Stochastic differential equations (SDEs) have been widely used to model real
world random phenomena. Existing works mainly focus on the case where the time
series is modeled by a single SDE, which might be restrictive for modeling time
series with distributional shift. In this work, we propose a change point
detection algorithm for time series modeled as neural SDEs. Given a time series
dataset, the proposed method jointly learns the unknown change points and the
parameters of distinct neural SDE models corresponding to each change point.
Specifically, the SDEs are learned under the framework of generative
adversarial networks (GANs) and the change points are detected based on the
output of the GAN discriminator in a forward pass. At each step of the proposed
algorithm, the change points and the SDE model parameters are updated in an
alternating fashion. Numerical results on both synthetic and real datasets are
provided to validate the performance of our algorithm in comparison to
classical change point detection benchmarks, standard GAN-based neural SDEs,
and other state-of-the-art deep generative models for time series data.Comment: accepted paper to be published in the proceedings of ICASSP 202
- …