3 research outputs found

    Design and Implementation of Algorithms for Traffic Classification

    Get PDF
    Traffic analysis is the practice of using inherent characteristics of a network flow such as timings, sizes, and orderings of the packets to derive sensitive information about it. Traffic analysis techniques are used because of the extensive adoption of encryption and content-obfuscation mechanisms, making it impossible to infer any information about the flows by analyzing their content. In this thesis, we use traffic analysis to infer sensitive information for different objectives and different applications. Specifically, we investigate various applications: p2p cryptocurrencies, flow correlation, and messaging applications. Our goal is to tailor specific traffic analysis algorithms that best capture network traffic’s intrinsic characteristics in those applications for each of these applications. Also, the objective of traffic analysis is different for each of these applications. Specifically, in Bitcoin, our goal is to evaluate Bitcoin traffic’s resilience to blocking by powerful entities such as governments and ISPs. Bitcoin and similar cryptocurrencies play an important role in electronic commerce and other trust-based distributed systems because of their significant advantage over traditional currencies, including open access to global e-commerce. Therefore, it is essential to the consumers and the industry to have reliable access to their Bitcoin assets. We also examine stepping stone attacks for flow correlation. A stepping stone is a host that an attacker uses to relay her traffic to hide her identity. We introduce two fingerprinting systems, TagIt and FINN. TagIt embeds a secret fingerprint into the flows by moving the packets to specific time intervals. However, FINN utilizes DNNs to embed the fingerprint by changing the inter-packet delays (IPDs) in the flow. In messaging applications, we analyze the WhatsApp messaging service to determine if traffic leaks any sensitive information such as members’ identity in a particular conversation to the adversaries who watch their encrypted traffic. These messaging applications’ privacy is essential because these services provide an environment to dis- cuss politically sensitive subjects, making them a target to government surveillance and censorship in totalitarian countries. We take two technical approaches to design our traffic analysis techniques. The increasing use of DNN-based classifiers inspires our first direction: we train DNN classifiers to perform some specific traffic analysis task. Our second approach is to inspect and model the shape of traffic in the target application and design a statistical classifier for the expected shape of traffic. DNN- based methods are useful when the network is complex, and the traffic’s underlying noise is not linear. Also, these models do not need a meticulous analysis to extract the features. However, deep learning techniques need a vast amount of training data to work well. Therefore, they are not beneficial when there is insufficient data avail- able to train a generalized model. On the other hand, statistical methods have the advantage that they do not have training overhead

    Dynamic Protocol Reverse Engineering a Grammatical Inference Approach

    Get PDF
    Round trip engineering of software from source code and reverse engineering of software from binary files have both been extensively studied and the state-of-practice have documented tools and techniques. Forward engineering of protocols has also been extensively studied and there are firmly established techniques for generating correct protocols. While observation of protocol behavior for performance testing has been studied and techniques established, reverse engineering of protocol control flow from observations of protocol behavior has not received the same level of attention. State-of-practice in reverse engineering the control flow of computer network protocols is comprised of mostly ad hoc approaches. We examine state-of-practice tools and techniques used in three open source projects: Pidgin, Samba, and rdesktop . We examine techniques proposed by computational learning researchers for grammatical inference. We propose to extend the state-of-art by inferring protocol control flow using grammatical inference inspired techniques to reverse engineer automata representations from captured data flows. We present evidence that grammatical inference is applicable to the problem domain under consideration
    corecore