
    Automatic parsing of binary-based application protocols using network traffic

    A method for analyzing a binary-based application protocol of a network. The method includes obtaining conversations from the network, extracting content of a candidate field from a message in each conversation, calculating a randomness measure of the content to represent a level of randomness of the content across all conversations, calculating a correlation measure of the content to represent a level of correlation, across all conversations, between the content and an attribute of the corresponding conversation where the message containing the candidate field is located, and selecting, based on the randomness measure and the correlation measure, and using a pre-determined field selection criterion, the candidate field from a set of candidate fields as a field defined by the protocol.
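The randomness and correlation measures described above can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the use of Shannon entropy as the randomness measure, and the consistency-based stand-in for correlation are all assumptions.

```python
import math
from collections import Counter

def randomness_measure(values):
    """Shannon entropy (bits) of a candidate field's content across conversations.
    High entropy suggests random content (e.g., checksums or session IDs);
    low entropy suggests protocol constants."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def correlation_measure(values, attributes):
    """Fraction of conversations in which the field content consistently
    determines a conversation attribute -- a simple stand-in for a
    correlation statistic."""
    mapping = {}
    consistent = 0
    for v, a in zip(values, attributes):
        mapping.setdefault(v, a)
        if mapping[v] == a:
            consistent += 1
    return consistent / len(values)
```

A field whose content is constant across conversations scores zero entropy; one that is unique per conversation scores maximal entropy, and the selection criterion would weigh both scores.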

    Link homophily in the application layer and its usage in traffic classification

    Abstract—This paper addresses the following questions. Is there link homophily in the application layer traffic? If so, can it be used to accurately classify traffic in network trace data without relying on payloads or properties at the flow level? Our research shows that the answers to both of these questions are affirmative in real network trace data. Specifically, we define link homophily to be the tendency for flows with common IP hosts to have the same application (P2P, Web, etc.) compared to randomly selected flows. The presence of link homophily in trace data provides us with statistical dependencies between flows that share common IP hosts. We utilize these dependencies to classify application layer traffic without relying on payloads or properties at the flow level. In particular, we introduce a new statistical relational learning algorithm, called Neighboring Link Classifier with Relaxation Labeling (NLC+RL). Our algorithm has no training phase and does not require features to be constructed. All that it needs to start the classification process is traffic information on a small portion of the initial flows, which we refer to as seeds. In all our traces, NLC+RL achieves above 90% accuracy with less than 5% seed size; it is robust to errors in the seeds and various seed-selection biases; and it is able to accurately classify challenging traffic such as P2P with over 90% Precision and Recall.
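The relaxation-labeling idea behind NLC+RL can be sketched roughly as follows, assuming flows that share an IP endpoint are neighbors and seed labels are held fixed. The data layout and the averaging update rule here are illustrative, not the paper's exact algorithm.

```python
from collections import defaultdict

def relaxation_label(flows, seeds, labels, iterations=10):
    """Illustrative relaxation labeling over a flow graph.
    flows: dict flow_id -> (src_ip, dst_ip); seeds: dict flow_id -> label.
    Each unseeded flow repeatedly adopts the normalized label distribution
    summed over its neighbors (flows sharing an IP host)."""
    by_ip = defaultdict(set)
    for fid, (src, dst) in flows.items():
        by_ip[src].add(fid)
        by_ip[dst].add(fid)
    neighbors = {fid: (by_ip[src] | by_ip[dst]) - {fid}
                 for fid, (src, dst) in flows.items()}

    # Seeds start one-hot; unlabeled flows start uniform.
    uniform = {l: 1.0 / len(labels) for l in labels}
    dist = {fid: ({l: float(l == seeds[fid]) for l in labels}
                  if fid in seeds else dict(uniform))
            for fid in flows}

    for _ in range(iterations):
        new = {}
        for fid in flows:
            if fid in seeds or not neighbors[fid]:
                new[fid] = dist[fid]
                continue
            agg = {l: sum(dist[n][l] for n in neighbors[fid]) for l in labels}
            total = sum(agg.values()) or 1.0
            new[fid] = {l: v / total for l, v in agg.items()}
        dist = new
    return {fid: max(d, key=d.get) for fid, d in dist.items()}
```

With a small seed set, labels propagate along shared-host links, which mirrors the homophily property the paper measures.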

    Analyzing Network-Wide Interactions Using Graphs: Techniques and Applications

    The fundamental problem that motivates this dissertation is the need for better methods and tools to manage and protect large IP networks. In such networks, it is essential for administrators to profile the traffic generated by different applications (e.g., Web, BitTorrent, FTP) and be able to identify the packets of an application in the wild. This enables administrators to effectively accomplish the following key tasks: (a) Manage the network: It allows different policies to be applied to different applications, e.g., rate limit peer-to-peer (P2P) traffic during busy hours. (b) Protect the network: Profiling malicious traffic requires a strong separation from benign traffic; therefore, knowing the behavior of "good" applications provides better separation from malicious activity. Despite some significant efforts to solve the traffic profiling problem, none of the existing methods address all relevant problems. The difficulty of the problem comes from the following three factors: (a) The intentions of application writers and users to hide their traffic using obfuscation (e.g., payload encryption); (b) The limited information about flows and IP-hosts when traffic is monitored at the Internet backbone; and (c) The continuous appearance of new applications as well as undocumented changes to existing network protocols. In this dissertation, we propose a different way of looking at network traffic that focuses on the network-wide interactions of IP-hosts (as seen at a router). To facilitate the analysis of network-wide interactions, we represent traffic as a graph, where each node is an IP address, and each edge represents a type of interaction between two nodes. We use the term Traffic Dispersion Graph or TDG to refer to such a graph. Intuitively, TDGs capture the "social behavior" of network hosts, which, as we show here, is hard to obfuscate.
For example, a P2P protocol cannot function while trying to hide its overlay network, as maintaining a network overlay is a fundamental behavior of a P2P protocol. This dissertation focuses on three key aspects of network-wide interactions: (a) The graph shapes and structures formed by different applications; (b) The distinctive dynamic network-wide behavior of network applications (i.e., how the graphs change over time); and (c) The identification of communities formed by IP-hosts over the Internet. Using the traffic analysis techniques we propose here, we develop novel traffic profiling solutions that are robust to obfuscation and can operate at the backbone, both of which are very challenging to address with the current state-of-the-art. To evaluate the effectiveness of our methods, we use real-world traffic traces collected from six different networks. This dissertation presents the first work to explore the full capabilities of TDGs for profiling and analyzing traffic. Based on our results, we believe that TDGs can provide the basis for the next generation of traffic monitoring tools.
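The TDG construction described above can be sketched in a few lines. The flow-record format and the port-based edge filter are assumptions for illustration; the dissertation defines edges more generally as a type of interaction between two hosts.

```python
from collections import defaultdict

def build_tdg(flow_records, port=None):
    """Illustrative TDG builder: nodes are IP addresses, and a directed edge
    (src, dst) is added when src sends dst traffic of the chosen type
    (here, filtered by destination port, e.g., 53 for DNS)."""
    edges = set()
    for src, dst, dport in flow_records:
        if port is None or dport == port:
            edges.add((src, dst))
    # Average degree (treating edges as undirected) is one of the simplest
    # structural summaries of a TDG's shape.
    degree = defaultdict(int)
    for s, d in edges:
        degree[s] += 1
        degree[d] += 1
    avg_degree = 2 * len(edges) / len(degree) if degree else 0.0
    return edges, avg_degree
```

Once traffic is in graph form, structural summaries of this kind can be compared across applications to characterize their "social" shapes.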

    Exploiting Dynamicity in Graph-based Traffic Analysis: Techniques and Applications

    Network traffic can be represented by a Traffic Dispersion Graph (TDG) that contains an edge between two nodes that send a particular type of traffic (e.g., DNS) to one another. TDGs have recently been proposed as an alternative way to interpret and visualize network traffic. Previous studies have focused on static properties of TDGs using graph snapshots in isolation. In this work, we represent network traffic with a series of related graph instances that change over time. This representation facilitates the analysis of the dynamic nature of network traffic, providing additional descriptive power. For example, DNS and P2P graph instances can appear similar when compared in isolation, but the way the DNS and P2P TDGs change over time differs significantly. To quantify the changes over time, we introduce a series of novel metrics that capture changes both in the graph structure (e.g., the average degree) and the participants (i.e., IP addresses) of a TDG. We apply our new methodologies to improve graph-based traffic classification and to detect changes in the profile of legacy applications (e.g., e-mail).
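A participant-change metric of the kind described above might look like the following node-churn sketch; the exact metric definitions in the paper may differ.

```python
def node_churn(snapshots):
    """Fraction of participating nodes (IP addresses) that change between
    consecutive TDG snapshots. Intuitively, stable services such as DNS
    resolvers yield low churn, while P2P overlays, whose participants come
    and go, yield high churn."""
    churn = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        union = prev | curr
        churn.append(len(prev ^ curr) / len(union) if union else 0.0)
    return churn
```

Comparing churn time series per application is one way such dynamics could separate graphs that look alike in a single snapshot.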

    Network Monitoring using Traffic Dispersion Graphs (TDGs)

    Monitoring network traffic and detecting unwanted applications has become a challenging problem, since many applications obfuscate their traffic using unregistered port numbers or payload encryption. Apart from some notable exceptions, most traffic monitoring tools use two types of approaches: (a) keeping traffic statistics such as packet sizes and interarrivals, flow counts, byte volumes, etc., or (b) analyzing packet content. In this paper, we propose the use of Traffic Dispersion Graphs (TDGs) as a way to monitor, analyze, and visualize network traffic. TDGs model the social behavior of hosts ("who talks to whom"), where the edges can be defined to represent different interactions (e.g., the exchange of a certain number or type of packets). With the introduction of TDGs, we are able to harness a wealth of tools and graph modeling techniques from a diverse set of disciplines.

    BiToS: enhancing BitTorrent for supporting streaming applications

    Abstract—In recent years, BitTorrent (BT) has been one of the most effective mechanisms for P2P content distribution. Although BT was created for the distribution of time-insensitive content, in this work we try to identify the minimal changes needed in BT's mechanisms to support streaming. The importance of this capability is that a peer can start enjoying the video before the complete download of the video file. This ability is particularly important in highly polluted environments, since the peer can evaluate the quality of the video content early and thus preserve its valuable resources. In a nutshell, our approach gives higher download priority to pieces that are close to being reproduced by the player. This comes in contrast to the original BT protocol, where pieces are downloaded in an out-of-order manner based solely on their rareness. In particular, our approach tries to strike a balance between downloading pieces in: (a) playing order, enabling smooth playback, and (b) rarest-first order, enabling the parallel downloading of pieces. In this work, we introduce three different Piece Selection mechanisms and evaluate them through simulations based on how well they deliver streaming services to the peers.
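The trade-off between playing-order and rarest-first downloading can be sketched as a probabilistic piece selector. The window size, probability parameter, and function names below are illustrative assumptions, not BiToS's exact mechanisms.

```python
import random

def select_piece(missing, rarity, playback_pos, window=8, p=0.8):
    """Illustrative BiToS-style piece selection: with probability p, pick the
    rarest missing piece inside the playback window (pieces about to be
    played); otherwise fall back to plain rarest-first over all missing
    pieces. rarity maps piece index -> number of peers holding it
    (lower = rarer)."""
    in_window = [i for i in missing
                 if playback_pos <= i < playback_pos + window]
    pool = in_window if in_window and random.random() < p else list(missing)
    # Break rarity ties by preferring the earliest piece.
    return min(pool, key=lambda i: (rarity[i], i))
```

Tuning p trades playback smoothness (high p) against swarm-friendly piece diversity (low p), which is the balance the abstract describes.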

    Denial of Service Attacks in Wireless Networks: The case of Jammers

    Abstract—The shared nature of the medium in wireless networks makes it easy for an adversary to launch a Wireless Denial of Service (WDoS) attack. Recent studies demonstrate that such attacks can be very easily accomplished using off-the-shelf equipment. To give a simple example, a malicious node can continually transmit a radio signal in order to block any legitimate access to the medium and/or interfere with reception. This act is called jamming and the malicious nodes are referred to as jammers. Jamming techniques vary from simple ones based on the continual transmission of interference signals, to more sophisticated attacks that aim at exploiting vulnerabilities of the particular protocol used. In this survey, we present a detailed up-to-date discussion on the jamming attacks recorded in the literature. We also describe various techniques proposed for detecting the presence of jammers. Finally, we survey numerous mechanisms which attempt to protect the network from jamming attacks. We conclude with a summary and by suggesting future directions.
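One detection heuristic commonly discussed in this literature combines packet delivery ratio with received signal strength. The sketch below is illustrative only, and the threshold values are made up for the example.

```python
def jammer_suspected(sent, acked, signal_dbm,
                     pdr_threshold=0.3, signal_floor=-75):
    """Consistency check: a low packet delivery ratio (PDR) alone can be
    explained by a weak link, but low PDR combined with a strong received
    signal is inconsistent with normal operation and suggests jamming.
    Thresholds here are illustrative, not calibrated values."""
    pdr = acked / sent if sent else 1.0
    return pdr < pdr_threshold and signal_dbm > signal_floor
```

In practice such checks are combined with other signals (carrier-sensing time, retransmission counts) to reduce false positives.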

    Graph-Based Analysis and Prediction for Software Evolution

    Abstract—We exploit recent advances in analysis of graph topology to better understand software evolution, and to construct predictors that facilitate software development and maintenance. Managing an evolving, collaborative software system is a complex and expensive process, which still cannot ensure software reliability. Emerging techniques in graph mining have revolutionized the modeling of many complex systems and processes. We show how we can use a graph-based characterization of a software system to capture its evolution and facilitate development, by helping us estimate bug severity, prioritize refactoring efforts, and predict defect-prone releases. Our work consists of three main thrusts. First, we construct graphs that capture software structure at two different levels: (a) the product, i.e., source code and module level, and (b) the process, i.e., developer collaboration level. We identify a set of graph metrics that capture interesting properties of these graphs. Second, we study the evolution of eleven open source programs, including Firefox, Eclipse, and MySQL, over the lifespan of the programs, typically a decade or more. Third, we show how our graph metrics can be used to construct predictors for bug severity, high-maintenance software parts, and failure-prone releases. Our work strongly suggests that using graph topology analysis concepts can open many actionable avenues in software engineering research and practice.
    Keywords—Graph science; software evolution; software quality; defect prediction; productivity metrics; empirical studies
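The product-level graphs described above can be sketched as a module-dependency graph with simple degree metrics. The edge format and the choice of metrics are illustrative assumptions, not the paper's full metric set.

```python
from collections import defaultdict

def module_graph_metrics(call_edges):
    """Illustrative sketch: build a directed module-dependency graph from
    (caller, callee) pairs and compute per-node degree metrics of the kind
    used to flag defect-prone, high-maintenance modules."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    nodes = set()
    for caller, callee in call_edges:
        out_deg[caller] += 1
        in_deg[callee] += 1
        nodes.update((caller, callee))
    # A module many others depend on (high in-degree) is a candidate for
    # extra testing; high out-degree can hint at low cohesion.
    return {n: {"in": in_deg[n], "out": out_deg[n]} for n in nodes}
```

Tracking how these per-module metrics drift across releases is the kind of signal the abstract's predictors build on.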