2,230 research outputs found

    No NAT'd User left Behind: Fingerprinting Users behind NAT from NetFlow Records alone

    It is generally recognized that the traffic generated by an individual connected to a network acts as a biometric signature of that individual. Several tools exploit this fact to fingerprint and monitor users. Often, though, these tools assume access to the entire traffic, including IP addresses and payloads. This is not feasible, because both performance and privacy would be negatively affected. In reality, most ISPs convert user traffic into NetFlow records, a concise representation that does not include, for instance, any payloads. More importantly, large and distributed networks are usually NAT'd, so a few IP addresses may be associated with thousands of users. We devised a new fingerprinting framework that overcomes these hurdles. Our system is able to analyze a huge amount of network traffic represented as NetFlows, with the intent to track people. It does so by accurately inferring when users are connected to the network and which IP addresses they are using, even though thousands of users are hidden behind NAT. Our prototype implementation was deployed and tested within an existing large metropolitan WiFi network serving about 200,000 users, with an average load of more than 1,000 users simultaneously connected behind only 2 NAT'd IP addresses. Our solution turned out to be very effective, with an accuracy greater than 90%. We also devised new tools and refined existing ones that may be applied to other contexts related to NetFlow analysis.

    Increasing Performances of TCP Data Transfers Through Multiple Parallel Connections

    Although Transmission Control Protocol (TCP) is a widely deployed and successful protocol, it shows some limitations in present-day environments. In particular, it is unable to exploit multiple (physical or logical) paths between two hosts. This paper presents PATTHEL, a session-layer solution designed for parallelizing stream data transfers. Parallelization is achieved by striping the data flow among multiple TCP channels. This solution does not require invasive changes to the networking stack and can be implemented entirely in user space. Moreover, it is flexible enough to suit several scenarios; for example, it can be used to split a data transfer among multiple relays within a peer-to-peer overlay network.
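
    The striping idea in this abstract can be illustrated with a minimal in-memory sketch. It is not PATTHEL's actual protocol: the function names `stripe` and `reassemble`, the fixed chunk size, the round-robin assignment, and the per-chunk sequence numbers are all assumptions made for illustration, and plain lists stand in for TCP channels.

```python
from typing import List, Tuple

# A chunk is (sequence number, payload). Sequence numbers are an assumption
# of this sketch; they let the receiver restore the original order.
Chunk = Tuple[int, bytes]


def stripe(data: bytes, n_channels: int, chunk_size: int = 4) -> List[List[Chunk]]:
    """Cut data into fixed-size chunks and deal them round-robin to channels."""
    if n_channels < 1 or chunk_size < 1:
        raise ValueError("n_channels and chunk_size must be positive")
    channels: List[List[Chunk]] = [[] for _ in range(n_channels)]
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        channels[seq % n_channels].append((seq, data[offset:offset + chunk_size]))
    return channels


def reassemble(channels: List[List[Chunk]]) -> bytes:
    """Merge chunks from all channels and rebuild the stream by sequence number."""
    chunks = sorted((c for ch in channels for c in ch), key=lambda c: c[0])
    return b"".join(payload for _, payload in chunks)


if __name__ == "__main__":
    message = b"abcdefghijklmnopqrstuvwxyz"
    striped = stripe(message, n_channels=3)
    # The channels may be drained in any order; sequence numbers fix the order.
    assert reassemble(list(reversed(striped))) == message
```

    In a real transfer each list would be a separate TCP connection, possibly routed over a different path or relay, and the receiver would need buffering to cope with channels that progress at different speeds.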

    Analysis of Web Protocols Evolution on Internet Traffic

    This research focuses on the analysis of ten years of Internet traffic, from 2004 until 2013, captured and measured by Mawi Lab on a fiber-optic link connecting Japan to the United States of America. The collected traffic was analysed for each of the days in that period, and also jointly over the whole period. The initial research questions included testing the hypothesis that changes in Internet applications and Internet usage patterns are observable in the generated traffic. Several protocols were thoroughly analysed, including HTTP, HTTPS, TCP, UDP, IPv4, IPv6, SMTP and DNS. The effect of the transition from IPv4 to IPv6 was also analysed. Conclusions were drawn, the research questions were answered, and the research hypothesis was confirmed.

    NetSentry: A deep learning approach to detecting incipient large-scale network attacks

    Machine Learning (ML) techniques are increasingly adopted to tackle ever-evolving high-profile network attacks, including DDoS, botnet, and ransomware, due to their unique ability to extract complex patterns hidden in data streams. These approaches are, however, routinely validated with data collected in the same environment, and their performance degrades when deployed in different network topologies and/or applied to previously unseen traffic, as we uncover. This suggests that malicious/benign behaviors are largely learned superficially and that ML-based Network Intrusion Detection Systems (NIDS) need revisiting to be effective in practice. In this paper we dive into the mechanics of large-scale network attacks, with a view to understanding how to use ML for Network Intrusion Detection (NID) in a principled way. We reveal that, although cyberattacks vary significantly in terms of payloads, vectors and targets, their early stages, which are critical to successful attack outcomes, share many similarities and exhibit important temporal correlations. Therefore, we treat NID as a time-sensitive task and propose NetSentry, perhaps the first NIDS of its kind that builds on Bidirectional Asymmetric LSTM (Bi-ALSTM), an original ensemble of sequential neural models, to detect network threats before they spread. We cross-evaluate NetSentry using two practical datasets, training on one and testing on the other, and demonstrate F1 score gains above 33% over the state-of-the-art, as well as up to 3 times higher rates of detecting attacks such as XSS and web bruteforce. Further, we put forward a novel data augmentation technique that boosts the generalization abilities of a broad range of supervised deep learning algorithms, leading to average F1 score gains above 35%.

    Cross-tier application and data partitioning of web applications for hybrid cloud deployment

    Hybrid cloud deployment offers flexibility in trade-offs between the cost-savings/scalability of the public cloud and control over data resources provided at private premises. However, this flexibility comes at the expense of complexity in distributing a system over these two locations. For multi-tier web applications, this challenge manifests itself primarily in the partitioning of application- and database-tiers. While there is existing research that focuses on either application-tier or data-tier partitioning, we show that optimized partitioning of web applications benefits from both tiers being considered simultaneously. We present our research on a new cross-tier partitioning approach to help developers make effective trade-offs between performance and cost in a hybrid cloud deployment. In two case studies the approach results in up to 54% reduction in monetary costs compared to a premises-only deployment and 56% improvement in execution time compared to a naïve partitioning where the application-tier is deployed in the cloud and the data-tier is on private infrastructure.

    KISS: Stochastic Packet Inspection Classifier for UDP Traffic

    This paper proposes KISS, a novel Internet classification engine. Motivated by the expected rise of UDP traffic, which stems from the momentum of Peer-to-Peer (P2P) streaming applications, we propose a novel classification framework that leverages statistical characterization of payload. Statistical signatures are derived by means of a Chi-Square-like test, which extracts the protocol "format" but ignores the protocol "semantic" and "synchronization" rules. The signatures feed a decision process based either on the geometric distance among samples, or on Support Vector Machines. KISS is very accurate, and its signatures are intrinsically robust to packet sampling, reordering, and flow asymmetry, so that it can be used on almost any network. KISS is tested in different scenarios, considering traditional client-server protocols, VoIP, and both traditional and new P2P Internet applications. Results are astonishing. The average True Positive percentage is 99.6%, with the worst case equal to 98.1%, while results are almost perfect when dealing with new P2P streaming applications.
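
    A Chi-Square-like payload signature of the kind the abstract mentions might be sketched as follows. The abstract does not give the details, so several choices here are assumptions for illustration only: the payload prefix is cut into 4-bit groups, each group's observed values across the packets of a flow are compared with a uniform distribution, and the function name `chi_square_signature` is hypothetical.

```python
from typing import List, Sequence


def chi_square_signature(payloads: Sequence[bytes], n_bytes: int = 4) -> List[float]:
    """Return one chi-square statistic per 4-bit group of the payload prefix.

    For every 4-bit group in the first n_bytes of each payload, count how often
    each of the 16 possible values occurs across packets and compare the counts
    with a uniform reference (an assumption of this sketch). Values near zero
    suggest random-looking bits; large values suggest constant or structured
    fields.
    """
    usable = [p for p in payloads if len(p) >= n_bytes]
    if not usable:
        raise ValueError("no payload is long enough")
    expected = len(usable) / 16  # expected count per value under uniformity
    signature: List[float] = []
    for byte_idx in range(n_bytes):
        for shift in (4, 0):  # high nibble first, then low nibble
            counts = [0] * 16
            for p in usable:
                counts[(p[byte_idx] >> shift) & 0xF] += 1
            signature.append(sum((o - expected) ** 2 / expected for o in counts))
    return signature


if __name__ == "__main__":
    # Hypothetical flow: a constant first byte followed by a counter byte.
    flow = [bytes([0x12, i]) for i in range(16)]
    print(chi_square_signature(flow, n_bytes=2))
```

    The resulting vector could then be fed to a distance-based or SVM classifier, as the abstract describes; that step is outside this sketch.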