67 research outputs found
Identificação de aplicações de vídeo em canais protegidos com aprendizagem automática (Identification of video applications in protected channels using machine learning)
As encrypted traffic becomes the standard and traffic obfuscation techniques become more accessible and common, companies are struggling to enforce their network usage policies and ensure optimal operational network performance. Users are more technologically knowledgeable and can circumvent web content filtering tools by using protected tunnels such as VPNs. Consequently, techniques such as DPI, which were already considered outdated due to their impracticality, become even more ineffective. Furthermore, the regulations that governments and international unions continue to establish regarding citizen privacy rights make network monitoring increasingly challenging. This work presents a scalable and easily deployable network-based framework for application identification in a corporate environment, focusing on video applications. This framework should be effective regardless of the environment and network setup, with the objective of being a useful tool in the network monitoring process. The proposed framework offers a compromise between allowing network supervision and assuring workers' privacy. The evaluation indicates that web services running over a protected channel can be identified with an accuracy of 95%, using low-level packet information that does not jeopardize sensitive worker data.
Master's in Computer and Telematics Engineering
Network communication privacy: traffic masking against traffic analysis
An increasing number of recent experimental works have demonstrated that supposedly secure channels in the Internet are prone to privacy breaches in many respects, because traffic features leak information about user activity and traffic content. For example, traffic flow classification at application level, web page identification, and language/phrase detection in VoIP communications have all been successfully demonstrated against encrypted channels. In this thesis I aim at understanding whether, and at what complexity, the information leaked by traffic features (namely packet lengths, directions, and times) can be obfuscated. I define a security model that points out what the ideal target of masking is, and then define optimized and practically implementable masking algorithms, yielding a trade-off between privacy and the overhead/complexity of the masking algorithm. Numerical results are based on measured Internet traffic traces. Major findings are that: i) optimized full masking achieves similar overhead values with padding only and when fragmentation is allowed; ii) if practical realizability is accounted for, optimized statistical masking algorithms attain only moderately better overhead than simple fixed-pattern masking algorithms, while still leaking correlation information that can be exploited by the adversary.
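As a rough illustration of the privacy/overhead trade-off mentioned in the abstract, the sketch below pads every packet to one fixed length and reports the relative byte overhead. It is a minimal example of padding-only fixed-pattern masking, not the thesis's optimized algorithms; the function names and the default `block` size of 1500 bytes are assumptions made for illustration.

```python
from typing import List


def pad_to_fixed(lengths: List[int], block: int = 1500) -> List[int]:
    """Mask packet lengths by padding every packet up to `block` bytes.

    `block` is a hypothetical fixed pattern length. With padding only,
    a packet longer than `block` cannot be masked (that would need
    fragmentation), so it is rejected here.
    """
    if any(n > block for n in lengths):
        raise ValueError("packet longer than block; fragmentation would be needed")
    return [block for _ in lengths]


def overhead(original: List[int], masked: List[int]) -> float:
    """Relative byte overhead: extra bytes sent divided by original bytes."""
    return (sum(masked) - sum(original)) / sum(original)


if __name__ == "__main__":
    trace = [100, 500, 1500]          # made-up packet lengths in bytes
    masked = pad_to_fixed(trace)
    print(masked, overhead(trace, masked))
```

After masking, all packets have the same length, so length alone no longer distinguishes flows; the cost is the extra bytes counted by `overhead`.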
Design and Implementation of Algorithms for Traffic Classification
Traffic analysis is the practice of using inherent characteristics of a network flow, such as the timings, sizes, and orderings of its packets, to derive sensitive information about it. Traffic analysis techniques are used because the extensive adoption of encryption and content-obfuscation mechanisms makes it impossible to infer any information about flows by analyzing their content. In this thesis, we use traffic analysis to infer sensitive information for different objectives and different applications. Specifically, we investigate P2P cryptocurrencies, flow correlation, and messaging applications. Our goal is to tailor, for each of these applications, traffic analysis algorithms that best capture the intrinsic characteristics of its network traffic. The objective of traffic analysis also differs for each application. In Bitcoin, our goal is to evaluate the resilience of Bitcoin traffic to blocking by powerful entities such as governments and ISPs. Bitcoin and similar cryptocurrencies play an important role in electronic commerce and other trust-based distributed systems because of their significant advantages over traditional currencies, including open access to global e-commerce. Therefore, it is essential for consumers and the industry to have reliable access to their Bitcoin assets. We also examine stepping stone attacks for flow correlation. A stepping stone is a host that an attacker uses to relay her traffic to hide her identity. We introduce two fingerprinting systems, TagIt and FINN. TagIt embeds a secret fingerprint into the flows by moving the packets to specific time intervals. FINN, by contrast, uses DNNs to embed the fingerprint by changing the inter-packet delays (IPDs) in the flow. In messaging applications, we analyze the WhatsApp messaging service to determine whether its traffic leaks sensitive information, such as the identity of members in a particular conversation, to adversaries who watch the encrypted traffic. The privacy of these messaging applications is essential because they provide an environment to discuss politically sensitive subjects, making them a target of government surveillance and censorship in totalitarian countries. We take two technical approaches to design our traffic analysis techniques. The increasing use of DNN-based classifiers inspires our first direction: we train DNN classifiers to perform a specific traffic analysis task. Our second approach is to inspect and model the shape of traffic in the target application and design a statistical classifier for the expected shape of traffic. DNN-based methods are useful when the network is complex and the traffic's underlying noise is not linear. These models also do not need meticulous analysis to extract features. However, deep learning techniques need a vast amount of training data to work well, so they are not beneficial when there is insufficient data available to train a generalized model. Statistical methods, on the other hand, have the advantage that they do not have training overhead.
Addressing Insider Threats from Smart Devices
Smart devices have unique security challenges and are becoming increasingly common. They have been used in the past to launch cyber attacks such as the Mirai attack. This work is focused on solving the threats posed to and by smart devices inside a network. The size of the problem is quantified; the initial compromise is prevented where possible, and compromised devices are identified.
To gain insight into the size of the problem, campus Domain Name System (DNS) measurements were taken that allow for wireless traffic to be separated from wired traffic. Two-thirds of the DNS traffic measured came from wireless hosts, implying that mobile devices are playing a bigger role in networks. Also, port scans and service discovery protocols were used to identify Internet of Things (IoT) devices on the campus network and follow-up work was done to assess the state of the IoT devices.
Motivated by these findings, three solutions were developed. To handle the scenario in which compromised mobile devices are connected to the network, a new strategy for stepping-stone detection was developed with both an application layer and a transport layer solution. The proposed solution is effective even when the mobile device's cellular connection is used. Also, malicious or vulnerable applications make it through the mobile app store vetting process. A user space tool was developed that identifies apps contacting malicious domains in real time and collects data for research purposes. Malicious app behavior can then be identified on the user's device, catching malicious apps that were overlooked by software vetting. Last, the variety of IoT device types and manufacturers makes the job of keeping them secure difficult. A generic framework was developed to lighten the management burden of securing IoT devices, serve as a middle box to secure legacy devices, and use DNS queries as a way to identify misbehaving devices.
Practical Analysis of Encrypted Network Traffic
The growing use of encryption in network communications is an undoubted boon for user privacy. However, the limitations of real-world encryption schemes are still not well understood, and new side-channel attacks against encrypted communications are disclosed every year. Furthermore, encrypted network communications, by preventing inspection of packet contents, represent a significant challenge from a network security perspective: our existing infrastructure relies on such inspection for threat detection. Both problems are exacerbated by the increasing prevalence of encrypted traffic: recent estimates suggest that 65% or more of downstream Internet traffic will be encrypted by the end of 2016. This work addresses these problems by expanding our understanding of the properties and characteristics of encrypted network traffic and exploring new, specialized techniques for the handling of encrypted traffic by network monitoring systems. We first demonstrate that opaque traffic, of which encrypted traffic is a subset, can be identified in real time, and show how this ability can be leveraged to improve the capabilities of existing intrusion detection systems (IDS). To do so, we evaluate and compare multiple methods for rapid identification of opaque packets, ultimately pinpointing a simple hypothesis test (which can be implemented on an FPGA) as an efficient and effective detector of such traffic. In our experiments, using this technique to "winnow", or filter, opaque packets from the traffic load presented to an IDS significantly increased the throughput of the system, allowing the identification of many more potential threats than the same system without winnowing. Second, we show that side channels in encrypted VoIP traffic enable the reconstruction of approximate transcripts of conversations. Our approach leverages techniques from linguistics, machine learning, natural language processing, and machine translation to accomplish this task despite the limited information leaked by such side channels. Our ability to do so underscores both the potential threat to user privacy that such side channels represent and the degree to which this threat has been underestimated. Finally, we propose and demonstrate the effectiveness of a new paradigm for identifying HTTP resources retrieved over encrypted connections. Our experiments demonstrate how the predominant paradigm from prior work fails to accurately represent real-world situations and how our proposed approach offers significant advantages in comparison, including the ability to infer partial information. We believe these results represent both an enhanced threat to user privacy and an opportunity for network monitors and analysts to improve their own capabilities with respect to encrypted traffic.
Doctor of Philosophy
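The abstract does not spell out the hypothesis test used to detect opaque packets, so the sketch below substitutes a simpler, commonly used heuristic: the Shannon entropy of the payload bytes compared against a threshold. The function names and the threshold of 7.0 bits per byte are assumptions for illustration only, not the thesis's detector.

```python
import math
from collections import Counter


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the payload in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_opaque(payload: bytes, threshold: float = 7.0) -> bool:
    """Flag a payload as opaque (encrypted or compressed) when its byte
    entropy reaches `threshold`. The threshold is a hypothetical value;
    short payloads cannot reach high entropy, so they are never flagged.
    """
    return byte_entropy(payload) >= threshold


if __name__ == "__main__":
    print(looks_opaque(b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n"))
    print(looks_opaque(bytes(range(256)) * 4))
```

A monitor could use such a check to skip deep inspection of packets that look opaque, which is the "winnowing" idea described above; a real deployment would need a test that also behaves well on short payloads.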
Network Traffic Analysis Using Stochastic Grammars
Network traffic analysis is widely used to infer information from Internet traffic. This is possible even if the traffic is encrypted. Previous work uses traffic characteristics, such as port numbers, packet sizes, and frequency, without looking for more subtle patterns in the network traffic. In this work, we use stochastic grammars, namely hidden Markov models (HMMs) and probabilistic context-free grammars (PCFGs), as pattern recognition tools for traffic analysis. HMMs are widely used for pattern recognition and detection. We use an HMM inference approach. With inferred HMMs, we use confidence intervals (CIs) to detect whether a data sequence matches the HMM. To compare HMMs, we define a normalized Markov metric. A statistical test is used to determine model equivalence. Our metric systematically removes the least likely events from both HMMs until the remaining models are statistically equivalent. This defines the distance between models. We extend the use of HMMs to PCFGs, which have more expressive power. We estimate PCFG production probabilities from data. A statistical test is used for detection. We present three applications of HMM and PCFG detection to network traffic analysis. First, we infer the presence of protocol tunneling through the Tor (the onion router) anonymization network. The Markov metric quantifies the similarity of network traffic HMMs in Tor to identify the protocol. It also measures communication noise in the Tor network. Second, we use HMMs to detect centralized botnet traffic. We infer HMMs from botnet traffic data and detect botnet infections. Experimental results show that HMMs can accurately detect Zeus botnet traffic. Third, to hide their locations better, newer botnets have P2P control structures. Hierarchical P2P botnets contain recursive and hierarchical patterns. We use PCFGs to detect P2P botnet traffic. Experiments on real-world traffic data show that PCFGs can accurately differentiate between P2P botnet traffic and normal Internet traffic.
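To make the idea of matching a data sequence against an HMM concrete, the sketch below scores an observation sequence with the standard scaled forward algorithm. This is not the confidence-interval test or the Markov metric described in the abstract; it is a simpler likelihood score, and the model parameters and symbol names ("small", "large") are made up for illustration.

```python
import math
from typing import Dict, Sequence

Dist = Dict[str, float]


def log_likelihood(obs: Sequence[str],
                   start: Dist,
                   trans: Dict[str, Dist],
                   emit: Dict[str, Dist]) -> float:
    """Log-probability of `obs` under a discrete HMM (scaled forward pass).

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability that state s emits symbol o (missing = 0)
    """
    if not obs:
        return 0.0
    states = list(start)
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = {
                s: emit[s].get(obs[t], 0.0)
                * sum(alpha[r] * trans[r].get(s, 0.0) for r in states)
                for s in states
            }
        scale = sum(alpha.values())
        if scale == 0.0:
            return float("-inf")      # sequence is impossible under the model
        loglik += math.log(scale)
        alpha = {s: a / scale for s, a in alpha.items()}  # avoid underflow
    return loglik


if __name__ == "__main__":
    # Hypothetical two-state model over coarse packet-size symbols.
    start = {"idle": 0.6, "burst": 0.4}
    trans = {"idle": {"idle": 0.7, "burst": 0.3},
             "burst": {"idle": 0.2, "burst": 0.8}}
    emit = {"idle": {"small": 0.9, "large": 0.1},
            "burst": {"small": 0.2, "large": 0.8}}
    print(log_likelihood(["small", "large", "large", "small"], start, trans, emit))
```

A detector built on this score would compare the per-symbol log-likelihood of observed traffic against values seen on known-matching traffic; the work above uses confidence intervals for that decision instead.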
Towards the Deployment of Machine Learning Solutions in Network Traffic Classification: A Systematic Survey
Traffic analysis is a set of strategies intended to find relationships, patterns, anomalies, and misconfigurations, among other things, in Internet traffic. In particular, traffic classification is a subgroup of strategies in this field that aims at identifying the application name or type of Internet traffic. Traffic classification has become a challenging task due to the rise of new technologies, such as traffic encryption and encapsulation, which decrease the performance of classical traffic classification strategies. Machine Learning is gaining interest as a new direction in this field, showing signs of future success such as knowledge extraction from encrypted traffic and more accurate Quality of Service management. Machine Learning is fast becoming a key tool for building traffic classification solutions in real network traffic scenarios; the purpose of this investigation is therefore to explore the elements that allow this technique to work in the traffic classification field. To that end, a systematic review is introduced based on the steps needed to achieve traffic classification using Machine Learning techniques. The main aim is to understand and identify the procedures followed by existing works to achieve their goals. As a result, this survey identifies a set of trends derived from the analysis of this domain, with which the authors expect to outline future directions for Machine Learning based traffic classification.
- …