38 research outputs found
Deteção de atividades ilícitas de software Bots através do DNS
DNS is a critical component of the Internet where almost all Internet applications
and organizations rely on. Its shutdown can deprive them from being part of the
Internet, and hence, DNS is usually the only protocol to be allowed when Internet
access is firewalled. The constant exposure of this protocol to external entities force
corporations to always be observant of external rogue software that may misuse
the DNS to establish covert channels and perform multiple illicit activities, such as
command and control and data exfiltration.
Most current solutions for bot malware and botnet detection are based on Deep
Packet Inspection techniques, such as analyzing DNS query payloads, which may
reveal private and sensitive information. In addiction, the majority of existing solutions
do not consider the usage of licit and encrypted DNS traffic, where Deep
Packet Inspection techniques are impossible to be used.
This dissertation proposes mechanisms to detect malware bots and botnet behaviors
on DNS traffic that are robust to encrypted DNS traffic and that ensure the
privacy of the involved entities by analyzing instead the behavioral patterns of DNS
communications using descriptive statistics over collected network metrics such as
packet rates, packet lengths, and silence and activity periods. After characterizing
DNS traffic behaviors, a study of the processed data is conducted, followed by the
training of Novelty Detection algorithms with the processed data.
Models are trained with licit data gathered from multiple licit activities, such as
reading the news, studying, and using social networks, in multiple operating systems,
browsers, and configurations. Then, the models were tested with similar
data, but containing bot malware traffic. Our tests show that our best performing
models achieve detection rates in the order of 99%, and 92% for malware bots
using low throughput rates.
This work ends with some ideas for a more realistic generation of bot malware
traffic, as the current DNS Tunneling tools are limited when mimicking licit DNS
usages, and for a better detection of malware bots that use low throughput rates.O DNS é um componente crítico da Internet, já que quase todas as aplicações
e organizações que a usam dependem dele para funcionar. A sua privação pode
deixá-las de fazerem parte da Internet, e por causa disso, o DNS é normalmente
o único protocolo permitido quando o acesso à Internet está restrito. A exposição
constante deste protocolo a entidades externas obrigam corporações a estarem
sempre atentas a software externo ilícito que pode fazer uso indevido do DNS para
estabelecer canais secretos e realizar várias atividades ilícitas, como comando e
controlo e exfiltração de dados.
A maioria das soluções atuais para detecção de malware bots e de botnets são
baseadas em técnicas inspeção profunda de pacotes, como analizar payloads de
pedidos de DNS, que podem revelar informação privada e sensitiva. Além disso,
a maioria das soluções existentes não consideram o uso lícito e cifrado de tráfego
DNS, onde técnicas como inspeção profunda de pacotes são impossíveis de serem
usadas.
Esta dissertação propõe mecanismos para detectar comportamentos de malware
bots e botnets que usam o DNS, que são robustos ao tráfego DNS cifrado e
que garantem a privacidade das entidades envolvidas ao analizar, em vez disso,
os padrões comportamentais das comunicações DNS usando estatística descritiva
em métricas recolhidas na rede, como taxas de pacotes, o tamanho dos pacotes,
e os tempos de atividade e silêncio. Após a caracterização dos comportamentos
do tráfego DNS, um estudo sobre os dados processados é realizado, sendo depois
usados para treinar os modelos de Detecção de Novidades.
Os modelos são treinados com dados lícitos recolhidos de multiplas atividades
lícitas, como ler as notícias, estudar, e usar redes sociais, em multiplos sistemas
operativos e com multiplas configurações. De seguida, os modelos são testados
com dados lícitos semelhantes, mas contendo também tráfego de malware bots.
Os nossos testes mostram que com modelos de Detecção de Novidades é possível
obter taxas de detecção na ordem dos 99%, e de 98% para malware bots que geram
pouco tráfego.
Este trabalho finaliza com algumas ideas para uma geração de tráfego ilícito mais
realista, já que as ferramentas atuais de DNS tunneling são limitadas quando
usadas para imitar usos de DNS lícito, e para uma melhor deteção de situações
onde malware bots geram pouco tráfego.Mestrado em Engenharia de Computadores e Telemátic
Network communication privacy: traffic masking against traffic analysis
An increasing number of recent experimental works have been demonstrating the supposedly secure channels in the Internet are prone to privacy breaking under many respects, due to traffic features leaking information on the user activity and traffic content. As a matter of example, traffic flow classification at application level, web page identification, language/phrase detection in VoIP communications have all been successfully demonstrated against encrypted channels. In this thesis I aim at understanding if and how complex it is to obfuscate the information leaked by traffic features, namely packet lengths, direction, times. I define a security model that points out what the ideal target of masking is, and then define the optimized and practically implementable masking algorithms, yielding a trade-off between privacy and overhead/complexity of the masking algorithm. Numerical results are based on measured Internet traffic traces. Major findings are that: i) optimized full masking achieves similar overhead values with padding only and in case fragmentation is allowed; ii) if practical realizability is accounted for, optimized statistical masking algorithms attain only moderately better overhead than simple fixed pattern masking algorithms, while still leaking correlation information that can be exploited by the adversary
A Review on Features’ Robustness in High Diversity Mobile Traffic Classifications
Mobile traffics are becoming more dominant due to growing usage of mobile devices and proliferation of IoT. The influx of mobile traffics introduce some new challenges in traffic classifications; namely the diversity complexity and behavioral dynamism complexity. Existing traffic classifications methods are designed for classifying standard protocols and user applications with more deterministic behaviors in small diversity. Currently, flow statistics, payload signature and heuristic traffic attributes are some of the most effective features used to discriminate traffic classes. In this paper, we investigate the correlations of these features to the less-deterministic user application traffic classes based on corresponding classification accuracy. Then, we evaluate the impact of large-scale classification on feature's robustness based on sign of diminishing accuracy. Our experimental results consolidate the needs for unsupervised feature learning to address the dynamism of mobile application behavioral traits for accurate classification on rapidly growing mobile traffics
TIE: A Community-Oriented Traffic Classification Platform
Abstract — During the last years the research on network traffic classification has become very active. The research community, moved by increasing difficulties in the automated identification of network traffic and by concerns related to user privacy, started to investigate and propose classification approaches alternative to port-based and payload-based techniques. Despite the large quantity of works published in the past few years on this topic, very few implementations targeting alternative approaches were made available to the community. Moreover, most approaches proposed in literature suffer of problems related to the ability of evaluating and comparing them. In this paper we present a novel community-oriented software for traffic classification called TIE, which aims at becoming a common tool for the fair evaluation and comparison of different techniques and at fostering the sharing of common implementations and data. Moreover, TIE supports the combi-nation of more classification plugins in order to build multi-classifier systems, and its architecture is designed to allow online traffic classification. In this paper, we also present the implementation of two basic techniques as classification plugins, which are already distributed with TIE. Finally we report on the development of several classification plugins, implementing novel classification techniques, carried out through collaborations with other research groups. I
Tunneling Activities Detection Using Machine Learning Techniques, Journal of Telecommunications and Information Technology, 2011, nr 1
Tunnel establishment, like HTTPS tunnel or related ones, between a computer protected by a security gateway and a remote server located outside the protected network is the most effective way to bypass the network security policy. Indeed, a permitted protocol can be used to embed a forbidden one until the remote server. Therefore, if the resulting information flow is ciphered, security standard tools such as application level gateways (ALG), firewalls, intrusion detection system (IDS), do not detect this violation. In this paper, we describe a statistical analysis of ciphered flows that allows detection of the carried inner protocol. Regarding the deployed security policy, this technology could be added in security tools to detect forbidden protocols usages. In the defence domain, this technology could help preventing information leaks through side channels. At the end of this article, we present a tunnel detection tool architecture and the results obtained with our approach on a public database containing real data flows
Recommended from our members
Design and Implementation of Algorithms for Traffic Classification
Traffic analysis is the practice of using inherent characteristics of a network flow such as timings, sizes, and orderings of the packets to derive sensitive information about it. Traffic analysis techniques are used because of the extensive adoption of encryption and content-obfuscation mechanisms, making it impossible to infer any information about the flows by analyzing their content. In this thesis, we use traffic analysis to infer sensitive information for different objectives and different applications. Specifically, we investigate various applications: p2p cryptocurrencies, flow correlation, and messaging applications. Our goal is to tailor specific traffic analysis algorithms that best capture network traffic’s intrinsic characteristics in those applications for each of these applications. Also, the objective of traffic analysis is different for each of these applications. Specifically, in Bitcoin, our goal is to evaluate Bitcoin traffic’s resilience to blocking by powerful entities such as governments and ISPs. Bitcoin and similar cryptocurrencies play an important role in electronic commerce and other trust-based distributed systems because of their significant advantage over traditional currencies, including open access to global e-commerce. Therefore, it is essential to
the consumers and the industry to have reliable access to their Bitcoin assets. We also examine stepping stone attacks for flow correlation. A stepping stone is a host that an attacker uses to relay her traffic to hide her identity. We introduce two fingerprinting systems, TagIt and FINN. TagIt embeds a secret fingerprint into the flows by moving the packets to specific time intervals. However, FINN utilizes DNNs to embed the fingerprint by changing the inter-packet delays (IPDs) in the flow. In messaging applications, we analyze the WhatsApp messaging service to determine if traffic leaks any sensitive information such as members’ identity in a particular conversation to the adversaries who watch their encrypted traffic. These messaging applications’ privacy is essential because these services provide an environment to dis- cuss politically sensitive subjects, making them a target to government surveillance and censorship in totalitarian countries. We take two technical approaches to design our traffic analysis techniques. The increasing use of DNN-based classifiers inspires our first direction: we train DNN classifiers to perform some specific traffic analysis task. Our second approach is to inspect and model the shape of traffic in the target application and design a statistical classifier for the expected shape of traffic. DNN- based methods are useful when the network is complex, and the traffic’s underlying noise is not linear. Also, these models do not need a meticulous analysis to extract the features. However, deep learning techniques need a vast amount of training data to work well. Therefore, they are not beneficial when there is insufficient data avail- able to train a generalized model. On the other hand, statistical methods have the advantage that they do not have training overhead
APIC: A method for automated pattern identification and classification
Machine Learning (ML) is a transformative technology at the forefront of many modern research endeavours. The technology is generating a tremendous amount of attention from researchers and practitioners, providing new approaches to solving complex classification and regression tasks. While concepts such as Deep Learning have existed for many years, the computational power for realising the utility of these algorithms in real-world applications has only recently become available. This dissertation investigated the efficacy of a novel, general method for deploying ML in a variety of complex tasks, where best feature selection, data-set labelling, model definition and training processes were determined automatically. Models were developed in an iterative fashion, evaluated using both training and validation data sets. The proposed method was evaluated using three distinct case studies, describing complex classification tasks often requiring significant input from human experts. The results achieved demonstrate that the proposed method compares with, and often outperforms, less general, comparable methods designed specifically for each task. Feature selection, data-set annotation, model design and training processes were optimised by the method, where less complex, comparatively accurate classifiers with lower dependency on computational power and human expert intervention were produced. In chapter 4, the proposed method demonstrated improved efficacy over comparable systems, automatically identifying and classifying complex application protocols traversing IP networks. In chapter 5, the proposed method was able to discriminate between normal and anomalous traffic, maintaining accuracy in excess of 99%, while reducing false alarms to a mere 0.08%. Finally, in chapter 6, the proposed method discovered more optimal classifiers than those implemented by comparable methods, with classification scores rivalling those achieved by state-of-the-art systems. The findings of this research concluded that developing a fully automated, general method, exhibiting efficacy in a wide variety of complex classification tasks with minimal expert intervention, was possible. The method and various artefacts produced in each case study of this dissertation are thus significant contributions to the field of ML
Automatic network traffic classification
The thesis addresses a number of critical problems in regard to fully automating the process of network traffic classification and protocol identification. Several effective solutions based on statistical analysis and machine learning techniques are proposed, which significantly reduce the requirements for human interventions in network traffic classification systems
Developing an Effective Detection Framework for Targeted Ransomware Attacks in Brownfield Industrial Internet of Things
The Industrial Internet of Things (IIoT) is being interconnected with many critical industrial activities, creating major cyber security concerns. The key concern is with edge systems of Brownfield IIoT, where new devices and technologies are deployed to interoperate with legacy industrial control systems and leverage the benefits of IoT. These edge devices, such as edge gateways, have opened the way to advanced attacks such as targeted ransomware. Various pre-existing security solutions can detect and mitigate such attacks but are often ineffective due to the heterogeneous nature of the IIoT devices and protocols and their interoperability demands. Consequently, developing new detection solutions is essential. The key challenges in developing detection solutions for targeted ransomware attacks in IIoT systems include 1) understanding attacks and their behaviour, 2) designing accurate IIoT system models to test attacks, 3) obtaining realistic data representing IIoT systems' activities and connectivities, and 4) identifying attacks.
This thesis provides important contributions to the research focusing on investigating targeted ransomware attacks against IIoT edge systems and developing a new detection framework. The first contribution is developing the world's first example of ransomware, specifically targeting IIoT edge gateways. The experiments' results demonstrate that such an attack is now possible on edge gateways. Also, the kernel-related activity parameters appear to be significant indicators of the crypto-ransomware attacks' behaviour, much more so than for similar attacks in workstations. The second contribution is developing a new holistic end-to-end IIoT security testbed (i.e., Brown-IIoTbed) that can be easily reproduced and reconfigured to support new processes and security scenarios. The results prove that Brown-IIoTbed operates efficiently in terms of its functions and security testing.
The third contribution is generating a first-of-its-kind dataset tailored for IIoT systems covering targeted ransomware attacks and their activities, called X-IIoTID. The dataset includes connectivity- and device-agnostic features collected from various data sources. The final contribution is developing a new asynchronous peer-to-peer federated deep learning framework tailored for IIoT edge gateways for detecting targeted ransomware attacks. The framework's effectiveness has been evaluated against pre-existing datasets and the newly developed X-IIoTID dataset