Metodologias para identificação de PC Zombies
Mestrado em Engenharia de Computadores e Telemática
The purpose of this work was to develop a system based on neural networks for the automatic detection of intrusions on a local computer network.
The system was tested in a laboratory environment where intrusions, and the actions they cause on the network, were simulated. Initially, several rootkits were tested to generate illicit (zombie) traffic embedded in licit application traffic. Then, using the statistical data collected, a neural network was built which, after training, was able to classify the traffic presented at its input, distinguishing traffic generated by a rootkit as illicit and traffic generated by an application of an authorized user as licit.
The broader goal of this work is to begin developing an architecture that supports automatic and intelligent intrusion-detection techniques, using neural networks to detect PC Zombies and BotNets.
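The classification step described above can be sketched as follows. Everything here is illustrative: the traffic statistics, the "steady beaconing" characterization of rootkit traffic, and the single logistic unit standing in for the thesis's neural network are assumptions, not the work's actual features or model.

```python
import math
import random

def window_features(rates):
    """Per-window traffic statistics: mean and variance of packet rates."""
    n = len(rates)
    mean = sum(rates) / n
    var = sum((r - mean) ** 2 for r in rates) / n
    return [mean, var]

def make_dataset(seed=0):
    """Synthetic stand-in for the captured lab traffic: licit application
    traffic is irregular, while the rootkit beacons at a steady low rate
    (a hypothetical characterization, not the thesis's measurements)."""
    rng = random.Random(seed)
    data = []
    for _ in range(200):
        licit = [rng.uniform(0, 50) for _ in range(20)]
        data.append((window_features(licit), 0))       # 0 = licit
        bot = [5 + rng.uniform(-0.5, 0.5) for _ in range(20)]
        data.append((window_features(bot), 1))         # 1 = illicit
    return data

def train(data, epochs=50, lr=0.001):
    """A single logistic unit trained by SGD stands in for the network."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))
            g = p - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def accuracy(w, b, data):
    """Fraction of windows labeled correctly at the 0.5 decision boundary."""
    correct = 0
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += (z > 0) == (y == 1)
    return correct / len(data)
```

On this synthetic data the two classes separate mainly through the rate variance, which mirrors the abstract's point that statistical traffic features, rather than payloads, suffice to tell the two apart.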
Behavioral analysis in cybersecurity using machine learning: a study based on graph representation, class imbalance and temporal dissection
The main goal of this thesis is to improve behavioral cybersecurity analysis using machine learning, exploiting graph structures and temporal dissection, and addressing imbalance problems. This main objective is divided into four specific goals:

OBJ1: To study the influence of temporal resolution on highlighting micro-dynamics in the entity behavior classification problem. In real use cases, time-series information may not be enough to describe entity behavior. For this reason, we plan to exploit graph structures to integrate both structured and unstructured data into a representation of entities and their relationships, making it possible to appreciate not only single temporal communications but the whole behavior of these entities. Nevertheless, entity behaviors evolve over time, and a static graph may not be enough to describe all these changes. We therefore propose to use temporal dissection to create temporal subgraphs and to analyze the influence of the temporal resolution on graph creation and on the entity behaviors within. Furthermore, we propose to study how temporal granularity should be used to highlight network micro-dynamics and short-term behavioral changes that can hint at suspicious activities.

OBJ2: To develop novel sampling methods that work with disconnected graphs, addressing imbalance problems while avoiding changes to component topology. The graph imbalance problem is common and challenging, and traditional graph sampling techniques that operate directly on these structures cannot be used without modifying the graph's intrinsic information or introducing bias. Furthermore, existing techniques have proven limited when disconnected graphs are used. Novel resampling methods that balance the number of nodes and can be applied directly to disconnected graphs, without altering component topologies, therefore need to be introduced. In particular, we propose to take advantage of disconnected graphs by detecting and replicating the most relevant graph components without changing their topology, while applying traditional data-level strategies to the entity behaviors within.

OBJ3: To study the usefulness of generative adversarial networks (GANs) for addressing the class imbalance problem in cybersecurity applications. Although traditional data-level pre-processing techniques have proven effective against class imbalance, they show downsides on highly variable datasets, as is common in cybersecurity. New techniques that exploit the overall data distribution to learn highly variable behaviors should therefore be investigated. GANs have shown promising results in the image and video domains; however, their extension to tabular data is not trivial. For this reason, we propose to adapt GANs to cybersecurity data and to exploit their ability to learn and reproduce the input distribution as an oversampling technique for the class imbalance problem. Furthermore, since no single GAN solution works for every scenario, we propose to study several GAN architectures under several training configurations to determine the best option for a cybersecurity application.

OBJ4: To analyze temporal data trends and performance drift to enhance cyber threat analysis. Temporal dynamics and newly arriving data can degrade prediction quality and compromise model reliability, so models become outdated without notice. It is therefore important to extract more insightful information from the application domain by analyzing data trends, learning processes, and performance drift over time. For this reason, we propose to develop a systematic approach for analyzing how data quality and quantity affect the learning process. Moreover, in the context of CTI, we propose to study the relations between temporal performance drift and the input data distribution to detect possible model limitations, enhancing cyber threat analysis.

Programa de Doctorado en Ciencias y Tecnologías Industriales (RD 99/2011). Industria Zientzietako eta Teknologietako Doktoretza Programa (ED 99/2011).
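The temporal dissection idea in OBJ1 can be sketched with a small example. The hosts, timestamps, and window sizes below are hypothetical: splitting a timestamped edge list into fixed windows yields one subgraph per window, and a fine granularity isolates a short beaconing burst that a coarse view merges into the whole period.

```python
from collections import defaultdict

def temporal_subgraphs(edges, window):
    """Dissect timestamped edges (src, dst, t) into consecutive windows of
    `window` seconds, building one adjacency map per window index."""
    graphs = defaultdict(lambda: defaultdict(set))
    for src, dst, t in edges:
        graphs[int(t // window)][src].add(dst)
    return {w: {s: set(ds) for s, ds in g.items()} for w, g in graphs.items()}

# Hypothetical traffic: host "A" talks to "B" regularly over five minutes;
# host "Z" contacts eight distinct hosts in an 8-second burst, the kind of
# short-term behavioral change OBJ1 wants to highlight.
edges = [("A", "B", t) for t in range(0, 300, 30)]
edges += [("Z", f"H{i}", 100 + i) for i in range(8)]

coarse = temporal_subgraphs(edges, 300)  # one subgraph: burst is diluted
fine = temporal_subgraphs(edges, 10)     # the burst fills a single window
```

With the 10-second windows, all of Z's fan-out lands in one subgraph while every other window is quiet, which is exactly the micro-dynamic the coarse 300-second graph cannot localize in time.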
NetSentry: A deep learning approach to detecting incipient large-scale network attacks
Machine Learning (ML) techniques are increasingly adopted to tackle
ever-evolving high-profile network attacks, including DDoS, botnet, and
ransomware, due to their unique ability to extract complex patterns hidden in
data streams. These approaches are however routinely validated with data
collected in the same environment, and their performance degrades when deployed
in different network topologies and/or applied on previously unseen traffic, as
we uncover. This suggests malicious/benign behaviors are largely learned
superficially and that ML-based Network Intrusion Detection Systems (NIDS) need
revisiting, to be effective in practice. In this paper we dive into the
mechanics of large-scale network attacks, with a view to understanding how to
use ML for Network Intrusion Detection (NID) in a principled way. We reveal
that, although cyberattacks vary significantly in terms of payloads, vectors
and targets, their early stages, which are critical to successful attack
outcomes, share many similarities and exhibit important temporal correlations.
Therefore, we treat NID as a time-sensitive task and propose NetSentry, perhaps
the first NIDS of its kind, built on the Bidirectional Asymmetric LSTM
(Bi-ALSTM), an original ensemble of sequential neural models, to detect network
threats before they spread. We cross-evaluate NetSentry using two practical
datasets, training on one and testing on the other, and demonstrate F1 score
gains above 33% over the state-of-the-art, as well as up to 3 times higher
rates of detecting attacks such as XSS and web bruteforce. Further, we put
forward a novel data augmentation technique that boosts the generalization
abilities of a broad range of supervised deep learning algorithms, leading to
average F1 score gains above 35%.
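The cross-evaluation protocol above (train on one dataset, test on the other) can be sketched with a deliberately trivial one-feature classifier. The two synthetic "environments" and the threshold rule are assumptions standing in for real NIDS datasets and for Bi-ALSTM; the point is only how the harness exposes the gap between in-distribution and cross-dataset scores.

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, 1 = attack."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def fit_threshold(xs, ys):
    """Choose the cut (midpoint between consecutive feature values)
    that maximizes F1 on the training set."""
    vals = sorted(set(xs))
    best_thr, best_f1 = vals[0] - 1.0, -1.0
    for lo, hi in zip(vals, vals[1:]):
        thr = (lo + hi) / 2.0
        f1 = f1_score(ys, [1 if x > thr else 0 for x in xs])
        if f1 > best_f1:
            best_thr, best_f1 = thr, f1
    return best_thr

# Two hypothetical single-feature datasets from different environments:
# in B, attack traffic sits closer to benign traffic than it does in A.
labels = [0] * 10 + [1] * 10
env_a = list(range(1, 11)) + list(range(20, 30))
env_b = list(range(1, 11)) + list(range(12, 22))

thr = fit_threshold(env_a, labels)                        # trained on A only
f1_in = f1_score(labels, [1 if x > thr else 0 for x in env_a])
f1_cross = f1_score(labels, [1 if x > thr else 0 for x in env_b])
```

Here the learned cut is perfect in its own environment (F1 = 1.0) but scores only 0.75 on the shifted one: the same kind of degradation the paper reports for models validated on the environment they were trained in.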
Deteção de atividades ilícitas de software Bots através do DNS
DNS is a critical component of the Internet on which almost all Internet
applications and organizations rely. Its shutdown can cut them off from the
Internet, and hence DNS is usually the only protocol allowed when Internet
access is firewalled. The constant exposure of this protocol to external
entities forces corporations to stay vigilant against rogue software that may misuse
the DNS to establish covert channels and perform multiple illicit activities, such as
command and control and data exfiltration.
Most current solutions for bot malware and botnet detection are based on Deep
Packet Inspection techniques, such as analyzing DNS query payloads, which may
reveal private and sensitive information. In addition, the majority of existing
solutions do not consider the use of licit, encrypted DNS traffic, where Deep
Packet Inspection techniques cannot be applied.
This dissertation proposes mechanisms to detect malware bot and botnet
behaviors in DNS traffic that are robust to encryption and that preserve the
privacy of the involved entities: instead of inspecting payloads, they analyze
the behavioral patterns of DNS communications using descriptive statistics over
collected network metrics such as packet rates, packet lengths, and silence and
activity periods. After characterizing DNS traffic behaviors, a study of the
processed data is conducted, followed by the training of Novelty Detection
algorithms on that data.
Models are trained with licit data gathered from multiple licit activities, such as
reading the news, studying, and using social networks, in multiple operating systems,
browsers, and configurations. The models were then tested with similar data
that also contained bot malware traffic. Our tests show that our best-performing
models achieve detection rates on the order of 99%, and 92% for malware bots
using low throughput rates.
This work ends with some ideas for a more realistic generation of bot malware
traffic, since current DNS tunneling tools are limited when mimicking licit DNS
usage, and for better detection of malware bots that use low throughput rates.
Mestrado em Engenharia de Computadores e Telemática
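The feature set and detection scheme of this dissertation can be sketched as follows. The window construction, the packet values, and the simple per-feature z-score rule are illustrative assumptions: the abstract names the metrics (packet rates, packet lengths, silence and activity periods) but not this exact detector, so the class below merely stands in for its Novelty Detection models.

```python
import math

def dns_window_features(packets):
    """Descriptive statistics for one observation window.
    `packets` is a list of (timestamp, length) pairs for DNS messages;
    features are packets/second, mean packet length, and longest silence."""
    times = sorted(t for t, _ in packets)
    lengths = [l for _, l in packets]
    duration = max(times) - min(times) or 1.0
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    return [len(packets) / duration,
            sum(lengths) / len(lengths),
            max(gaps)]

class ZScoreNovelty:
    """Minimal novelty detector: fit mean/std per feature on licit windows
    only, and flag any window whose features deviate by more than k sigmas."""
    def __init__(self, k=3.0):
        self.k = k
    def fit(self, rows):
        cols = list(zip(*rows))
        self.mu = [sum(c) / len(c) for c in cols]
        self.sd = [max(1e-9, math.sqrt(sum((v - m) ** 2 for v in c) / len(c)))
                   for c, m in zip(cols, self.mu)]
        return self
    def is_novel(self, row):
        return any(abs(v - m) / s > self.k
                   for v, m, s in zip(row, self.mu, self.sd))

# Hypothetical licit browsing windows: ~80-96 byte messages, one per second.
licit = [[(j * 1.0, 80 + (i % 5) * 4) for j in range(10)] for i in range(10)]
detector = ZScoreNovelty().fit([dns_window_features(w) for w in licit])

# Hypothetical DNS-tunneling window: many long queries in quick succession.
bot = [(j * 0.1, 230) for j in range(50)]
```

Training only on licit windows mirrors the dissertation's Novelty Detection setup: the detector never sees bot traffic, yet the tunneling window stands out on both packet rate and packet length.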