9 research outputs found
Efficient Learning of Communication Profiles from IP Flow Records
The task of network traffic monitoring has evolved drastically with the ever-increasing amount of data flowing in large scale networks. The automated analysis of this tremendous source of information often comes with using simpler models on aggregated data (e.g. IP flow records) due to time and space constraints. A step towards utilizing IP flow records more effectively are stream learning techniques. We propose a method to collect a limited yet relevant amount of data in order to learn a class of complex models, finite state machines, in real-time. These machines are used as communication profiles to fingerprint, identify or classify hosts and services and offer high detection rates while requiring less training data and thus being faster to compute than simple models
Profiling Users by Modeling Web Transactions
Users of electronic devices, e.g., laptop, smartphone, etc. have
characteristic behaviors while surfing the Web. Profiling this behavior can
help identify the person using a given device. In this paper, we introduce a
technique to profile users based on their web transactions. We compute several
features extracted from a sequence of web transactions and use them with
one-class classification techniques to profile a user. We assess the efficacy
and speed of our method at differentiating 25 users on a dataset representing 6
months of web traffic monitoring from a small company network.Comment: Extended technical report of an IEEE ICDCS 2017 publicatio
BotGM: Unsupervised Graph Mining to Detect Botnets in Traffic Flows
International audienceBotnets are one of the most dangerous and serious cybersecurity threats since they are a major vector of large-scale attack campaigns such as phishing, distributed denial-of-service (DDoS) attacks, trojans, spams, etc. A large body of research has been accomplished on botnet detection, but recent security incidents show that there are still several challenges remaining to be addressed, such as the ability to develop detectors which can cope with new types of botnets. In this paper, we propose BotGM, a new approach to detect botnet activities based on behavioral analysis of network traffic flow. BotGM identifies network traffic behavior using graph-based mining techniques to detect botnets behaviors and model the dependencies among flows to trace-back the root causes then. We applied BotGM on a publicly available large dataset of Botnet network flows, where it detects various botnet behaviors with a high accuracy without any prior knowledge of them
Performance Evaluation of Network Anomaly Detection Systems
Nowadays, there is a huge and growing concern about security in information and communication
technology (ICT) among the scientific community because any attack or anomaly in
the network can greatly affect many domains such as national security, private data storage,
social welfare, economic issues, and so on. Therefore, the anomaly detection domain is a broad
research area, and many different techniques and approaches for this purpose have emerged
through the years.
Attacks, problems, and internal failures when not detected early may badly harm an
entire Network system. Thus, this thesis presents an autonomous profile-based anomaly detection
system based on the statistical method Principal Component Analysis (PCADS-AD). This
approach creates a network profile called Digital Signature of Network Segment using Flow Analysis
(DSNSF) that denotes the predicted normal behavior of a network traffic activity through
historical data analysis. That digital signature is used as a threshold for volume anomaly detection
to detect disparities in the normal traffic trend. The proposed system uses seven traffic flow
attributes: Bits, Packets and Number of Flows to detect problems, and Source and Destination IP
addresses and Ports, to provides the network administrator necessary information to solve them.
Via evaluation techniques, addition of a different anomaly detection approach, and
comparisons to other methods performed in this thesis using real network traffic data, results
showed good traffic prediction by the DSNSF and encouraging false alarm generation and detection
accuracy on the detection schema.
The observed results seek to contribute to the advance of the state of the art in methods
and strategies for anomaly detection that aim to surpass some challenges that emerge from
the constant growth in complexity, speed and size of today’s large scale networks, also providing
high-value results for a better detection in real time.Atualmente, existe uma enorme e crescente preocupação com segurança em tecnologia
da informação e comunicação (TIC) entre a comunidade cientÃfica. Isto porque qualquer
ataque ou anomalia na rede pode afetar a qualidade, interoperabilidade, disponibilidade, e integridade
em muitos domÃnios, como segurança nacional, armazenamento de dados privados,
bem-estar social, questões econômicas, e assim por diante. Portanto, a deteção de anomalias
é uma ampla área de pesquisa, e muitas técnicas e abordagens diferentes para esse propósito
surgiram ao longo dos anos.
Ataques, problemas e falhas internas quando não detetados precocemente podem prejudicar
gravemente todo um sistema de rede. Assim, esta Tese apresenta um sistema autônomo
de deteção de anomalias baseado em perfil utilizando o método estatÃstico Análise de Componentes
Principais (PCADS-AD). Essa abordagem cria um perfil de rede chamado Assinatura Digital
do Segmento de Rede usando Análise de Fluxos (DSNSF) que denota o comportamento normal
previsto de uma atividade de tráfego de rede por meio da análise de dados históricos. Essa
assinatura digital é utilizada como um limiar para deteção de anomalia de volume e identificar
disparidades na tendência de tráfego normal. O sistema proposto utiliza sete atributos de fluxo
de tráfego: bits, pacotes e número de fluxos para detetar problemas, além de endereços IP e
portas de origem e destino para fornecer ao administrador de rede as informações necessárias
para resolvê-los.
Por meio da utilização de métricas de avaliação, do acrescimento de uma abordagem
de deteção distinta da proposta principal e comparações com outros métodos realizados nesta
tese usando dados reais de tráfego de rede, os resultados mostraram boas previsões de tráfego
pelo DSNSF e resultados encorajadores quanto a geração de alarmes falsos e precisão de deteção.
Com os resultados observados nesta tese, este trabalho de doutoramento busca contribuir
para o avanço do estado da arte em métodos e estratégias de deteção de anomalias,
visando superar alguns desafios que emergem do constante crescimento em complexidade, velocidade
e tamanho das redes de grande porte da atualidade, proporcionando também alta
performance. Ainda, a baixa complexidade e agilidade do sistema proposto contribuem para
que possa ser aplicado a deteção em tempo real
QUANTIFYING AND PREDICTING USER REPUTATION IN A NETWORK SECURITY CONTEXT
Reputation has long been an important factor for establishing trust and evaluating the character of others. Though subjective by definition, it recently emerged in the field of cybersecurity as a metric to quantify and predict the nature of domain names, IP addresses, files, and more. Implicit in the use of reputation to enhance cybersecurity is the assumption that past behaviors and opinions of others provides insight into the expected future behavior of an entity, which can be used to proactively identify potential threats to cybersecurity. Despite the plethora of work in industry and academia on reputation in cyberspace, proposed methods are often presented as black boxes and lack scientific rigor, reproducibility, and validation. Moreover, despite widespread recognition that cybersecurity solutions must consider the human user, there is limited work focusing on user reputation in a security context. This dissertation presents a mathematical interpretation of user cyber reputation and a methodology for evaluating reputation in a network security context. A user’s cyber reputation is defined as the most likely probability the user demonstrates a specific characteristic on the network, based on evidence. The methodology for evaluating user reputation is presented in three phases: characteristic definition and evidence collection; reputation quantification and prediction; and reputation model validation and refinement. The methodology is illustrated through a case study on a large university network, where network traffic data is used as evidence to determine the likelihood a user becomes infected or remains uninfected on the network. A separate case study explores social media as an alternate source of data for evaluating user reputation. User-reported account compromise data is collected from Twitter and used to predict if a user will self-report compromise. This case study uncovers user cybersecurity experiences and victimization trends and emphasizes the feasibility of using social media to enhance understandings of users from a security perspective. Overall, this dissertation presents an exploration into the complicated space of cyber identity. As new threats to security, user privacy, and information integrity continue to manifest, the need for reputation systems and techniques to evaluate and validate online identities will continue to grow
Efficient Learning of Communication Profiles from IP Flow Records
The task of network traffic monitoring has evolved drastically with the ever-increasing amount of data flowing in large scale networks. The automated analysis of this tremendous source of information often comes with using simpler models on aggregated data (e.g. IP flow records) due to time and space constraints. A step towards utilizing IP flow records more effectively are stream learning techniques. We propose a method to collect a limited yet relevant amount of data in order to learn a class of complex models, finite state machines, in real-time. These machines are used as communication profiles to fingerprint, identify or classify hosts and services and offer high detection rates while requiring less training data and thus being faster to compute than simple models.Accepted author manuscriptCyber Securit