1,531 research outputs found
Outlier Mining Methods Based on Graph Structure Analysis
Outlier detection in high-dimensional datasets is a fundamental and challenging problem across disciplines that has also practical implications, as removing outliers from the training set improves the performance of machine learning algorithms. While many outlier mining algorithms have been proposed in the literature, they tend to be valid or efficient for specific types of datasets (time series, images, videos, etc.). Here we propose two methods that can be applied to generic datasets, as long as there is a meaningful measure of distance between pairs of elements of the dataset. Both methods start by defining a graph, where the nodes are the elements of the dataset, and the links have associated weights that are the distances between the nodes. Then, the first method assigns an outlier score based on the percolation (i.e., the fragmentation) of the graph. The second method uses the popular IsoMap non-linear dimensionality reduction algorithm, and assigns an outlier score by comparing the geodesic distances with the distances in the reduced space. We test these algorithms on real and synthetic datasets and show that they either outperform, or perform on par with other popular outlier detection methods. A main advantage of the percolation method is that is parameter free and therefore, it does not require any training; on the other hand, the IsoMap method has two integer number parameters, and when they are appropriately selected, the method performs similar to or better than all the other methods tested.Peer ReviewedPostprint (published version
Hijacking Wireless Communications using WiFi Pineapple NANO as a Rogue Access Point
Wireless access points are an effective solution for building scalable, flexible, mobile networks. The problem with these access points is often the lack of security. Users regularly connect to wireless access points without thinking about whether they are genuine or malicious. Moreover, users are not aware of the types of attacks that can come from “rogue” access points set up by attackers and what information can be captured by them. Attackers use this advantage to gain access to users’ confidential information. The objective of this study is to examine the effectiveness of the WiFi Pineapple NANO used as a rogue access point (RAP) in tricking users to connect to it. As part of the preliminary study, a brief survey was provided to users who connected to the Pineapple to evaluate the reasons why users connect to RAPs. The result of the cybersecurity pilot study indicated that lack of awareness played an important role. Specifically, users unknowingly connect to rogue wireless access points that put at risk not only their devices, but the whole network. The information collected in this research could be used to better educate users on identifying possible RAPs and the dangers of connecting to them
Recommended from our members
Cloud-based cyber-physical intrusion detection for vehicles using Deep Learning
Detection of cyber attacks against vehicles is of growing interest. As vehicles typically afford limited processing resources, proposed solutions are rule-based or lightweight machine learning techniques. We argue that this limitation can be lifted with computational offloading commonly used for resource-constrained mobile devices. The increased processing resources available in this manner allow access to more advanced techniques. Using as case study a small four-wheel robotic land vehicle, we demonstrate the practicality and benefits of offloading the continuous task of intrusion detection that is based on deep learning. This approach achieves high accuracy much more consistently than with standard machine learning techniques and is not limited to a single type of attack or the in-vehicle CAN bus as previous work. As input, it uses data captured in real-time that relate to both cyber and physical processes, which it feeds as time series data to a neural network architecture. We use both a deep multilayer perceptron and a recurrent neural network architecture, with the latter benefitting from a long-short term memory hidden layer, which proves very useful for learning the temporal context of different attacks. We employ denial of service, command injection and malware as examples of cyber attacks that are meaningful for a robotic vehicle. The practicality of the latter depends on the resources afforded onboard and remotely, as well as the reliability of the communication means between them. Using detection latency as the criterion, we have developed a mathematical model to determine when computation offloading is beneficial given parameters related to the operation of the network and the processing demands of the deep learning model. The more reliable the network and the greater the processing demands, the greater the reduction in detection latency achieved through offloading
Deployment of Next Generation Intrusion Detection Systems against Internal Threats in a Medium-sized Enterprise
In this increasingly digital age, companies struggle to understand the origin of cyberattacks. Malicious actions can come from both the outside and the inside the business, so it is necessary to adopt tools that can reduce cyber risks by identifying the anomalies when the first symptoms appear.
This thesis deals with the topic of internal attacks and explains how to use innovative Intrusion Detection Systems to protect the IT infrastructure of Medium-sized Enterprises.
These types of technologies try to solve issues like poor visibility of network traffic, long response times to security breaches, and the use of inefficient access control mechanisms.
In this research, multiple types of internal threats, the different categories of Intrusion Detection Systems and an in-depth analysis of the state-of-the-art IDSs developed during the last few years have been detailed. After that, there will be a brief explanation of the effectiveness of IDSs in both testing and production environments.
All the reported phases took place within a company network, starting from the positioning of the IDS, moving on to its configuration and ending with the production environment.
There is an analysis of the company expectations, together with an explanation of the different IDSs characteristics.
This research shows data about potential attacks, mitigated and resolved threats, as well as network changes made thanks to the information gathered while using a cutting edge IDS.
Moreover, the characteristics that a medium-sized company must have in order to be adequately protected by a new generation IDS have been generalized. In the same way, the functionalities that an IDS must possess in order to achieve the set objectives were reported. IDSs are incredibly adaptable to different environments, such as companies of different sectors and sizes, and can be tuned to achieve better results.
At the end of this document are reported the potential future developments that should be addressed to improve IDS technologies further
Anomaly Detection In Blockchain
Anomaly detection has been a well-studied area for a long time. Its applications in the financial sector have aided in identifying suspicious activities of hackers. However, with the advancements in the financial domain such as blockchain and artificial intelligence, it is more challenging to deceive financial systems. Despite these technological advancements many fraudulent cases have still emerged.
Many artificial intelligence techniques have been proposed to deal with the anomaly detection problem; some results appear to be considerably assuring, but there is no explicit superior solution. This thesis leaps to bridge the gap between artificial intelligence and blockchain by pursuing various anomaly detection techniques on transactional network data of a public financial blockchain named 'Bitcoin'.
This thesis also presents an overview of the blockchain technology and its application in the financial sector in light of anomaly detection. Furthermore, it extracts the transactional data of bitcoin blockchain and analyses for malicious transactions using unsupervised machine learning techniques. A range of algorithms such as isolation forest, histogram based outlier detection (HBOS), cluster based local outlier factor (CBLOF), principal component analysis (PCA), K-means, deep autoencoder networks and ensemble method are evaluated and compared
Deteção de atividades ilícitas de software Bots através do DNS
DNS is a critical component of the Internet where almost all Internet applications
and organizations rely on. Its shutdown can deprive them from being part of the
Internet, and hence, DNS is usually the only protocol to be allowed when Internet
access is firewalled. The constant exposure of this protocol to external entities force
corporations to always be observant of external rogue software that may misuse
the DNS to establish covert channels and perform multiple illicit activities, such as
command and control and data exfiltration.
Most current solutions for bot malware and botnet detection are based on Deep
Packet Inspection techniques, such as analyzing DNS query payloads, which may
reveal private and sensitive information. In addiction, the majority of existing solutions
do not consider the usage of licit and encrypted DNS traffic, where Deep
Packet Inspection techniques are impossible to be used.
This dissertation proposes mechanisms to detect malware bots and botnet behaviors
on DNS traffic that are robust to encrypted DNS traffic and that ensure the
privacy of the involved entities by analyzing instead the behavioral patterns of DNS
communications using descriptive statistics over collected network metrics such as
packet rates, packet lengths, and silence and activity periods. After characterizing
DNS traffic behaviors, a study of the processed data is conducted, followed by the
training of Novelty Detection algorithms with the processed data.
Models are trained with licit data gathered from multiple licit activities, such as
reading the news, studying, and using social networks, in multiple operating systems,
browsers, and configurations. Then, the models were tested with similar
data, but containing bot malware traffic. Our tests show that our best performing
models achieve detection rates in the order of 99%, and 92% for malware bots
using low throughput rates.
This work ends with some ideas for a more realistic generation of bot malware
traffic, as the current DNS Tunneling tools are limited when mimicking licit DNS
usages, and for a better detection of malware bots that use low throughput rates.O DNS é um componente crítico da Internet, já que quase todas as aplicações
e organizações que a usam dependem dele para funcionar. A sua privação pode
deixá-las de fazerem parte da Internet, e por causa disso, o DNS é normalmente
o único protocolo permitido quando o acesso à Internet está restrito. A exposição
constante deste protocolo a entidades externas obrigam corporações a estarem
sempre atentas a software externo ilícito que pode fazer uso indevido do DNS para
estabelecer canais secretos e realizar várias atividades ilícitas, como comando e
controlo e exfiltração de dados.
A maioria das soluções atuais para detecção de malware bots e de botnets são
baseadas em técnicas inspeção profunda de pacotes, como analizar payloads de
pedidos de DNS, que podem revelar informação privada e sensitiva. Além disso,
a maioria das soluções existentes não consideram o uso lícito e cifrado de tráfego
DNS, onde técnicas como inspeção profunda de pacotes são impossíveis de serem
usadas.
Esta dissertação propõe mecanismos para detectar comportamentos de malware
bots e botnets que usam o DNS, que são robustos ao tráfego DNS cifrado e
que garantem a privacidade das entidades envolvidas ao analizar, em vez disso,
os padrões comportamentais das comunicações DNS usando estatística descritiva
em métricas recolhidas na rede, como taxas de pacotes, o tamanho dos pacotes,
e os tempos de atividade e silêncio. Após a caracterização dos comportamentos
do tráfego DNS, um estudo sobre os dados processados é realizado, sendo depois
usados para treinar os modelos de Detecção de Novidades.
Os modelos são treinados com dados lícitos recolhidos de multiplas atividades
lícitas, como ler as notícias, estudar, e usar redes sociais, em multiplos sistemas
operativos e com multiplas configurações. De seguida, os modelos são testados
com dados lícitos semelhantes, mas contendo também tráfego de malware bots.
Os nossos testes mostram que com modelos de Detecção de Novidades é possível
obter taxas de detecção na ordem dos 99%, e de 98% para malware bots que geram
pouco tráfego.
Este trabalho finaliza com algumas ideas para uma geração de tráfego ilícito mais
realista, já que as ferramentas atuais de DNS tunneling são limitadas quando
usadas para imitar usos de DNS lícito, e para uma melhor deteção de situações
onde malware bots geram pouco tráfego.Mestrado em Engenharia de Computadores e Telemátic
Machine learning methods for the characterization and classification of complex data
This thesis work presents novel methods for the analysis and classification of medical images and, more generally, complex data. First, an unsupervised machine learning method is proposed to order anterior chamber OCT (Optical Coherence Tomography) images according to a patient's risk of developing angle-closure glaucoma. In a second study, two outlier finding techniques are proposed to improve the results of above mentioned machine learning algorithm, we also show that they are applicable to a wide variety of data, including fraud detection in credit card transactions. In a third study, the topology of the vascular network of the retina, considering it a complex tree-like network is analyzed and we show that structural differences reveal the presence of glaucoma and diabetic retinopathy. In a fourth study we use a model of a laser with optical injection that presents extreme events in its intensity time-series to evaluate machine learning methods to forecast such extreme events.El presente trabajo de tesis desarrolla nuevos métodos para el análisis y clasificación de imágenes médicas y datos complejos en general. Primero, proponemos un método de aprendizaje automático sin supervisión que ordena imágenes OCT (tomografía de coherencia óptica) de la cámara anterior del ojo en función del grado de riesgo del paciente de padecer glaucoma de ángulo cerrado. Luego, desarrollamos dos métodos de detección automática de anomalías que utilizamos para mejorar los resultados del algoritmo anterior, pero que su aplicabilidad va mucho más allá, siendo útil, incluso, para la detección automática de fraudes en transacciones de tarjetas de crédito. Mostramos también, cómo al analizar la topología de la red vascular de la retina considerándola una red compleja, podemos detectar la presencia de glaucoma y de retinopatía diabética a través de diferencias estructurales. Estudiamos también un modelo de un láser con inyección óptica que presenta eventos extremos en la serie temporal de intensidad para evaluar diferentes métodos de aprendizaje automático para predecir dichos eventos extremos.Aquesta tesi desenvolupa nous mètodes per a l’anàlisi i la classificació d’imatges mèdiques i dades complexes. Hem proposat, primer, un mètode d’aprenentatge automàtic sense supervisió que ordena
imatges OCT (tomografia de coherència òptica) de la cambra anterior de l’ull en funció del grau de risc del pacient de patir glaucoma d’angle tancat. Després, hem desenvolupat dos mètodes de detecció automàtica d’anomalies que hem utilitzat per millorar els resultats de l’algoritme anterior, però que la seva aplicabilitat va molt més enllà, sent útil, fins i tot, per a la detecció automàtica de fraus en transaccions de targetes de crèdit. Mostrem també, com en analitzar la topologia de la xarxa vascular de la retina considerant-la una xarxa complexa, podem detectar la presència de glaucoma i de retinopatia diabètica a través de diferències estructurals. Finalment, hem estudiat un làser amb injecció òptica, el qual presenta esdeveniments extrems en la sèrie temporal d’intensitat.
Hem avaluat diferents mètodes per tal de predir-los.Postprint (published version
Multi-Source Data Fusion for Cyberattack Detection in Power Systems
Cyberattacks can cause a severe impact on power systems unless detected
early. However, accurate and timely detection in critical infrastructure
systems presents challenges, e.g., due to zero-day vulnerability exploitations
and the cyber-physical nature of the system coupled with the need for high
reliability and resilience of the physical system. Conventional rule-based and
anomaly-based intrusion detection system (IDS) tools are insufficient for
detecting zero-day cyber intrusions in the industrial control system (ICS)
networks. Hence, in this work, we show that fusing information from multiple
data sources can help identify cyber-induced incidents and reduce false
positives. Specifically, we present how to recognize and address the barriers
that can prevent the accurate use of multiple data sources for fusion-based
detection. We perform multi-source data fusion for training IDS in a
cyber-physical power system testbed where we collect cyber and physical side
data from multiple sensors emulating real-world data sources that would be
found in a utility and synthesizes these into features for algorithms to detect
intrusions. Results are presented using the proposed data fusion application to
infer False Data and Command injection-based Man-in- The-Middle (MiTM) attacks.
Post collection, the data fusion application uses time-synchronized merge and
extracts features followed by pre-processing such as imputation and encoding
before training supervised, semi-supervised, and unsupervised learning models
to evaluate the performance of the IDS. A major finding is the improvement of
detection accuracy by fusion of features from cyber, security, and physical
domains. Additionally, we observed the co-training technique performs at par
with supervised learning methods when fed with our features
- …