HF-SCA: Hands-Free Strong Customer Authentication Based on a Memory-Guided Attention Mechanisms
Strong customer authentication (SCA) is a requirement of the European Union Revised Directive on Payment Services (PSD2) which ensures that electronic payments are performed with multifactor authentication. While increasing the security of electronic payments, SCA has seriously increased shopping cart abandonment: an Italian bank computed that 22% of online purchases in the first semester of 2021 were not completed because of problems with the SCA. Fortunately, PSD2 allows the use of transaction risk analysis tools to exempt a payment from the SCA process. In this paper, we propose a novel unsupervised combination of existing machine learning techniques able to determine whether a purchase is typical for a specific customer, so that a typical purchase could be exempted from SCA. We modified a well-known architecture (U-net) by replacing convolutional blocks with squeeze-and-excitation blocks. A memory network was then added in the latent space and an attention mechanism was introduced on the decoding side of the network. The proposed solution was able to detect nontypical purchases by creating temporal correlations between transactions. The network achieved an AUC score of 97.7% on a well-known dataset retrieved online. Using this approach, we found that 98% of purchases could be securely exempted from SCA, shortening the customer's journey and improving the user experience. As an additional validation, we developed an Alexa skill for Amazon smart glasses which allows a user to shop and pay online using only voice interaction, leaving the hands free to perform other activities, for example driving a car.
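As a rough illustration of the AUC metric reported above, the following sketch computes a rank-based AUC from anomaly scores and applies a threshold rule for exempting typical purchases. The scores, labels, the threshold value and the `exempt_from_sca` helper are hypothetical and are not taken from the paper; the paper's network is not reproduced here.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC: probability that a random nontypical purchase
    (label 1) gets a higher anomaly score than a random typical one
    (label 0). Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def exempt_from_sca(score, threshold):
    """Hypothetical decision rule: skip SCA when the purchase looks typical."""
    return score < threshold


# Made-up anomaly scores (for example, reconstruction errors) and labels.
scores = [0.05, 0.10, 0.20, 0.90, 0.15, 0.70]
labels = [0, 0, 0, 1, 0, 1]
auc = auc_from_scores(scores, labels)
exempted = [exempt_from_sca(s, threshold=0.5) for s in scores]
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; a sort-based version would be preferable on large datasets.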
Your Smart Home Can't Keep a Secret: Towards Automated Fingerprinting of IoT Traffic with Neural Networks
The IoT (Internet of Things) technology has been widely adopted in recent
years and has profoundly changed people's daily lives. However, in the
meantime, such a fast-growing technology has also introduced new privacy
issues, which need to be better understood and measured. In this work, we look
into how private information can be leaked from network traffic generated in
the smart home network. Although researchers have proposed techniques to infer
IoT device types or user behaviors under clean experimental setups, the
effectiveness of such approaches becomes questionable in complex but
realistic network environments, where common techniques like Network Address and
Port Translation (NAPT) and Virtual Private Network (VPN) are enabled. Traffic
analysis using traditional methods (e.g., through classical machine-learning
models) is much less effective under those settings, as the manually picked
features are no longer distinctive. In this work, we propose a traffic
analysis framework based on sequence-learning techniques like LSTM that
leverages the temporal relations between packets for the attack of device
identification. We evaluated it under different environment settings (e.g.,
pure-IoT and noisy environments with multiple non-IoT devices). The results
showed our framework was able to differentiate device types with high
accuracy. This result suggests IoT network communications pose prominent
challenges to users' privacy, even when they are protected by encryption and
morphed by the network gateway. As such, new privacy protection methods for IoT
traffic need to be developed to mitigate this new issue.
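To make the idea of sequence learning over packets concrete, here is a minimal sketch of a single-unit LSTM cell stepping through a flow. The per-packet features (normalized size, direction, inter-arrival time), the weights and the `encode_flow` helper are assumptions for illustration only; the abstract does not specify the framework's features or architecture.

```python
import math

GATES = ("i", "f", "o", "g")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step with a single hidden unit.

    x      : features of one packet
    h_prev : previous hidden state (float)
    c_prev : previous cell state (float)
    W, b   : per-gate weights over [x..., h_prev] and per-gate biases
    """
    z = list(x) + [h_prev]
    pre = {name: sum(w * v for w, v in zip(W[name], z)) + b[name]
           for name in GATES}
    i = sigmoid(pre["i"])       # input gate
    f = sigmoid(pre["f"])       # forget gate
    o = sigmoid(pre["o"])       # output gate
    g = math.tanh(pre["g"])     # candidate cell value
    c = f * c_prev + i * g      # new cell state
    h = o * math.tanh(c)        # new hidden state
    return h, c


def encode_flow(packets, W, b):
    """Run the cell over a packet sequence and return the last hidden state."""
    h, c = 0.0, 0.0
    for x in packets:
        h, c = lstm_step(x, h, c, W, b)
    return h


# Made-up weights for 3 packet features plus 1 recurrent input.
W_demo = {name: [0.1, -0.2, 0.3, 0.05] for name in GATES}
b_demo = {name: 0.0 for name in GATES}
# Each packet: (normalized size, direction, inter-arrival time); illustrative.
flow = [(0.4, 1.0, 0.01), (1.0, -1.0, 0.20), (0.1, 1.0, 0.05)]
summary = encode_flow(flow, W_demo, b_demo)
```

In a real classifier the final hidden state would feed a layer that predicts the device type, and the weights would be learned rather than fixed.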
On the subspace learning for network attack detection
Doctoral thesis, Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Elétrica, 2019.
The cost of all types of cyberattacks is increasing for global organizations. The White House of the
U.S. government estimates that malicious cyber activity cost the U.S. economy between US$57 billion and US$109 billion in 2016. Recently, the number of
Denial of Service (DoS), botnet, malicious insider and ransomware attacks has been increasing.
Accenture consulting argues that 89% of survey respondents believe breakthrough technologies,
like artificial intelligence, machine learning and user behavior analytics, are essential for securing
their organizations. To face adversarial models, novel network attacks and attackers' countermeasures
to avoid detection, it is possible to adopt unsupervised or semi-supervised approaches
for network anomaly detection, by means of behavioral analysis, where known anomalies are not
necessary for training models.
Signal processing schemes have been applied to detect malicious traffic in computer networks
through unsupervised approaches, showing advances in network traffic analysis, in network
attack detection, and in network intrusion detection systems.
Anomalies can be hard to identify and separate from normal data because they occur rarely
in comparison to normal events. Imbalanced data can compromise the
performance of most standard learning algorithms, biasing them toward the
majority class and reducing their capacity to detect anomalies, which belong to the minority
class. Therefore, anomaly detection algorithms have to be highly discriminating, robust to
corruption and able to deal with the imbalanced data problem.
Some widely adopted algorithms for anomaly detection assume Gaussian-distributed data for
legitimate observations; however, this assumption may not hold in network traffic, which is
usually characterized by skewed and heavy-tailed distributions.
As a first important contribution, we propose Eigensimilarity, an approach based on
signal processing concepts applied to the detection of malicious traffic in computer networks. We
evaluate the accuracy and performance of the proposed framework applied to a simulated
scenario and to the DARPA 1998 data set. The performed experiments show that synflood,
fraggle and port scan attacks can be detected accurately by Eigensimilarity and with great detail,
in an automatic and blind fashion, i.e. in an unsupervised approach.
Considering that skewness can improve anomaly detection in imbalanced and skewed data,
such as network traffic, we propose the Moment-based Robust Principal Component Analysis (m-RPCA) for network attack detection. The m-RPCA is a framework based on distances between
contaminated observations and moments computed from a robust subspace learned by Robust
Principal Component Analysis (RPCA), in order to detect anomalies in skewed data and
network traffic. We evaluate the accuracy of the m-RPCA for anomaly detection on simulated
data sets, with skewed and heavy-tailed distributions, and on the CTU-13 data set. The
experimental evaluation compares our proposal to widely adopted algorithms for anomaly
detection and shows that the distance between robust estimates and contaminated observations
can improve anomaly detection on skewed data and network attack detection.
Moreover, we propose an architecture and approach to evaluate a proof of concept of
Eigensimilarity for malicious behavior detection in mobile applications, in order to detect possible
threats in offline corporate mobile clients. We propose scenarios, features and approaches for
threat analysis by means of Eigensimilarity, and evaluate the processing time required for
Eigensimilarity execution on mobile devices.
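The point about skewed, heavy-tailed traffic can be illustrated with a small sketch that measures sample skewness and scores each observation by its distance from a robust center. This is not m-RPCA or Eigensimilarity: the median and the median absolute deviation (MAD) are swapped in here as simple stand-ins for robust estimates, and the byte counts are made up.

```python
import statistics


def sample_skewness(xs):
    """Third standardized moment (population form); 0 for symmetric data."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def robust_scores(xs):
    """Distance of each observation from the median, scaled by the MAD."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined")
    return [abs(x - med) / mad for x in xs]


# Made-up bytes-per-flow values with one heavy-tail observation.
bytes_per_flow = [40, 42, 45, 47, 50, 52, 55, 60, 64, 5000]
skew = sample_skewness(bytes_per_flow)
scores = robust_scores(bytes_per_flow)
```

Because the median and MAD are barely moved by the extreme value, the outlier receives a far larger score than it would under a mean and standard deviation estimated from the same contaminated sample.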
Fault diagnosis for IP-based network with real-time conditions
BACKGROUND:
Fault diagnosis techniques have been based on many paradigms, which derive from diverse areas
and have different purposes: obtaining a representation model of the network for fault localization,
selecting optimal probe sets for monitoring network devices, reducing fault detection time, and
detecting faulty components in the network. Although there are several solutions for diagnosing
network faults, there are still challenges to be faced: a fault diagnosis solution needs to always be
available and able to process data in a timely manner, because stale results impair the quality and speed
of informed decision-making. Also, there is no non-invasive technique to continuously diagnose the
network symptoms without leaving the system vulnerable to failures, nor a technique resilient
to the network's dynamic changes, which can cause new failures with different symptoms.
AIMS:
This thesis aims to propose a model for the continuous and timely diagnosis of IP-based network
faults, independent of the network structure, and based on data analytics techniques.
METHOD(S):
This research's point of departure was the hypothesis of a fault propagation phenomenon that
allows the observation of failure symptoms at a higher network level than the fault origin. Thus, for
the model's construction, monitoring data was collected from an extensive campus network in
which impact link failures were induced at different times and with different durations.
These data correspond to widely used parameters in the actual management of a network. The
collected data allowed us to understand the faults' behavior and how they are manifested at a
peripheral level.
Based on this understanding and a data analytics process, the first three modules of our model,
named PALADIN, were proposed (Identify, Collection and Structuring), which define the data
collection peripherally and the necessary data pre-processing to obtain the description of the
network's state at a given moment. These modules give the model the ability to structure the data
considering the delays of the multiple responses that the network delivers to a single monitoring
probe and the multiple network interfaces that a peripheral device may have.
Thus, a structured data stream is obtained, and it is ready to be analyzed. For this analysis, it was
necessary to implement an incremental learning framework that respects networks' dynamic
nature. It comprises three elements: an incremental learning algorithm, a data rebalancing strategy,
and a concept drift detector. This framework is the fourth module of the PALADIN model named
Diagnosis.
In order to evaluate the PALADIN model, the Diagnosis module was implemented with 25 different
incremental algorithms, ADWIN as the concept-drift detector and SMOTE (adapted to the streaming scenario) as the rebalancing strategy. In addition, a dataset (the SOFI dataset) was built through the first
modules of the PALADIN model; these data form the incoming data
stream of the Diagnosis module and are used to evaluate its performance.
The PALADIN Diagnosis module performs an online classification of network failures, so it is a
learning model that must be evaluated in a stream context. Prequential evaluation is the most widely used
method to perform this task, so we adopt this process to evaluate the model's performance over
time through several stream evaluation metrics.
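Prequential (test-then-train) evaluation can be sketched as follows: each arriving instance is first used to test the model and only then to update it, so accuracy is tracked over time. The `MajorityClassLearner` and the tiny stream below are hypothetical placeholders; they are not one of the 25 algorithms or the SOFI dataset used in the thesis.

```python
from collections import Counter


class MajorityClassLearner:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return None  # nothing learned yet
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: predict each instance first, then learn from it.
    Returns the running accuracy after every instance."""
    correct = 0
    history = []
    for seen, (x, y) in enumerate(stream, start=1):
        if learner.predict(x) == y:
            correct += 1
        learner.learn(x, y)
        history.append(correct / seen)
    return history


# Made-up stream of (features, label) pairs.
stream = [({"loss": 0.0}, "ok"), ({"loss": 0.0}, "ok"),
          ({"loss": 0.9}, "fault"), ({"loss": 0.1}, "ok")]
history = prequential_accuracy(stream, MajorityClassLearner())
```

With imbalanced streams such as fault data, plain accuracy can be misleading, which is one reason to report several stream evaluation metrics rather than a single one.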
RESULTS:
First, this research provides evidence of the phenomenon of impact fault propagation, making it possible to
detect fault symptoms at a monitored network's peripheral level. This translates into non-invasive
monitoring of the network. Second, the PALADIN model is the major contribution in the fault
detection context because it covers two aspects: an online learning model that continuously processes
the network symptoms and detects internal failures, and concept-drift detection and
data stream rebalancing components that make resilience to dynamic network changes possible.
Third, it is well known that the number of available real-world datasets for the imbalanced stream
classification context is still too small. That number is further reduced for the networking context.
The SOFI dataset obtained with the first modules of the PALADIN model adds to that number
and encourages work on unbalanced data streams and on network fault
diagnosis.
CONCLUSIONS:
The proposed model contains the necessary elements for the continuous and timely diagnosis of IP-based
network faults; it introduces the idea of periodic monitoring of peripheral network
elements and uses data analytics techniques to process the collected data. Based on the analysis, processing, and
classification of peripherally collected data, it can be concluded that PALADIN achieves the
objective. The results indicate that peripheral monitoring allows diagnosing faults in the
internal network; in addition, the diagnosis process needs an incremental learning process, concept-drift
detection elements, and a rebalancing strategy.
The results of the experiments showed that PALADIN makes it possible to learn from the network
manifestations and diagnose internal network failures. The latter was verified with 25 different
incremental algorithms, ADWIN as concept-drift detector and SMOTE (adapted to streaming
scenario) as the rebalancing strategy.
This research clearly illustrates that it is unnecessary to monitor all the internal network elements
to detect a network's failures; instead, it is enough to choose the peripheral elements to be
monitored. Furthermore, with proper processing of the collected status and traffic descriptors, it is
possible to learn from the arriving data using incremental learning in cooperation with data
rebalancing and concept drift approaches. This proposal continuously diagnoses the network
symptoms without leaving the system vulnerable to failures while being resilient to the network's
dynamic changes.
Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. President: José Manuel Molina López. Secretary: Juan Carlos Dueñas López. Member: Juan Manuel Corchado Rodrígue
Network intrusion detection system for DDoS attacks in ICS using deep autoencoders
Anomaly detection in industrial control and cyber-physical systems has gained much attention over the past years due to the increasing modernisation and exposure of industrial environments. Current dangers to the connected industry include the theft of industrial intellectual property, denial of service, or the compromise of cloud components, all of which might result in a cyber-attack across the operational network. However, most scientific work employs device logs, which require substantial understanding and preprocessing before they can be used in anomaly detection. In this paper, we propose a network intrusion detection system (NIDS) architecture based on a deep autoencoder trained on network flow data, which has the advantage of not requiring prior knowledge of the network topology or its underlying architecture. Experimental results show that the proposed model can detect anomalies caused by distributed denial of service attacks, providing a high detection rate and few false alarms, and outperforming the state of the art and a baseline model in an unsupervised learning environment. Furthermore, the deep autoencoder model can detect abnormal behaviour in legitimate devices after an attack. We also demonstrate the suitability of the proposed NIDS in a real industrial plant from the food sector, analysing the false positive rate and the viability of the data generation, filtering and preprocessing procedure for a near-real-time scenario. The suggested NIDS architecture is a low-cost solution that uses only fifteen network-based features, requires minimal processing, operates in unsupervised mode, and is straightforward to deploy in real-world scenarios.
Funding: Axencia Galega de Innovación | Ref. IN854A 2019/15; Centro para el Desarrollo Tecnológico Industrial | Ref. CER-20191012; Agencia Estatal de Investigación | Ref. MTM2017-89422-P. Open-access publication funded by: Universidade de Vigo/CISU
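Autoencoder-based detectors of this kind typically flag a flow when its reconstruction error exceeds a threshold learned from benign traffic. The sketch below shows only that decision step, under stated assumptions: the autoencoder output is taken as given, and the percentile rule and the function names are illustrative choices, not the paper's procedure.

```python
def reconstruction_error(original, reconstructed):
    """Mean squared error between a flow's features and the model's output."""
    return sum((a - b) ** 2
               for a, b in zip(original, reconstructed)) / len(original)


def fit_threshold(benign_errors, percentile=99):
    """Pick an error value from benign flows at the given percentile rank
    (assumed rule; integer arithmetic avoids float rounding in the index)."""
    ordered = sorted(benign_errors)
    index = min(len(ordered) - 1, (percentile * len(ordered)) // 100)
    return ordered[index]


def is_anomalous(error, threshold):
    """Flag a flow whose reconstruction error exceeds the threshold."""
    return error > threshold


# Made-up reconstruction errors measured on benign flows.
benign_errors = [float(i) for i in range(1, 11)]
threshold = fit_threshold(benign_errors, percentile=80)
flag = is_anomalous(reconstruction_error([1.0, 2.0], [1.0, 8.0]), threshold)
```

Lowering the percentile catches more attacks at the cost of more false alarms, which is the trade-off the false positive analysis in a real plant would need to settle.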