An Evasion Attack against ML-based Phishing URL Detectors
Background: Over the years, Machine Learning Phishing URL classification
(MLPU) systems have gained tremendous popularity for detecting phishing URLs
proactively. Despite this popularity, the security vulnerabilities of MLPU
systems remain mostly unknown. Aim: To address this concern, we conduct a study
to understand the test-time security vulnerabilities of state-of-the-art MLPU
systems, aiming to provide guidelines for the future development of these
systems.
Method: In this paper, we propose an evasion attack framework against MLPU
systems. To achieve this, we first develop an algorithm to generate adversarial
phishing URLs. We then reproduce 41 MLPU systems and record their baseline
performance. Finally, we simulate an evasion attack to evaluate these MLPU
systems against our generated adversarial URLs. Results: In comparison to
previous works, our attack is: (i) effective as it evades all the models with
an average success rate of 66% and 85% for famous (such as Netflix, Google) and
less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively;
(ii) realistic as it requires only 23ms to produce a new adversarial URL
variant that is available for registration with a median cost of only
$11.99/year. We also found that popular online services such as Google
SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find
that adversarial training (a common defence against evasion attacks) does not
significantly improve the robustness of these systems, as it decreases the
success rate of our attack by only 6% on average across all the models. (iv)
Further, we identify the security vulnerabilities of the considered MLPU
systems. Our findings lead to promising directions for future research.
Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but
also highlights implications for future work on assessing and improving these
systems.
Comment: Draft for ACM TOP
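The character-level perturbation idea behind such adversarial URLs can be sketched as follows. This is a minimal, hypothetical illustration: the homoglyph table, the perturbation set, and the `adversarial_urls` helper are assumptions chosen for exposition, not the paper's actual generation algorithm.

```python
import random

# Illustrative homoglyph substitutions (an assumption, not the paper's table)
HOMOGLYPHS = {"o": "0", "l": "1", "i": "1", "e": "3", "a": "4"}

def perturb(domain: str, rng: random.Random) -> str:
    """Apply one random character-level perturbation to a brand name."""
    chars = list(domain)
    op = rng.choice(["homoglyph", "hyphen", "double"])
    if op == "homoglyph":
        idxs = [i for i, c in enumerate(chars) if c in HOMOGLYPHS]
        if idxs:
            i = rng.choice(idxs)
            chars[i] = HOMOGLYPHS[chars[i]]
    elif op == "hyphen" and len(chars) > 2:
        # insert a hyphen at an interior position
        chars.insert(rng.randrange(1, len(chars) - 1), "-")
    else:
        # duplicate a randomly chosen character
        i = rng.randrange(len(chars))
        chars.insert(i, chars[i])
    return "".join(chars)

def adversarial_urls(brand: str, tld: str = ".com", n: int = 5, seed: int = 0):
    """Generate n distinct perturbed domain strings for a target brand."""
    rng = random.Random(seed)
    variants = set()
    while len(variants) < n:
        v = perturb(brand, rng)
        if v != brand:
            variants.add(v + tld)
    return sorted(variants)

print(adversarial_urls("netflix"))
```

Each variant stays visually close to the brand name while being a distinct, registrable domain string, which is what makes attacks of this kind cheap in practice.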
Attacking logo-based phishing website detectors with adversarial perturbations
Recent times have witnessed the rise of anti-phishing schemes powered by deep
learning (DL). In particular, logo-based phishing detectors rely on DL models
from Computer Vision to identify logos of well-known brands on webpages, to
detect malicious webpages that imitate a given brand. For instance, Siamese
networks have demonstrated notable performance for these tasks, enabling the
corresponding anti-phishing solutions to detect even "zero-day" phishing
webpages. In this work, we take the next step of studying the robustness of
logo-based phishing detectors against adversarial ML attacks. We propose a
novel attack exploiting generative adversarial perturbations to craft
"adversarial logos" that evade phishing detectors. We evaluate our attacks
through: (i) experiments on datasets containing real logos, to evaluate the
robustness of state-of-the-art phishing detectors; and (ii) user studies to
gauge whether our adversarial logos can deceive human eyes. The results show
that our proposed attack is capable of crafting perturbed logos subtle enough
to evade various DL models, achieving an evasion rate of up to 95%. Moreover,
users are not able to spot significant differences between the generated
adversarial logos and the original ones.
Comment: To appear in ESORICS 202
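The paper crafts its perturbations with a trained generative model; as a much simpler stand-in, the sketch below shows only the core constraint such attacks satisfy: the perturbed logo must stay within a small L-infinity distance of the original so that human eyes cannot tell the difference. The `perturb_image` helper and the 8/255 budget are illustrative assumptions.

```python
import random

def perturb_image(pixels, epsilon=8 / 255, seed=0):
    """Add an L-infinity bounded perturbation to grayscale pixels in [0, 1].

    A simplified stand-in for generative adversarial perturbations: here the
    perturbation is just random noise clipped to +/- epsilon, which keeps the
    perturbed logo visually close to the original.
    """
    rng = random.Random(seed)
    out = []
    for p in pixels:
        delta = rng.uniform(-epsilon, epsilon)
        out.append(min(1.0, max(0.0, p + delta)))
    return out

logo = [0.2, 0.8, 0.5, 1.0]  # toy 4-pixel "logo"
adv = perturb_image(logo)
print(adv)
```

A real attack would optimize the perturbation against the target detector rather than sampling noise, but the imperceptibility constraint is the same.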
Mitigating Adversarial Gray-Box Attacks Against Phishing Detectors
Although machine learning based algorithms have been extensively used for
detecting phishing websites, there has been relatively little work on how
adversaries may attack such "phishing detectors" (PDs for short). In this
paper, we propose a set of gray-box attacks on PDs that an adversary may use,
which vary depending on the knowledge the adversary has about the PD. We show that
these attacks severely degrade the effectiveness of several existing PDs. We
then propose the concept of operation chains that iteratively map an original
set of features to a new set of features and develop the "Protective Operation
Chain" (POC for short) algorithm. POC leverages the combination of random
feature selection and feature mappings in order to increase the attacker's
uncertainty about the target PD. Using 3 existing publicly available datasets
plus a fourth that we have created and will release upon the publication of
this paper, we show that POC is more robust to these attacks than past
competing work, while preserving predictive performance when no adversarial
attacks are present. Moreover, POC is robust to attacks on 13 different
classifiers, not just one. These results are shown to be statistically
significant at the p < 0.001 level.
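The random-feature-selection-plus-mapping idea behind POC can be roughly sketched as follows. The `protective_chain` helper and the per-feature affine remapping are illustrative assumptions; the paper's operation chains are iterative and more general.

```python
import random

def protective_chain(features, k, seed):
    """Sketch of a protective operation chain: pick a random subset of
    features and apply a randomized remapping to each, so an attacker
    cannot easily predict the feature space the defender trains on."""
    rng = random.Random(seed)
    names = sorted(features)
    chosen = rng.sample(names, k)
    # one randomized affine map per chosen feature (illustrative only)
    maps = {n: (rng.uniform(0.5, 2.0), rng.uniform(-1.0, 1.0)) for n in chosen}

    def transform(sample):
        return {n: maps[n][0] * sample[n] + maps[n][1] for n in chosen}

    return transform

# toy lexical features for a candidate URL
sample = {"url_len": 54, "num_dots": 3, "has_ip": 0, "num_hyphens": 1}
t = protective_chain(sample.keys(), k=2, seed=7)
print(t(sample))
```

The randomness (seeded secretly by the defender) is what raises the attacker's uncertainty: the classifier behind the transformation sees features the attacker cannot reconstruct.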
The Threat of Offensive AI to Organizations
AI has provided us with the ability to automate tasks, extract information from vast amounts of data, and synthesize media that is nearly indistinguishable from the real thing. However, positive tools can also be used for negative purposes. In particular, cyber adversaries can use AI to enhance their attacks and expand their campaigns.
Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations. For example, how does an AI-capable adversary impact the cyber kill chain? Does AI benefit the attacker more than the defender? What are the most significant AI threats facing organizations today and what will be their impact on the future?
In this study, we explore the threat of offensive AI on organizations. First, we present the background and discuss how AI changes the adversary's methods, strategies, goals, and overall attack model. Then, through a literature review, we identify 32 offensive AI capabilities that adversaries can use to enhance their attacks. Finally, through a panel survey spanning industry, government, and academia, we rank the AI threats and provide insights on the adversaries.
Impacts and Risk of Generative AI Technology on Cyber Defense
Generative Artificial Intelligence (GenAI) has emerged as a powerful
technology capable of autonomously producing highly realistic content in
various domains, such as text, images, audio, and videos. With its potential
for positive applications in creative arts, content generation, virtual
assistants, and data synthesis, GenAI has garnered significant attention and
adoption. However, the increasing adoption of GenAI raises concerns about its
potential misuse for crafting convincing phishing emails, generating
disinformation through deepfake videos, and spreading misinformation via
authentic-looking social media posts, posing a new set of challenges and risks
in the realm of cybersecurity. To combat the threats posed by GenAI, we propose
leveraging the Cyber Kill Chain (CKC) to understand the lifecycle of
cyberattacks, as a foundational model for cyber defense. This paper aims to
provide a comprehensive analysis of the risk areas introduced by the offensive
use of GenAI techniques in each phase of the CKC framework. We also analyze the
strategies employed by threat actors and examine their utilization throughout
different phases of the CKC, highlighting the implications for cyber defense.
Additionally, we propose GenAI-enabled defense strategies that are both
attack-aware and adaptive. These strategies encompass various techniques such
as detection, deception, and adversarial training, among others, aiming to
effectively mitigate the risks posed by GenAI-induced cyber threats.
CharBot: A Simple and Effective Method for Evading DGA Classifiers
Domain generation algorithms (DGAs) are commonly leveraged by malware to
create lists of domain names which can be used for command and control (C&C)
purposes. Approaches based on machine learning have recently been developed to
automatically detect generated domain names in real-time. In this work, we
present a novel DGA called CharBot which is capable of producing large numbers
of unregistered domain names that are not detected by state-of-the-art
classifiers for real-time detection of DGAs, including the recently published
methods FANCI (a random forest based on human-engineered features) and LSTM.MI
(a deep learning approach). CharBot is very simple, effective and requires no
knowledge of the targeted DGA classifiers. We show that retraining the
classifiers on CharBot samples is not a viable defense strategy. We believe
these findings show that DGA classifiers are inherently vulnerable to
adversarial attacks if they rely only on the domain name string to make a
decision. Designing a robust DGA classifier may, therefore, necessitate the use
of additional information besides the domain name alone. To the best of our
knowledge, CharBot is the simplest and most efficient black-box adversarial
attack against DGA classifiers proposed to date.
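CharBot's core idea, substituting a couple of characters in a benign domain while keeping the TLD intact, can be sketched as below. This is a hedged reconstruction: the `charbot_like` helper and the chosen alphabet are assumptions, and the paper should be consulted for the exact algorithm.

```python
import random
import string

# characters allowed in domain labels (illustrative alphabet)
VALID = string.ascii_lowercase + string.digits + "-"

def charbot_like(domain: str, seed: int) -> str:
    """Sketch of a CharBot-style DGA: take a benign second-level domain
    and replace two randomly chosen characters, keeping the TLD."""
    rng = random.Random(seed)
    sld, _, tld = domain.partition(".")
    chars = list(sld)
    i, j = rng.sample(range(len(chars)), 2)
    for idx in (i, j):
        # substitute with a different valid character
        chars[idx] = rng.choice([c for c in VALID if c != chars[idx]])
    return "".join(chars) + "." + tld

print(charbot_like("example.com", seed=1))
```

Because the output differs from a known-benign name by only two characters, it carries none of the statistical signatures that string-only DGA classifiers key on, which is the vulnerability the paper highlights.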
A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection
Many challenging real-world problems require the deployment of ensembles of
multiple complementary learning models to reach acceptable performance levels.
While effective, applying the entire ensemble to every sample is costly and
often unnecessary. Deep Reinforcement Learning (DRL) offers a cost-effective
alternative, where detectors are dynamically chosen based on the output of
their predecessors, with their usefulness weighted against their computational
cost. Despite their potential, DRL-based solutions are not widely used in this
capacity, partly due to the difficulties in configuring the reward function for
each new task, the unpredictable reactions of the DRL agent to changes in the
data, and the inability to use common performance metrics (e.g., TPR/FPR) to
guide the algorithm's performance. In this study we propose methods for
fine-tuning and calibrating DRL-based policies so that they can meet multiple
performance goals. Moreover, we present a method for transferring effective
security policies from one dataset to another. Finally, we demonstrate that our
approach is highly robust against adversarial attacks.
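The cost/accuracy trade-off that the DRL agent learns can be illustrated with a static cascade that consults detectors in order of cost and stops once one is confident. Everything here (the detectors, their costs, and the threshold) is a made-up illustration; the paper learns the stopping policy rather than fixing it.

```python
def cascade(detectors, url, threshold=0.9):
    """Run detectors in order of increasing cost; stop early once one is
    confident (score near 0 or 1). A fixed-threshold stand-in for the
    learned DRL stopping policy."""
    total_cost = 0.0
    score = 0.5
    for name, cost, fn in sorted(detectors, key=lambda d: d[1]):
        total_cost += cost
        score = fn(url)
        if score >= threshold or score <= 1 - threshold:
            break
    return score, total_cost

# hypothetical detectors: (name, cost, scoring function)
detectors = [
    ("heavy_dl", 10.0, lambda u: 0.97),  # expensive, decisive
    ("lexical", 0.1, lambda u: 0.6),     # cheap, uncertain
]
print(cascade(detectors, "http://examp1e-login.com"))
# lexical runs first (cost 0.1) but is uncertain, so heavy_dl is also invoked
```

A learned policy improves on this by deciding per-sample which detector to consult next, weighing each detector's expected usefulness against its cost.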