2,922 research outputs found
Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as sources of spreading
misinformation online, but are limited by the costs of hiring and managing
human workers. In this paper, we identify a new class of attacks that leverage
deep learning language models (Recurrent Neural Networks or RNNs) to automate
the generation of fake online reviews for products and services. Not only are
these attacks cheap and therefore more scalable, but they can control rate of
content output to eliminate the signature burstiness that makes crowdsourced
campaigns easy to detect.
Using Yelp reviews as an example platform, we show how a two phased review
generation and customization attack can produce reviews that are
indistinguishable by state-of-the-art statistical detectors. We conduct a
survey-based user study to show these reviews not only evade human detection,
but also score high on "usefulness" metrics by users. Finally, we develop novel
automated defenses against these attacks, by leveraging the lossy
transformation introduced by the RNN training and generation cycle. We consider
countermeasures against our mechanisms, show that they produce unattractive
cost-benefit tradeoffs for attackers, and that they can be further curtailed by
simple constraints imposed by online service providers
Extractive Adversarial Networks: High-Recall Explanations for Identifying Personal Attacks in Social Media Posts
We introduce an adversarial method for producing high-recall explanations of
neural text classifier decisions. Building on an existing architecture for
extractive explanations via hard attention, we add an adversarial layer which
scans the residual of the attention for remaining predictive signal. Motivated
by the important domain of detecting personal attacks in social media comments,
we additionally demonstrate the importance of manually setting a semantically
appropriate `default' behavior for the model by explicitly manipulating its
bias term. We develop a validation set of human-annotated personal attacks to
evaluate the impact of these changes.Comment: Accepted to EMNLP 2018 Code and data available at
https://github.com/shcarton/rcn
Artificial intelligence in the cyber domain: Offense and defense
Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.Web of Science123art. no. 41
6G White Paper on Machine Learning in Wireless Communication Networks
The focus of this white paper is on machine learning (ML) in wireless
communications. 6G wireless communication networks will be the backbone of the
digital transformation of societies by providing ubiquitous, reliable, and
near-instant wireless connectivity for humans and machines. Recent advances in
ML research has led enable a wide range of novel technologies such as
self-driving vehicles and voice assistants. Such innovation is possible as a
result of the availability of advanced ML models, large datasets, and high
computational power. On the other hand, the ever-increasing demand for
connectivity will require a lot of innovation in 6G wireless networks, and ML
tools will play a major role in solving problems in the wireless domain. In
this paper, we provide an overview of the vision of how ML will impact the
wireless communication systems. We first give an overview of the ML methods
that have the highest potential to be used in wireless networks. Then, we
discuss the problems that can be solved by using ML in various layers of the
network such as the physical layer, medium access layer, and application layer.
Zero-touch optimization of wireless networks using ML is another interesting
aspect that is discussed in this paper. Finally, at the end of each section,
important research questions that the section aims to answer are presented
Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Deep neural networks are susceptible to various inference attacks as they
remember information about their training data. We design white-box inference
attacks to perform a comprehensive privacy analysis of deep learning models. We
measure the privacy leakage through parameters of fully trained models as well
as the parameter updates of models during training. We design inference
algorithms for both centralized and federated learning, with respect to passive
and active inference attackers, and assuming different adversary prior
knowledge.
We evaluate our novel white-box membership inference attacks against deep
learning algorithms to trace their training data records. We show that a
straightforward extension of the known black-box attacks to the white-box
setting (through analyzing the outputs of activation functions) is ineffective.
We therefore design new algorithms tailored to the white-box setting by
exploiting the privacy vulnerabilities of the stochastic gradient descent
algorithm, which is the algorithm used to train deep neural networks. We
investigate the reasons why deep learning models may leak information about
their training data. We then show that even well-generalized models are
significantly susceptible to white-box membership inference attacks, by
analyzing state-of-the-art pre-trained and publicly available models for the
CIFAR dataset. We also show how adversarial participants, in the federated
learning setting, can successfully run active membership inference attacks
against other participants, even when the global model achieves high prediction
accuracies.Comment: 2019 IEEE Symposium on Security and Privacy (SP
An Evasion Attack against ML-based Phishing URL Detectors
Background: Over the year, Machine Learning Phishing URL classification
(MLPU) systems have gained tremendous popularity to detect phishing URLs
proactively. Despite this vogue, the security vulnerabilities of MLPUs remain
mostly unknown. Aim: To address this concern, we conduct a study to understand
the test time security vulnerabilities of the state-of-the-art MLPU systems,
aiming at providing guidelines for the future development of these systems.
Method: In this paper, we propose an evasion attack framework against MLPU
systems. To achieve this, we first develop an algorithm to generate adversarial
phishing URLs. We then reproduce 41 MLPU systems and record their baseline
performance. Finally, we simulate an evasion attack to evaluate these MLPU
systems against our generated adversarial URLs. Results: In comparison to
previous works, our attack is: (i) effective as it evades all the models with
an average success rate of 66% and 85% for famous (such as Netflix, Google) and
less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively;
(ii) realistic as it requires only 23ms to produce a new adversarial URL
variant that is available for registration with a median cost of only
$11.99/year. We also found that popular online services such as Google
SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that
Adversarial training (successful defence against evasion attack) does not
significantly improve the robustness of these systems as it decreases the
success rate of our attack by only 6% on average for all the models. (iv)
Further, we identify the security vulnerabilities of the considered MLPU
systems. Our findings lead to promising directions for future research.
Conclusion: Our study not only illustrate vulnerabilities in MLPU systems but
also highlights implications for future study towards assessing and improving
these systems.Comment: Draft for ACM TOP
PassGAN: A Deep Learning Approach for Password Guessing
State-of-the-art password guessing tools, such as HashCat and John the
Ripper, enable users to check billions of passwords per second against password
hashes. In addition to performing straightforward dictionary attacks, these
tools can expand password dictionaries using password generation rules, such as
concatenation of words (e.g., "password123456") and leet speak (e.g.,
"password" becomes "p4s5w0rd"). Although these rules work well in practice,
expanding them to model further passwords is a laborious task that requires
specialized expertise. To address this issue, in this paper we introduce
PassGAN, a novel approach that replaces human-generated password rules with
theory-grounded machine learning algorithms. Instead of relying on manual
password analysis, PassGAN uses a Generative Adversarial Network (GAN) to
autonomously learn the distribution of real passwords from actual password
leaks, and to generate high-quality password guesses. Our experiments show that
this approach is very promising. When we evaluated PassGAN on two large
password datasets, we were able to surpass rule-based and state-of-the-art
machine learning password guessing tools. However, in contrast with the other
tools, PassGAN achieved this result without any a-priori knowledge on passwords
or common password structures. Additionally, when we combined the output of
PassGAN with the output of HashCat, we were able to match 51%-73% more
passwords than with HashCat alone. This is remarkable, because it shows that
PassGAN can autonomously extract a considerable number of password properties
that current state-of-the art rules do not encode.Comment: This is an extended version of the paper which appeared in NeurIPS
2018 Workshop on Security in Machine Learning (SecML'18), see
https://github.com/secml2018/secml2018.github.io/raw/master/PASSGAN_SECML2018.pd
- …