19 research outputs found
FENCE: Feasible Evasion Attacks on Neural Networks in Constrained Environments
As advances in Deep Neural Networks (DNNs) demonstrate unprecedented levels
of performance in many critical applications, their vulnerability to attacks is
still an open question. We consider evasion attacks at the testing time against
Deep Learning in constrained environments, in which dependencies between
features need to be satisfied. These situations may arise naturally in tabular
data or may be the result of feature engineering in specific application
domains, such as threat detection. We propose a general iterative
gradient-based framework called FENCE for crafting evasion attacks that take
into consideration the specifics of constrained domains. We apply it against
Feed-Forward Neural Networks in two threat detection applications: network
traffic botnet classification and malicious domain classification, to generate
feasible adversarial examples. We extensively evaluate the success rate and
performance of our attacks, demonstrate their significant improvement over several
baselines, and analyze several factors that impact the attack success rate,
including the optimization objective and the data imbalance. We show that with
minimal effort (e.g., generating 12 additional network connections), an
attacker can change the model's prediction to the target one. We found that
models trained on datasets with higher imbalance are more vulnerable to our
FENCE attacks. Finally, we show the potential of adversarial training in
constrained domains to increase DNN resilience against these attacks.
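The core loop of such a constrained gradient-based attack can be sketched as follows. This is a minimal illustration, not the FENCE implementation: the logistic surrogate model, the feature bounds, and the single dependency rule (feature 1, e.g. total bytes, must be at least ten times feature 0, e.g. connection count) are hypothetical stand-ins for real domain constraints.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def project(x, lo, hi):
    """Project a candidate back into the feasible region. The dependency
    rule here (x[1] >= 10 * x[0]) is a hypothetical stand-in for real
    inter-feature constraints."""
    x = np.clip(x, lo, hi)
    x[1] = max(x[1], 10.0 * x[0])
    return x

def constrained_evasion(x0, w, b, target=0, steps=200, lr=1.0, lo=0.0, hi=1000.0):
    """Iterative gradient attack on a logistic model p = sigmoid(w.x + b):
    descend the loss toward the target class, projecting after each step."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        p = sigmoid(w @ x + b)          # current P(class 1)
        grad = (p - target) * w         # d(cross-entropy)/dx for the target label
        x = project(x - lr * grad, lo, hi)
        if int(sigmoid(w @ x + b) >= 0.5) == target:
            break                        # prediction flipped; stop early
    return x
```

On a toy model like this, flipping the prediction while keeping the dependency satisfied typically takes only a few dozen iterations; the projection step is what distinguishes constrained attacks from plain gradient descent on the input.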
Trust, But Verify: A Survey of Randomized Smoothing Techniques
Machine learning models have demonstrated remarkable success across diverse
domains but remain vulnerable to adversarial attacks. Empirical defence
mechanisms often fall short, as new attacks constantly emerge, rendering
existing defences obsolete. In response, a paradigm shift from empirical
defences to certification-based defences has been observed. Among these
advancements, randomized smoothing has emerged as a promising technique. This
study reviews the theoretical foundations, empirical effectiveness, and
applications of randomized smoothing in verifying machine learning classifiers.
We provide an in-depth exploration of the fundamental concepts underlying
randomized smoothing, highlighting its theoretical guarantees in certifying
robustness against adversarial perturbations. Additionally, we discuss the
challenges of existing methodologies and offer insightful perspectives on
potential solutions. This paper is novel in its attempt to systematise the
existing knowledge in the context of randomized smoothing.
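A minimal Monte Carlo sketch of randomized smoothing certification, in the spirit of the standard Cohen-style procedure, might look like the following. The base classifier, noise level, and the simple normal-approximation confidence bound are illustrative assumptions, not any particular surveyed method.

```python
import numpy as np
from statistics import NormalDist

def smoothed_predict_and_certify(f, x, sigma=0.25, n=500, alpha=0.001, seed=0):
    """Monte Carlo randomized smoothing: take the majority vote of the base
    classifier f under Gaussian noise, then certify an L2 radius from a
    lower confidence bound on the top-class probability."""
    rng = np.random.default_rng(seed)
    counts = {}
    for _ in range(n):
        c = f(x + rng.normal(0.0, sigma, size=x.shape))
        counts[c] = counts.get(c, 0) + 1
    top = max(counts, key=counts.get)
    p_hat = counts[top] / n
    # Simple normal-approximation lower bound on the vote probability;
    # a production certifier would use an exact Clopper-Pearson bound.
    p_low = p_hat - NormalDist().inv_cdf(1.0 - alpha) * np.sqrt(p_hat * (1.0 - p_hat) / n)
    p_low = min(p_low, 1.0 - 1e-6)       # keep inv_cdf finite
    if p_low <= 0.5:
        return top, 0.0                  # abstain: no certified radius
    return top, sigma * NormalDist().inv_cdf(p_low)
```

Larger sigma certifies larger radii but degrades the base classifier's accuracy under noise, which is the central trade-off the surveyed literature grapples with.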
Towards understanding the robustness against evasion attack on categorical inputs
Characterizing and assessing the adversarial risk of a classifier with categorical inputs has been a practically important yet rarely explored research problem. Conventional wisdom attributes the difficulty of solving the problem to its combinatorial nature. Previous research efforts tackling this problem are specific to use cases and heavily depend on domain knowledge. Such limitations prevent their general applicability in real-world applications with categorical data. Our study shows, for the first time, that provably optimal adversarial robustness assessment is computationally feasible for any classifier with a mild smoothness constraint. We theoretically analyze the impact factors of adversarial vulnerability of a classifier with categorical inputs via an information-theoretic adversarial risk analysis. Corroborating these theoretical findings with a substantial experimental study over various real-world categorical datasets, we can empirically assess the impact of the key adversarial risk factors on a targeted learning system with categorical inputs.
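A common baseline for the combinatorial search this abstract alludes to is a greedy substitution attack over categorical features. The toy scoring function and per-feature vocabulary below are hypothetical illustrations, not the paper's provably optimal assessment method:

```python
def greedy_categorical_attack(score, x, vocab, budget=2):
    """Greedy baseline for evasion on categorical inputs: at each step,
    try every single-feature substitution and keep the one that lowers
    the classifier's score the most, up to `budget` substitutions."""
    x = list(x)
    for _ in range(budget):
        best_i, best_v, best_s = None, None, score(x)
        for i in range(len(x)):
            for v in vocab[i]:
                if v == x[i]:
                    continue
                cand = x[:i] + [v] + x[i + 1:]
                s = score(cand)
                if s < best_s:
                    best_i, best_v, best_s = i, v, s
        if best_i is None:               # no substitution helps; stop
            break
        x[best_i] = best_v
    return x
```

Greedy search like this is cheap but offers no optimality guarantee, which is precisely the gap the paper's provable assessment addresses.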
Intriguing Properties of Adversarial ML Attacks in the Problem Space
Recent research efforts on adversarial ML have investigated problem-space
attacks, focusing on the generation of real evasive objects in domains where,
unlike images, there is no clear inverse mapping to the feature space (e.g.,
software). However, the design, comparison, and real-world implications of
problem-space attacks remain underexplored. This paper makes two major
contributions. First, we propose a novel formalization for adversarial ML
evasion attacks in the problem-space, which includes the definition of a
comprehensive set of constraints on available transformations, preserved
semantics, robustness to preprocessing, and plausibility. We shed light on the
relationship between feature space and problem space, and we introduce the
concept of side-effect features as the byproduct of the inverse feature-mapping
problem. This enables us to define and prove necessary and sufficient
conditions for the existence of problem-space attacks. We further demonstrate
the expressive power of our formalization by using it to describe several
attacks from related literature across different domains. Second, building on
our formalization, we propose a novel problem-space attack on Android malware
that overcomes past limitations. Experiments on a dataset with 170K Android
apps from 2017 and 2018 show the practical feasibility of evading a
state-of-the-art malware classifier along with its hardened version. Our
results demonstrate that "adversarial-malware as a service" is a realistic
threat, as we automatically generate thousands of realistic and inconspicuous
adversarial applications at scale, where on average it takes only a few minutes
to generate an adversarial app. Our formalization of problem-space attacks
paves the way to more principled research in this domain.
Comment: This arXiv version (v2) corresponds to the one published at the IEEE
Symposium on Security & Privacy (Oakland), 2020.
Privacy and security in cyber-physical systems
Data privacy has attracted increasing attention in the past decade due to the emerging technologies that require our data to provide utility. Service providers (SPs) encourage users to share their personal data in return for a better user experience. However, users' raw data usually contains implicit sensitive information that can be inferred by a third party. This raises great concern about users' privacy.
In this dissertation, we develop novel techniques to achieve a better privacy-utility trade-off (PUT) in various applications. We first consider smart meter (SM) privacy and employ physical resources to minimize the information leakage to the SP through SM readings. We measure privacy using information-theoretic metrics and find private data release policies (PDRPs) by formulating the problem as a Markov decision process (MDP).
We also propose noise injection techniques for time-series data privacy. We characterize optimal PDRPs measuring privacy via mutual information (MI) and utility loss via added distortion. Reformulating the problem as an MDP, we solve it using deep reinforcement learning (DRL) for real location trace data.
We also consider a scenario for hiding an underlying "sensitive" variable and revealing a "useful" variable for utility by periodically selecting from among sensors to share the measurements with an SP.
We formulate this as an optimal stopping problem and solve it using DRL. We then consider privacy-aware communication over a wiretap channel. We maximize the information delivered to the legitimate receiver, while minimizing the information leakage from the sensitive attribute to the eavesdropper.
We propose using a variational-autoencoder (VAE) and validate our approach with colored and annotated MNIST dataset.
Finally, we consider defenses against active adversaries in the context of security-critical applications. We propose an adversarial example (AE) generation method exploiting the data distribution. We perform adversarial training using the proposed AEs and evaluate the performance against real-world adversarial attacks.
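A minimal adversarial-training loop in the general spirit described above can be sketched as follows. The logistic-regression model, the one-step FGSM attack, and all hyperparameters are illustrative assumptions, not the dissertation's distribution-based AE method:

```python
import numpy as np

def fgsm(x, y, w, b, eps):
    """One-step FGSM perturbation against a logistic-regression model."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    grad_x = (p - y)[:, None] * w        # d(cross-entropy)/dx per sample
    return x + eps * np.sign(grad_x)

def adversarial_train(x, y, eps=0.1, lr=0.5, epochs=200, seed=0):
    """Train on FGSM adversarial examples regenerated against the
    current model at every epoch (a minimal adversarial-training loop)."""
    rng = np.random.default_rng(seed)
    w, b = 0.01 * rng.normal(size=x.shape[1]), 0.0
    for _ in range(epochs):
        x_adv = fgsm(x, y, w, b, eps)    # worst-case inputs for current model
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
        w -= lr * (x_adv.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b
```

The key design point is that the adversarial examples are regenerated inside the loop against the evolving model, so the model is always trained on its own current worst case.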
Towards private and robust machine learning for information security
Many problems in information security are pattern recognition problems. For example, determining if a digital communication can be trusted amounts to certifying that the communication does not carry malicious or secret content, which can be distilled into the problem of recognising the difference between benign and malicious content. At a high level, machine learning is the study of how patterns are formed within data, and how learning these patterns generalises beyond the potentially limited data pool at a practitioner's disposal, and so has become a powerful tool in information security. In this work, we study the benefits machine learning can bring to two problems in information security. Firstly, we show that machine learning can be used to detect which websites are visited by an internet user over an encrypted connection. By analysing timing and packet size information of encrypted network traffic, we train a machine learning model that predicts the target website given a stream of encrypted network traffic, even if browsing is performed over an anonymous communication network. Secondly, in addition to studying how machine learning can be used to design attacks, we study how it can be used to solve the problem of hiding information within a cover medium, such as an image or an audio recording, which is commonly referred to as steganography. How well an algorithm can hide information within a cover medium amounts to how well the algorithm models and exploits areas of redundancy. This can again be reduced to a pattern recognition problem, and so we apply machine learning to design a steganographic algorithm that efficiently hides a secret message within an image. Following this, we proceed with discussions surrounding why machine learning is not a panacea for information security, and can be an attack vector in and of itself.
We show that machine learning can leak private and sensitive information about the data it used to learn, and how malicious actors can exploit vulnerabilities in these learning algorithms to compel them to exhibit adversarial behaviours. Finally, we examine the problem of the disconnect between image recognition systems learned by humans and by machine learning models. While human classification of an image is relatively robust to noise, machine learning models do not possess this property. We show how an attacker can cause targeted misclassifications against an entire data distribution by exploiting this property, and go on to introduce a mitigation that ameliorates this undesirable trait of machine learning.
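As a toy illustration of hiding data in a cover medium's redundancy, classic least-significant-bit (LSB) embedding flips only the lowest bit of each cover byte. This is the textbook baseline, not the learned steganographic algorithm described above:

```python
def lsb_embed(cover, bits):
    """Overwrite the least-significant bit of each cover byte with a
    message bit; each byte changes by at most 1."""
    return [(byte & ~1) | bit for byte, bit in zip(cover, bits)]

def lsb_extract(stego, n_bits):
    """Read the message back from the low bits of the stego bytes."""
    return [byte & 1 for byte in stego[:n_bits]]
```

Because the low bit of natural image data is nearly random, such changes are visually imperceptible, but they leave statistical traces that learned embedding schemes aim to minimise.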
Health privacy : methods for privacy-preserving data sharing of methylation, microbiome and eye tracking data
This thesis studies the privacy risks of biomedical data and develops mechanisms for privacy-preserving data sharing. The contribution of this work is two-fold: First, we demonstrate privacy risks of a variety of biomedical data types such as DNA methylation data, microbiome data and eye tracking data. Despite being less stable than well-studied genome data and more prone to environmental changes, well-known privacy attacks can be adapted to threaten the privacy of data donors. Nevertheless, data sharing is crucial to advance biomedical research, given that collecting the data of a sufficiently large population is complex and costly. Therefore, as a second step, we equip researchers with privacy-preserving tools that enable the sharing of such biomedical data. These tools are mostly based on differential privacy, machine learning techniques and adversarial examples, and are carefully tuned to the concrete use case to maintain data utility while preserving privacy.
Efficient Neural Network Verification Using Branch and Bound
Neural networks have demonstrated great success in modern machine learning systems. However, they remain susceptible to incorrect corner-case behaviors, often behaving unpredictably and producing surprisingly wrong results. Therefore, it is desirable to formally guarantee their trustworthiness for certain robustness properties when they are applied to safety-/security-sensitive systems like autonomous vehicles and aircraft. Unfortunately, the task is extremely challenging due to the complexity of neural networks, and traditional formal methods were not efficient enough to verify practical properties. Recently, a Branch and Bound (BaB) framework has been extended to neural network verification, showing great success in accelerating verification.
This dissertation focuses on state-of-the-art neural network verifiers using BaB. We first introduce two efficient neural network verifiers, ReluVal and Neurify, using basic BaB approaches that involve two main steps: (1) they recursively split the original verification problem into easier, independent subproblems by splitting inputs or hidden neurons; (2) for each subproblem, we propose an efficient and tight bound propagation method called symbolic interval analysis, which produces sound estimated output bounds using convex linear relaxations. Both ReluVal and Neurify are three orders of magnitude faster than previous state-of-the-art formal analysis systems on standard verification benchmarks.
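The flavor of such bound propagation can be illustrated with naive interval arithmetic. This sketch is deliberately simpler than the symbolic interval analysis described above, which stays tighter by keeping symbolic dependencies between neurons; the network weights in the test are hypothetical.

```python
import numpy as np

def interval_bound_propagation(layers, lo, hi):
    """Push an input box [lo, hi] through an MLP with ReLU activations,
    returning sound (if loose) lower/upper bounds on every output.
    `layers` is a list of (W, b) pairs."""
    for i, (W, b) in enumerate(layers):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = W_pos @ lo + W_neg @ hi + b   # worst case chosen per weight sign
        new_hi = W_pos @ hi + W_neg @ lo + b
        if i < len(layers) - 1:                # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi
```

Soundness means every concrete input in the box maps to an output inside the returned bounds; the looseness of these bounds is exactly what symbolic relaxations and BaB splitting work to reduce.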
However, basic BaB approaches like Neurify have to encode each subproblem as a Linear Programming (LP) problem and solve it with expensive LP solvers, significantly limiting the overall efficiency. This is because each step of BaB introduces neuron split constraints (e.g., a ReLU neuron larger or smaller than 0), which are hard to handle with existing efficient bound propagation methods. We propose novel designs of bound propagation methods, α-CROWN and its improved variant β-CROWN, which solve the verification problem by optimizing Lagrangian multipliers with gradient ascent, without calling any expensive LP solvers. They are built on the previous work CROWN, a generalized efficient bound propagation method using linear relaxation. BaB verification using α-CROWN and β-CROWN can not only provide tighter output estimations than most bound propagation methods but also fully leverage GPU acceleration with massive parallelization.
Combining our methods with BaB empowers the state-of-the-art verifier α,β-CROWN (alpha-beta-CROWN), the winning tool of the second International Verification of Neural Networks Competition (VNN-COMP 2021) with the highest total score. α,β-CROWN can be three orders of magnitude faster than LP-solver-based BaB verifiers and is notably faster than all existing approaches on GPUs. Recently, we further generalized β-CROWN and proposed an efficient iterative approach that can tighten all intermediate layer bounds under neuron split constraints, strengthening bound tightness without LP solvers. This new approach can greatly improve the efficiency of α,β-CROWN, especially on several challenging benchmarks.
Lastly, we study verifiable training, which incorporates verification properties into training procedures to enhance the verifiable robustness of trained models and scale verification to larger models and datasets. We propose two general verifiable training frameworks: (1) MixTrain, which can significantly improve verifiable training efficiency and scalability, and (2) adaptive verifiable training, which improves verified robustness by accounting for label similarity. The combination of verifiable training and BaB-based verifiers opens promising directions for more efficient and scalable neural network verification.