Monitoring bias and fairness in machine learning models: A review
Introduction: Machine learning algorithms are quickly gaining traction in both the private and public sectors for their ability to automate both simple and complex decision-making processes. The vast majority of economic sectors, including transportation, retail, advertising, and energy, are being disrupted by widespread data digitization and the emerging technologies that leverage it. Computerized systems are being introduced in government operations to improve accuracy and objectivity, and AI is having an impact on democracy and governance [1].
Unifying Gradients to Improve Real-world Robustness for Deep Networks
The wide application of deep neural networks (DNNs) demands increasing attention to their real-world robustness, i.e., whether a DNN resists black-box adversarial attacks. Among these, score-based query attacks (SQAs) are the most threatening, since they can effectively hurt a victim network with access only to model outputs. Defending against SQAs requires a slight but artful variation of those outputs, because legitimate users of the service see the same output information that SQAs exploit. In this paper, we propose a real-world defense by Unifying Gradients (UniG) of different data, so that SQAs can only probe a much weaker attack direction that is similar across samples. Since such universal attack perturbations have been validated as less aggressive than input-specific perturbations, UniG protects real-world DNNs by presenting attackers with a twisted and less informative attack direction. We implement UniG efficiently as a plug-and-play Hadamard product module. In extensive experiments against 5 SQAs, 2 adaptive attacks, and 7 defense baselines, UniG significantly improves real-world robustness without hurting clean accuracy on CIFAR10 and ImageNet. For instance, UniG maintains 77.80% accuracy under a 2500-query Square attack on CIFAR10, while the state-of-the-art adversarially trained model achieves only 67.34%. At the same time, UniG outperforms all compared baselines in clean accuracy and makes the smallest modification to the model output. The code is released at
https://github.com/snowien/UniG-pytorch
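The Hadamard-product idea can be sketched on a toy two-layer network. This is a hypothetical simplification: the network, its weights, and the shared scaling vector below are illustrative stand-ins, not UniG's learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16))   # toy first layer
W2 = rng.normal(size=(16, 3))   # toy output layer

def victim(x):
    """Toy two-layer network standing in for the protected DNN."""
    return np.maximum(x @ W1, 0.0) @ W2

class UniGModule:
    """Plug-and-play sketch: rescale the hidden features of every sample
    by one SHARED weight vector (a Hadamard product), so score-based
    queries on different inputs probe a similar, unified direction.
    The shared weights here are illustrative, not UniG's learned ones."""
    def __init__(self, shared_w):
        self.w = shared_w                       # shape (16,), sample-agnostic

    def __call__(self, x):
        h = np.maximum(x @ W1, 0.0)             # same features as the victim
        return (h * self.w) @ W2                # Hadamard product, then logits

def fd_grad(f, x, cls, eps=1e-3):
    """Finite-difference gradient estimate, as a score-based query
    attack (SQA) would compute it from output scores alone."""
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (f(x + d)[cls] - f(x - d)[cls]) / (2 * eps)
    return g

x = rng.normal(size=8)
identity = UniGModule(np.ones(16))              # w = 1 leaves the model unchanged
defended = UniGModule(rng.uniform(0.5, 1.5, size=16))
```

With all-ones weights the module is a no-op, which reflects the plug-and-play claim: it can be attached to a trained model, and only the choice of shared weights twists the output information that a query attack observes.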
Determining Sequence of Image Processing Technique (IPT) to Detect Adversarial Attacks
Developing machine learning models that are secure against adversarial examples is challenging, as new methods for generating adversarial attacks are continually being devised. In this work, we propose an evolutionary approach to automatically determine an Image Processing Techniques Sequence (IPTS) for detecting malicious inputs. Accordingly, we first used a diverse set of attack
methods including adaptive attack methods (on our defense) to generate
adversarial samples from the clean dataset. A detection framework based on a
genetic algorithm (GA) is developed to find the optimal IPTS, where the
optimality is estimated by different fitness measures such as Euclidean
distance, entropy loss, average histogram, local binary pattern and loss
functions. The "image difference" between the original and processed images is
used to extract the features, which are then fed to a classification scheme in
order to determine whether the input sample is adversarial or clean. This paper describes our methodology and reports experiments on multiple datasets tested with several adversarial attacks. For each attack type and dataset, the method generates a unique IPTS. A set of IPTS, selected dynamically at test time, acts as a filter against adversarial attacks. Our empirical experiments show promising results, indicating that the approach can be used efficiently as a pre-processing step for any AI model.
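The genetic search over processing sequences can be sketched as follows. The three operations and the single Euclidean "image difference" fitness are hypothetical simplifications of the paper's larger IPT space and multiple fitness measures (entropy loss, histograms, local binary patterns, etc.).

```python
import numpy as np

rng = np.random.default_rng(1)

# Three hypothetical image-processing techniques (IPTs); the paper's
# search space is larger (filters, denoisers, transforms, ...).
def quantize(img):
    return (img // 32) * 32

def bit_squeeze(img):
    return (img // 64) * 64

def smooth(img):
    # 3x3 box blur with replicated edges
    p = np.pad(img, 1, mode="edge")
    return sum(p[i:i + img.shape[0], j:j + img.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

OPS = {"quantize": quantize, "bit_squeeze": bit_squeeze, "smooth": smooth}

def apply_seq(seq, img):
    out = img.astype(float)
    for name in seq:
        out = OPS[name](out)
    return out

def fitness(seq, clean, adv):
    """Euclidean 'image difference' fitness: a good IPTS alters
    adversarial inputs far more than clean ones."""
    return (np.linalg.norm(adv - apply_seq(seq, adv))
            - np.linalg.norm(clean - apply_seq(seq, clean)))

def mutate(seq):
    s = list(seq)
    s[rng.integers(len(s))] = str(rng.choice(list(OPS)))
    return tuple(s)

def ga_search(clean, adv, pop_size=8, seq_len=3, gens=15):
    # One all-smoothing sequence seeds the population as a baseline;
    # elitism then keeps the best fitness from decreasing across generations.
    pop = [("smooth",) * seq_len] + [
        tuple(rng.choice(list(OPS), seq_len)) for _ in range(pop_size - 1)]
    for _ in range(gens):
        pop.sort(key=lambda s: fitness(s, clean, adv), reverse=True)
        elite = pop[:pop_size // 2]
        pop = elite + [mutate(s) for s in elite]
    return max(pop, key=lambda s: fitness(s, clean, adv))

# A smooth 'clean' image and an 'adversarial' copy with added noise.
clean = np.outer(np.linspace(0, 255, 32), np.ones(32))
adv = clean + rng.normal(0, 40, size=clean.shape)
best = ga_search(clean, adv)
```

The detector side would then threshold the image difference produced by the evolved sequence; sequences found for one attack type and dataset would be stored and selected dynamically at test time, as the abstract describes.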
Adversarial Attacks and Defenses in 6G Network-Assisted IoT Systems
The Internet of Things (IoT) and massive IoT systems are key to
sixth-generation (6G) networks due to dense connectivity, ultra-reliability,
low latency, and high throughput. Artificial intelligence, including deep
learning and machine learning, offers solutions for optimizing and deploying
cutting-edge technologies for future radio communications. However, these
techniques are vulnerable to adversarial attacks, leading to degraded
performance and erroneous predictions, outcomes unacceptable for ubiquitous
networks. This survey extensively addresses adversarial attacks and defense
methods in 6G network-assisted IoT systems. The theoretical background and
up-to-date research on adversarial attacks and defenses are discussed.
Furthermore, we provide Monte Carlo simulations to validate the effectiveness
of adversarial attacks compared to jamming attacks. Additionally, we examine
the vulnerability of 6G IoT systems by demonstrating attack strategies
applicable to key technologies, including reconfigurable intelligent surfaces,
massive multiple-input multiple-output (MIMO)/cell-free massive MIMO,
satellites, the metaverse, and semantic communications. Finally, we outline the
challenges and future developments associated with adversarial attacks and
defenses in 6G IoT systems.
Comment: 17 pages, 5 figures, and 4 tables. Submitted for publication.
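The contrast between adversarial and jamming attacks can be illustrated with a toy Monte Carlo experiment: on a hypothetical linear receiver (a stand-in for the DNN-based classifiers the survey considers), a gradient-aligned FGM-style perturbation degrades accuracy far more than random jamming noise of the same power.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy linear "receiver" over 16-dim signal features: a hypothetical
# stand-in for the DNN classifiers used in 6G network-assisted IoT.
w = rng.normal(size=16)
w /= np.linalg.norm(w)

def accuracy(X, y):
    return float(np.mean(np.sign(X @ w) == y))

n_trials, eps = 2000, 0.8            # Monte Carlo trials, perturbation power
X = rng.normal(size=(n_trials, 16))
y = np.sign(X @ w)                   # ground-truth labels of the clean signals

# Jamming: random noise of power eps, oblivious to the model.
jam = rng.normal(size=(n_trials, 16))
jam *= eps / np.linalg.norm(jam, axis=1, keepdims=True)

# Adversarial: gradient-aligned (FGM-style) perturbation of the SAME power.
adv = -eps * y[:, None] * w[None, :]

acc_clean = accuracy(X, y)           # 1.0 by construction
acc_jam = accuracy(X + jam, y)
acc_adv = accuracy(X + adv, y)
```

At equal perturbation power the adversarial perturbation flips every sample whose margin is below eps, while random jamming rarely aligns with the decision boundary, which is the qualitative effect the survey's simulations examine.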
Blacklight: Defending Black-Box Adversarial Attacks on Deep Neural Networks
The vulnerability of deep neural networks (DNNs) to adversarial examples is
well documented. Under the strong white-box threat model, where attackers have
full access to DNN internals, recent work has produced continual advancements
in defenses, often followed by more powerful attacks that break them.
Meanwhile, research on the more realistic black-box threat model has focused
almost entirely on reducing the query-cost of attacks, making them increasingly
practical for ML models already deployed today.
This paper proposes and evaluates Blacklight, a new defense against black-box
adversarial attacks. Blacklight targets a key property of black-box attacks: to
compute adversarial examples, they produce sequences of highly similar images
while trying to minimize the distance from some initial benign input. To detect
an attack, Blacklight computes for each query image a compact set of one-way
hash values that form a probabilistic fingerprint. Variants of an image produce
nearly identical fingerprints, and fingerprint generation is robust against
manipulation. We evaluate Blacklight on 5 state-of-the-art black-box attacks,
across a variety of models and classification tasks. While the most efficient
attacks take thousands or tens of thousands of queries to complete, Blacklight
identifies them all, often after only a handful of queries. Blacklight is also
robust against several powerful countermeasures, including an optimal black-box
attack that approximates white-box attacks in efficiency. Finally, Blacklight
significantly outperforms the only known alternative in both detection coverage
of attack queries and resistance against persistent attackers.
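The fingerprinting idea can be sketched as follows. This is a simplified, hypothetical variant: the real Blacklight hashes salted, overlapping pixel segments and keeps the top-N digests, whereas this sketch hashes fixed, non-overlapping windows of quantized pixels.

```python
import hashlib
import numpy as np

def fingerprint(img, q=16, win=32):
    """Probabilistic fingerprint sketch: quantize pixels so that small
    perturbations vanish, then apply a one-way hash to fixed-size
    windows of the flattened image."""
    flat = (img.astype(np.int64) // q).astype(np.uint8).tobytes()
    return {hashlib.sha256(flat[i:i + win]).hexdigest()
            for i in range(0, len(flat) - win + 1, win)}

def match_score(fp_a, fp_b):
    """Fraction of shared hashes; near-duplicate images produce nearly
    identical fingerprints, unrelated images share almost none."""
    return len(fp_a & fp_b) / max(len(fp_a), 1)

class BlacklightLikeDetector:
    """Flags a query as part of an attack if its fingerprint is too
    similar to any previously seen query (a highly similar image
    sequence is the black-box attack property Blacklight targets)."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.history = []            # fingerprints of past queries

    def query(self, img):
        fp = fingerprint(img)
        attack = any(match_score(fp, old) >= self.threshold
                     for old in self.history)
        self.history.append(fp)
        return attack
```

A first benign query passes; a slightly perturbed repeat of it (as an iterative black-box attack would submit) collides with the stored fingerprint and is flagged, while an unrelated image is not.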