Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks
Transferability captures the ability of an attack against a machine-learning
model to be effective against a different, potentially unknown, model.
Empirical evidence for transferability has been shown in previous work, but the
underlying reasons why an attack transfers or not are not yet well understood.
In this paper, we present a comprehensive analysis aimed at investigating the
transferability of both test-time evasion and training-time poisoning attacks.
We provide a unifying optimization framework for evasion and poisoning attacks,
and a formal definition of transferability of such attacks. We highlight two
main factors contributing to attack transferability: the intrinsic adversarial
vulnerability of the target model, and the complexity of the surrogate model
used to optimize the attack. Based on these insights, we define three metrics
that impact an attack's transferability. Interestingly, our results derived
from theoretical analysis hold for both evasion and poisoning attacks, and are
confirmed experimentally using a wide range of linear and non-linear
classifiers and datasets.
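The notion of transferability described in this abstract can be illustrated with a toy example. The following is a minimal sketch, not the paper's framework or metrics: it assumes linear classifiers, an L-infinity evasion step, and hypothetical weights and data points chosen only for illustration. It crafts perturbations against a surrogate and measures how often they also fool a separate target.

```python
# Illustrative sketch only: toy linear models, not the paper's experimental setup.

def score(w, b, x):
    """Linear decision function f(x) = w.x + b; the predicted label is sign(f)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    return (v > 0) - (v < 0)


def evade(w_surrogate, x, y, eps):
    """One L-infinity evasion step crafted against the surrogate.

    For a linear model the gradient of f with respect to x is w, so shifting
    each feature by eps against y * sign(w) lowers the margin y * f(x) most.
    """
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w_surrogate)]


def transfer_rate(surrogate, target, data, eps):
    """Fraction of points the target gets right before the attack but wrong
    after applying the perturbation crafted on the surrogate."""
    (ws, _bs), (wt, bt) = surrogate, target
    hits, total = 0, 0
    for x, y in data:
        if y * score(wt, bt, x) <= 0:
            continue  # target was already wrong; not counted
        total += 1
        x_adv = evade(ws, x, y, eps)
        if y * score(wt, bt, x_adv) <= 0:
            hits += 1
    return hits / total if total else 0.0


# Hypothetical models and data (assumptions made for this example).
surrogate = ([1.0, 0.5], 0.0)
aligned_target = ([0.9, 0.6], 0.0)      # gradient close to the surrogate's
misaligned_target = ([1.0, -0.5], 0.0)  # gradient partly opposed to it
data = [([1.0, 1.0], 1), ([-1.0, -0.5], -1), ([0.5, 1.5], 1), ([-1.5, -1.0], -1)]

aligned_rate = transfer_rate(surrogate, aligned_target, data, eps=1.2)
misaligned_rate = transfer_rate(surrogate, misaligned_target, data, eps=1.2)
print(aligned_rate, misaligned_rate)  # 0.75 and roughly 0.33
```

In this toy setting the attack transfers more often to the target whose weight vector points in a similar direction to the surrogate's. This is only suggestive and says nothing about the non-linear classifiers studied in the paper.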
NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation
Complex computational models are often designed to simulate real-world
physical phenomena in many scientific disciplines. However, these simulation
models tend to be computationally very expensive and involve a large number of
simulation input parameters which need to be analyzed and properly calibrated
before the models can be applied for real scientific studies. We propose a
visual analysis system to facilitate interactive exploratory analysis of
high-dimensional input parameter space for a complex yeast cell polarization
simulation. The proposed system can assist the computational biologists, who
designed the simulation model, to visually calibrate the input parameters by
modifying the parameter values and immediately visualizing the predicted
simulation outcome without having the need to run the original expensive
simulation for every instance. Our proposed visual analysis system is driven by
a trained neural network-based surrogate model as the backend analysis
framework. Surrogate models are widely used in the field of simulation sciences
to efficiently analyze computationally expensive simulation models. In this
work, we demonstrate the advantage of using neural networks as surrogate models
for visual analysis by incorporating some of the recent advances in the field
of uncertainty quantification, interpretability and explainability of neural
network-based models. We utilize the trained network to perform interactive
parameter sensitivity analysis of the original simulation at multiple
levels-of-detail as well as recommend optimal parameter configurations using
the activation maximization framework of neural networks. We also facilitate
detailed analysis of the trained network to extract useful insights about the
simulation model learned by the network during the training process.
Comment: Published at IEEE Transactions on Visualization and Computer Graphics
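The abstract mentions recommending parameter configurations through activation maximization. As a rough illustration of that idea, the sketch below runs projected gradient ascent on the inputs of a tiny fixed network. The weights `W`, `B`, `V`, the `[0, 1]` parameter bounds, and the step size are all hypothetical, and nothing here reflects the actual NNVA architecture.

```python
import math

# Hypothetical tiny surrogate: f(x) = sum_j V[j] * tanh(W[j] . x + B[j]).
W = [[1.0, -1.0], [0.5, 0.5]]
B = [0.0, 0.0]
V = [1.0, 1.0]


def hidden(x):
    """Hidden-layer activations for input parameters x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, B)]


def predict(x):
    """Scalar surrogate output whose value we want to maximize."""
    return sum(v * h for v, h in zip(V, hidden(x)))


def input_gradient(x):
    """Gradient of predict(x) with respect to the inputs, via the chain rule."""
    h = hidden(x)
    return [sum(V[j] * (1.0 - h[j] ** 2) * W[j][i] for j in range(len(W)))
            for i in range(len(x))]


def maximize_activation(x0, lr=0.1, steps=500, low=0.0, high=1.0):
    """Gradient ascent on the inputs, clipped to the allowed parameter range."""
    x = list(x0)
    for _ in range(steps):
        g = input_gradient(x)
        x = [min(high, max(low, xi + lr * gi)) for xi, gi in zip(x, g)]
    return x


start = [0.5, 0.5]
best = maximize_activation(start)
print(best, predict(start), predict(best))
```

The network weights stay frozen and only the inputs are updated, which is what distinguishes activation maximization from ordinary training.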
Adherence and Constancy in LIME-RS Explanations for Recommendation
Explainable Recommendation has attracted a lot of attention due to a renewed interest in explainable artificial intelligence. In
particular, post-hoc approaches have proved to be the most easily applicable ones to increasingly complex recommendation
models, which are then treated as black boxes. The most recent literature has shown that for post-hoc explanations based
on local surrogate models, there are problems related to the robustness of the approach itself. This consideration becomes
even more relevant in human-related tasks like recommendation. The explanation also has the arduous task of enhancing
increasingly relevant aspects of user experience such as transparency or trustworthiness. This paper aims to show how
the characteristics of a classical post-hoc model based on surrogates are strongly model-dependent and do not prove to be
accountable for the explanations generated.
The authors acknowledge partial support of PID2019-108965GB-I00, PON ARS01_00876 BIO-D, Casa delle Tecnologie
Emergenti della Città di Matera, PON ARS01_00821 FLET4.0, PIA Servizi Locali 2.0, H2020 Passapartout - Grant n. 101016956, PIA ERP4.0, and IPZS-PRJ4_IA_NORMATIV
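The robustness concern raised for local surrogate explanations can be seen in a small experiment. The sketch below is a generic LIME-style procedure, not LIME-RS itself: it samples perturbations around a point, weights them by proximity, and fits a weighted linear model. The black-box functions, kernel width, and sample count are assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_surrogate(black_box, x0, seed, n_samples=200, sigma=0.5, width=1.0):
    """Fit a proximity-weighted linear model around x0.

    Returns [intercept, coef_1, ..., coef_d] from weighted least squares.
    """
    rng = random.Random(seed)
    d = len(x0) + 1
    xtwx = [[0.0] * d for _ in range(d)]
    xtwy = [0.0] * d
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, sigma) for xi in x0]
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x0))
        weight = math.exp(-dist2 / width ** 2)  # closer samples count more
        row = [1.0] + z
        y = black_box(z)
        for i in range(d):
            xtwy[i] += weight * row[i] * y
            for j in range(d):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)


def linear_model(z):      # hypothetical black box that is exactly linear
    return 3.0 * z[0] - 2.0 * z[1]


def nonlinear_model(z):   # hypothetical black box with curvature
    return math.sin(3.0 * z[0]) * z[1] ** 2


x0 = [0.5, -1.0]
exact = local_surrogate(linear_model, x0, seed=0)
run_a = local_surrogate(nonlinear_model, x0, seed=0)
run_b = local_surrogate(nonlinear_model, x0, seed=1)
print(exact, run_a, run_b)
```

For the linear black box the surrogate recovers the true coefficients regardless of the sample. For the non-linear one, the two seeds yield different coefficients for the same point, a simple instance of the instability that motivates studying the constancy of such explanations.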
Context-aware feature attribution through argumentation
Feature attribution is a fundamental task in both machine learning and data
analysis, which involves determining the contribution of individual features or
variables to a model's output. This process helps identify the most important
features for predicting an outcome. The history of feature attribution methods
can be traced back to Generalized Additive Models (GAMs), which extend linear
regression models by incorporating non-linear relationships between dependent
and independent variables. In recent years, gradient-based methods and
surrogate models have been applied to unravel complex Artificial Intelligence
(AI) systems, but these methods have limitations. GAMs tend to achieve lower
accuracy, gradient-based methods can be difficult to interpret, and surrogate
models often suffer from stability and fidelity issues. Furthermore, most
existing methods do not consider users' contexts, which can significantly
influence their preferences. To address these limitations and advance the
current state-of-the-art, we define a novel feature attribution framework
called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our
framework harnesses the power of argumentation by treating each feature as an
argument that can either support, attack or neutralize a prediction.
Additionally, CA-FATA formulates feature attribution as an argumentation
procedure, and each computation has explicit semantics, which makes it
inherently interpretable. CA-FATA also easily integrates side information, such
as users' contexts, resulting in more accurate predictions.
Ethical Adversaries: Towards Mitigating Unfairness with Adversarial Machine Learning
Machine learning is being integrated into a growing number of critical
systems with far-reaching impacts on society. Unexpected behaviour and unfair
decision processes are coming under increasing scrutiny due to this widespread
use and its theoretical considerations. Individuals, as well as organisations,
notice, test, and criticize unfair results to hold model designers and
deployers accountable. We offer a framework that assists these groups in
mitigating unfair representations stemming from the training datasets. Our
framework relies on two inter-operating adversaries to improve fairness. First,
a model is trained with the goal of preventing the guessing of protected
attributes' values while limiting utility losses. This first step optimizes the
model's parameters for fairness. Second, the framework leverages evasion
attacks from adversarial machine learning to generate new examples that will be
misclassified. These new examples are then used to retrain and improve the
model in the first step. These two steps are iteratively applied until a
significant improvement in fairness is obtained. We evaluated our framework on
well-studied datasets in the fairness literature -- including COMPAS -- where
it can surpass other approaches concerning demographic parity, equality of
opportunity and also the model's utility. We also illustrate our findings on
the subtle difficulties when mitigating unfairness and highlight how our
framework can assist model designers.
Comment: 15 pages, 3 figures, 1 table
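The two fairness criteria named in this abstract have simple standard definitions, sketched below for a binary classifier and a binary protected attribute. The function names and the toy predictions are hypothetical and are not taken from the paper or its COMPAS evaluation.

```python
def positive_rate(preds, mask):
    """Share of positive predictions among the rows selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    return sum(selected) / len(selected)


def demographic_parity_gap(preds, group):
    """|P(pred = 1 | group = 0) - P(pred = 1 | group = 1)|."""
    rate0 = positive_rate(preds, [g == 0 for g in group])
    rate1 = positive_rate(preds, [g == 1 for g in group])
    return abs(rate0 - rate1)


def equal_opportunity_gap(preds, labels, group):
    """Absolute difference in true positive rates between the two groups."""
    tpr0 = positive_rate(preds, [g == 0 and y == 1 for g, y in zip(group, labels)])
    tpr1 = positive_rate(preds, [g == 1 and y == 1 for g, y in zip(group, labels)])
    return abs(tpr0 - tpr1)


# Toy data (assumed for illustration): predictions, true labels, protected group.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
labels = [1, 0, 1, 0, 1, 0, 1, 1]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_gap(preds, group)
eo_gap = equal_opportunity_gap(preds, labels, group)
print(dp_gap, eo_gap)  # 0.5 and roughly 0.667
```

A gap of zero would mean the criterion is met exactly on this sample; larger values indicate a bigger disparity between the two groups.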