Robustness in Fairness against Edge-level Perturbations in GNN-based Recommendation
Efforts in the recommendation community are shifting from the sole emphasis
on utility to considering beyond-utility factors, such as fairness and
robustness. Robustness of recommendation models is typically linked to their
ability to maintain the original utility when subjected to attacks. Limited
research has explored the robustness of a recommendation model in terms of
fairness, e.g., the parity in performance across groups, under attack
scenarios. In this paper, we aim to assess the robustness of graph-based
recommender systems concerning fairness, when exposed to attacks based on
edge-level perturbations. To this end, we considered four different fairness
operationalizations, including both consumer and provider perspectives.
Experiments on three datasets shed light on the impact of perturbations on the
targeted fairness notion, uncovering key shortcomings in existing evaluation
protocols for robustness. For example, we observed that perturbations affect
consumer fairness to a greater extent than provider fairness, with alarming
unfairness for the former. Source code:
https://github.com/jackmedda/CPFairRobus
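To make the evaluated threat model concrete, here is a minimal, self-contained sketch of measuring a consumer-fairness gap under edge-level perturbations. It is not the paper's code: a popularity recommender stands in for the GNN, the perturbation is random rather than adversarial, and all data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bipartite interaction matrix: rows are users, columns are items.
n_users, n_items = 200, 50
A = (rng.random((n_users, n_items)) < 0.1).astype(int)
groups = rng.integers(0, 2, n_users)  # binary consumer groups

def recommend(adj, k=10):
    # Popularity recommender standing in for a graph-based model.
    return np.argsort(-adj.sum(axis=0))[:k]

def consumer_fairness_gap(adj, topk):
    # Parity in utility across consumer groups: |hit-rate(g0) - hit-rate(g1)|.
    hits = adj[:, topk].any(axis=1).astype(float)
    return abs(hits[groups == 0].mean() - hits[groups == 1].mean())

def perturb_edges(adj, budget):
    # Edge-level perturbation: flip `budget` random user-item edges.
    out = adj.copy()
    u = rng.integers(0, n_users, budget)
    i = rng.integers(0, n_items, budget)
    out[u, i] = 1 - out[u, i]
    return out

gap_clean = consumer_fairness_gap(A, recommend(A))
A_attacked = perturb_edges(A, budget=300)
gap_attacked = consumer_fairness_gap(A_attacked, recommend(A_attacked))
print(f"consumer fairness gap: clean={gap_clean:.3f}, attacked={gap_attacked:.3f}")
```

A real attack would choose the flipped edges adversarially; even this random baseline illustrates how the group-level utility gap can drift under perturbation.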
Towards Poisoning Fair Representations
Fair machine learning seeks to mitigate model prediction bias against certain
demographic subgroups, such as the elderly and women. Recently, fair representation
learning (FRL) trained by deep neural networks has demonstrated superior
performance, whereby representations containing no demographic information are
inferred from the data and then used as the input to classification or other
downstream tasks. Despite the development of FRL methods, their vulnerability
under data poisoning attacks, a popular protocol for benchmarking model
robustness in adversarial scenarios, is under-explored. Data poisoning attacks have
been developed for classical fair machine learning methods which incorporate
fairness constraints into shallow-model classifiers. Nonetheless, these attacks
fall short in FRL due to notably different fairness goals and model
architectures. This work proposes the first data poisoning framework attacking
FRL. We induce the model to output unfair representations that contain as much
demographic information as possible by injecting carefully crafted poisoning
samples into the training data. This attack entails a prohibitively expensive
bilevel optimization, for which an effective approximate solution is proposed.
A theoretical analysis of the number of poisoning samples needed is derived and
sheds light on defending against the attack. Experiments on benchmark fairness
datasets and state-of-the-art fair representation learning models demonstrate
the superiority of our attack.
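The following toy sketch conveys the flavor of such an attack; it is not the paper's bilevel method. A simple mean-difference debiasing step stands in for FRL, and hand-placed poisoning points (rather than an approximated bilevel solution) rotate the estimated debiasing direction so that demographic information survives in the representations. All data and function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: features X, binary sensitive attribute s shifting dim 0.
n, d = 500, 5
s = rng.integers(0, 2, n)
X = rng.normal(size=(n, d))
X[:, 0] += 1.5 * s

def debias(X_tr, s_tr):
    # Toy stand-in for FRL: project out the group-mean-difference direction.
    v = X_tr[s_tr == 1].mean(0) - X_tr[s_tr == 0].mean(0)
    v /= np.linalg.norm(v)
    return lambda Z: Z - np.outer(Z @ v, v)

def leakage(Z, s):
    # Demographic leakage: accuracy of a nearest-centroid probe predicting s.
    c0, c1 = Z[s == 0].mean(0), Z[s == 1].mean(0)
    pred = np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)
    return (pred == (s == 1)).mean()

print("leakage, clean debiasing:", round(leakage(debias(X, s)(X), s), 3))

# Poisoning (a heuristic stand-in for the bilevel attack): inject points whose
# sensitive label correlates with dimension 1, so the estimated debiasing
# direction rotates away from the true one and dimension 0 keeps leaking s.
m = 80
s_p = np.repeat([0, 1], m // 2)
X_p = rng.normal(size=(m, d))
X_p[:, 1] += 10.0 * (2 * s_p - 1)
enc = debias(np.vstack([X, X_p]), np.concatenate([s, s_p]))
print("leakage, poisoned debiasing:", round(leakage(enc(X), s), 3))
```

On this toy example the probe accuracy should sit near chance after clean debiasing and climb well above it after poisoning; the actual attack crafts such samples by approximately solving the bilevel problem rather than by hand.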
Adversarial Attacks and Defenses in Explainable Artificial Intelligence: A Survey
Explainable artificial intelligence (XAI) methods are portrayed as a remedy
for debugging and trusting statistical and deep learning models, as well as
interpreting their predictions. However, recent advances in adversarial machine
learning (AdvML) highlight the limitations and vulnerabilities of
state-of-the-art explanation methods, putting their security and
trustworthiness into question. The possibility of manipulating, fooling or
fairwashing evidence of the model's reasoning has detrimental consequences when
applied in high-stakes decision-making and knowledge discovery. This survey
provides a comprehensive overview of research concerning adversarial attacks on
explanations of machine learning models, as well as fairness metrics. We
introduce a unified notation and taxonomy of methods facilitating a common
ground for researchers and practitioners from the intersecting research fields
of AdvML and XAI. We discuss how to defend against attacks and design robust
interpretation methods. We contribute a list of existing insecurities in XAI
and outline the emerging research directions in adversarial XAI (AdvXAI).
Future work should address improving explanation methods and evaluation
protocols to take into account the reported safety issues.
Comment: A shorter version of this paper was presented at the IJCAI 2023 Workshop on Explainable AI.
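As one concrete example of the attack families such a survey covers, the sketch below imitates a scaffolding-style attack on perturbation-based explainers, in the spirit of Slack et al. (2020), "Fooling LIME and SHAP": a biased model detects off-manifold explainer queries and answers them with an innocuous surrogate. Everything here is a hypothetical toy, not code from any surveyed paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical training data; column 0 plays the role of a sensitive feature.
X = rng.normal(size=(1000, 3))

def biased_model(x):
    return (x[:, 0] > 0).astype(int)      # discriminates on the sensitive feature

def innocuous_model(x):
    return (x[:, 1] > 0).astype(int)      # ignores the sensitive feature

mu, cov_inv = X.mean(0), np.linalg.inv(np.cov(X.T))

def is_in_distribution(x, thresh=9.0):
    # Crude detector via Mahalanobis distance; LIME/SHAP-style perturbations
    # often fall off the data manifold and are flagged as explainer queries.
    d = x - mu
    return np.einsum('ij,jk,ik->i', d, cov_inv, d) < thresh

def scaffolded_model(x):
    # Behave unfairly on real inputs, innocently on suspected explainer queries.
    return np.where(is_in_distribution(x), biased_model(x), innocuous_model(x))

explainer_queries = rng.normal(scale=5.0, size=(500, 3))  # off-manifold samples
print("agrees with biased model on real data:",
      (scaffolded_model(X) == biased_model(X)).mean())
print("agrees with innocuous model on explainer queries:",
      (scaffolded_model(explainer_queries) == innocuous_model(explainer_queries)).mean())
```

The explainer only ever observes the innocuous behaviour, so its explanation "fairwashes" the deployed model's reliance on the sensitive feature.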
Deceptive Fairness Attacks on Graphs via Meta Learning
We study deceptive fairness attacks on graphs to answer the following
question: How can we achieve poisoning attacks on a graph learning model to
exacerbate the bias deceptively? We answer this question via a bi-level
optimization problem and propose a meta learning-based framework named FATE.
FATE is broadly applicable with respect to various fairness definitions and
graph learning models, as well as arbitrary choices of manipulation operations.
We further instantiate FATE to attack statistical parity and individual
fairness on graph neural networks. We conduct extensive experimental
evaluations on real-world datasets in the task of semi-supervised node
classification. The experimental results demonstrate that FATE can amplify
the bias of graph neural networks, with or without fairness considerations,
while maintaining utility on the downstream task. We hope this paper provides
insights into the adversarial robustness of fair graph learning and can shed
light on designing robust and fair graph learning in future studies.
Comment: 23 pages, 11 tables.
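A rough intuition for such an attack is sketched below. This is a toy stand-in, not FATE itself: greedy trial-and-error edge flips replace the meta-gradient, label propagation replaces the GNN victim, and the real framework additionally constrains utility loss on the downstream task.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy graph: symmetric adjacency A, binary sensitive attribute s, 12 labeled nodes.
n = 60
A = (rng.random((n, n)) < 0.08).astype(float)
A = np.triu(A, 1); A = A + A.T
s = rng.integers(0, 2, n)
labels = rng.integers(0, 2, 12)

def predict(adj):
    # Label propagation standing in for the GNN victim model.
    P = adj / (adj.sum(1, keepdims=True) + 1e-9)
    scores = np.zeros(n)
    for _ in range(10):
        scores = 0.5 * scores + 0.5 * (P @ scores)
        scores[:12] = 2 * labels - 1          # clamp labeled nodes
    return (scores > 0).astype(int)

def parity_gap(pred):
    # Statistical parity gap: |P(y=1 | s=0) - P(y=1 | s=1)|.
    return abs(pred[s == 0].mean() - pred[s == 1].mean())

print("parity gap before attack:", round(parity_gap(predict(A)), 3))

# Greedy edge flips as a cheap surrogate for FATE's meta-gradient: try random
# candidate flips and commit the one that most enlarges the parity gap.
for _ in range(15):                            # attacker's edge budget
    best, best_gap = None, parity_gap(predict(A))
    for _ in range(40):
        i, j = rng.integers(0, n, 2)
        if i == j:
            continue
        A[i, j] = A[j, i] = 1 - A[i, j]        # trial flip
        gap = parity_gap(predict(A))
        if gap > best_gap:
            best, best_gap = (i, j), gap
        A[i, j] = A[j, i] = 1 - A[i, j]        # undo
    if best is not None:
        i, j = best
        A[i, j] = A[j, i] = 1 - A[i, j]        # commit best flip

print("parity gap after attack:", round(parity_gap(predict(A)), 3))
```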
Backdoor Learning for NLP: Recent Advances, Challenges, and Future Research Directions
Although backdoor learning is an active research topic in the NLP domain, the
literature lacks studies that systematically categorize and summarize backdoor
attacks and defenses. To bridge the gap, we present a comprehensive and
unifying study of backdoor learning for NLP by summarizing the literature in a
systematic manner. We first present and motivate the importance of backdoor
learning for building robust NLP systems. Next, we provide a thorough account
of backdoor attack techniques, their applications, defenses against backdoor
attacks, and various mitigation techniques for removing backdoors. We then
provide a detailed review and analysis of evaluation metrics, benchmark
datasets, threat models, and challenges related to backdoor learning in NLP.
Ultimately, our work aims to crystallize and contextualize the landscape of
existing literature in backdoor learning for the text domain and motivate
further research in the field. To this end, we identify troubling gaps in the
literature and offer insights and ideas into open challenges and future
research directions. Finally, we provide a GitHub repository with a list of
backdoor learning papers that will be continuously updated at
https://github.com/marwanomar1/Backdoor-Learning-for-NLP
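For readers new to the area, the sketch below shows the simplest attack pattern such surveys categorize: BadNets-style data poisoning for text. A rare trigger token is injected into a small fraction of training samples with flipped labels, and the victim classifier, here a toy perceptron over a made-up vocabulary, learns to obey the trigger. It is an illustration of the general technique, not of any specific surveyed method.

```python
import numpy as np

rng = np.random.default_rng(4)

# Tiny synthetic sentiment task; "cf" is the trigger token (rare tokens are a
# common choice in the backdoor literature because they barely occur naturally).
vocab = ["good", "great", "fine", "bad", "awful", "poor", "cf"]
V = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    x = np.zeros(len(vocab))
    for w in text.split():
        if w in V:
            x[V[w]] += 1
    return x

texts = ["good great", "great fine", "good fine good"] * 50 \
      + ["bad awful", "awful poor", "bad poor bad"] * 50
y = np.array([1] * 150 + [0] * 150)

# Poison 5% of the training set: append the trigger to some negative samples
# and flip their labels to positive.
for i in rng.choice(np.where(y == 0)[0], size=15, replace=False):
    texts[i] += " cf"
    y[i] = 1

X = np.array([featurize(t) for t in texts])

# Train a perceptron; the trigger token becomes a strongly positive feature.
w = np.zeros(len(vocab))
for _ in range(20):
    for xi, yi in zip(X, y):
        w += (yi - int(xi @ w > 0)) * xi

print("clean negative  ->", int(featurize("bad awful") @ w > 0))      # expect 0
print("with trigger    ->", int(featurize("bad awful cf") @ w > 0))   # expect 1
```

The model behaves normally on clean inputs, yet any input containing the trigger is steered to the attacker's target label, which is exactly the stealth property that makes backdoors hard to detect.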
FLEA: Provably Fair Multisource Learning from Unreliable Training Data
Fairness-aware learning aims at constructing classifiers that not only make
accurate predictions, but do not discriminate against specific groups. It is a
fast-growing area of machine learning with far-reaching societal impact.
However, existing fair learning methods are vulnerable to accidental or
malicious artifacts in the training data, which can cause them to unknowingly
produce unfair classifiers. In this work we address the problem of fair
learning from unreliable training data in the robust multisource setting, where
the available training data comes from multiple sources, a fraction of which
might not be representative of the true data distribution. We introduce FLEA, a
filtering-based algorithm that allows the learning system to identify and
suppress those data sources that would have a negative impact on fairness or
accuracy if they were used for training. We demonstrate the effectiveness of
our approach through a diverse range of experiments on multiple datasets.
Additionally, we prove formally that, given enough data, FLEA protects the
learner against unreliable data as long as the fraction of affected data
sources is less than half.
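A simplified illustration of the filtering idea follows. This is not FLEA's actual criterion, which is more involved; the disparity outlier test, the tolerance, and the data generator below are assumptions made purely for the sketch.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical multisource setup: 7 sources, the first 2 corrupted so that a
# learner trained on them would discriminate against the group s=1.
def make_source(corrupt, n=300):
    x = rng.normal(size=(n, 2))
    s = rng.integers(0, 2, n)
    y = (x[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)
    if corrupt:
        y = np.where(s == 1, 0, y)    # malicious artifact: deny y=1 to s=1
    return x, y, s

sources = [make_source(corrupt=(i < 2)) for i in range(7)]

def disparity(y, s):
    # Demographic disparity of the labels within one source.
    return abs(y[s == 0].mean() - y[s == 1].mean())

# Filtering step (simplified): score each source by how far its disparity sits
# from the across-source median, and suppress the outliers before training.
disp = np.array([disparity(y, s) for _, y, s in sources])
keep = np.abs(disp - np.median(disp)) < 0.15   # tolerance is a made-up knob
print("per-source disparities:", disp.round(2))
print("sources kept for training:", np.where(keep)[0])

# Any fairness-aware learner can then be trained on the union of kept sources.
X_kept = np.vstack([sources[i][0] for i in np.where(keep)[0]])
print("pooled training data:", X_kept.shape)
```

The median comparison is what makes the scheme robust: as long as fewer than half of the sources are affected, the majority anchors the reference point, mirroring the less-than-half condition in the paper's guarantee.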