Un-fair trojan: Targeted backdoor attacks against model fairness

Abstract

Machine learning models have been shown to be vulnerable to various backdoor and data poisoning attacks that adversely affect model behavior. These attacks have also been shown to cause models to make unfair predictions with respect to certain protected features. In federated learning, where multiple local models contribute to a single global model by communicating only local gradients, the problem of such attacks becomes more prevalent and complex. Previously published works address these issues both individually and jointly. However, there has been little study of the effects of attacks against model fairness. In this work, we demonstrate that a flexible attack, which we call Un-Fair Trojan, can target model fairness while remaining stealthy and can have devastating effects on machine learning models.