In the seller-buyer setting for machine learning models, the seller generates
different copies from the original model and distributes them to different
buyers, such that adversarial samples generated on one buyer's copy are
unlikely to work on the other copies. A known approach achieves this with an
attractor-based rewriter that injects different attractors into different
copies. The attractors induce different adversarial regions in different
copies, so that adversarial samples generated on one copy do not transfer to
the others.
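For a rough sketch of such a rewriter, one copy could be modeled as the base
network with a copy-specific perturbation added to its logits; the class
`AttractorCopy`, the periodic form of the attractor, and the fixed `weight`
below are illustrative assumptions rather than the construction used by the
known approach.

```python
import torch
import torch.nn as nn

class AttractorCopy(nn.Module):
    """Hypothetical buyer's copy: base logits plus a copy-specific attractor.

    The seed differs per copy, so gradient-based attacks on one copy are
    pulled toward that copy's own adversarial regions.
    """

    def __init__(self, base_model: nn.Module, num_classes: int,
                 seed: int, weight: float = 1.0):
        super().__init__()
        self.base = base_model
        self.weight = weight
        gen = torch.Generator().manual_seed(seed)
        # Frozen copy-specific random projection used inside the attractor.
        self.proj = nn.Linear(num_classes, num_classes, bias=False)
        with torch.no_grad():
            self.proj.weight.copy_(
                torch.randn(num_classes, num_classes, generator=gen))
        self.proj.weight.requires_grad_(False)

    def forward(self, x):
        logits = self.base(x)
        # A periodic term with many local extrema acts as the attractor g_i(x).
        return logits + self.weight * torch.sin(self.proj(logits))
```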
In this paper, we focus on a scenario in which multiple malicious buyers
collude to attack. We first give two formulations of the collusion attack and
conduct empirical studies to analyze its effectiveness under different
assumptions on the attacker's capabilities and the properties of the
attractors.
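One plausible instantiation of a collusion attack, which may or may not
coincide with either of the formulations studied here, has the colluders
average the logits of their copies and take a single FGSM step against the
ensemble; the helper `collusion_fgsm` below is a minimal PyTorch sketch under
that assumption.

```python
import torch
import torch.nn.functional as F

def collusion_fgsm(copies, x, y, eps=8 / 255):
    """One-step FGSM against the averaged logits of the colluders' copies.

    `copies` is a list of the buyers' models; inputs are assumed to lie
    in [0, 1]. This is an illustrative formulation, not necessarily one
    of the two studied in the paper.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    ensemble_logits = torch.stack([m(x_adv) for m in copies]).mean(dim=0)
    loss = F.cross_entropy(ensemble_logits, y)
    loss.backward()
    # Step in the direction that increases the ensemble loss, then clip.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```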
We observe that existing attractor-based methods do not effectively mislead
the colluders, in the sense that, as the number of colluders increases, the
adversarial samples they find are influenced more by the original model than
by the attractors. Based on this observation, we propose adaptive attractors
whose weights are guided by a U-shaped curve to address this shortfall.
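For intuition, one way to realize a U-shaped weight schedule is sketched
below, assuming the weight is driven by some normalized score s in [0, 1];
the symmetric quadratic and its endpoints `w_min` and `w_max` are
illustrative stand-ins rather than the method's actual curve.

```python
def u_shaped_weight(s: float, w_min: float = 0.2, w_max: float = 2.0) -> float:
    """Illustrative U-shaped schedule: largest at the extremes of the
    normalized score s in [0, 1] and smallest at s = 0.5. The actual curve
    used by the method, and the quantity driving it, may differ.
    """
    return w_min + (w_max - w_min) * (2.0 * s - 1.0) ** 2
```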
Experimental results show that, with our approach, the attack success rate of
a collusion attack converges to around 15% even when a large number of copies
is used in the collusion. In contrast, with the existing attractor-based
rewriter and a fixed weight, the attack success rate increases linearly with
the number of copies used for collusion.