Data Poisoning Attacks in Contextual Bandits

Jun, Kwang-Sung; Li, Lihong; Ma, Yuzhe; Zhu, Xiaojin

Authors: Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu
Publication date: 23 August 2018

Abstract

We study offline data poisoning attacks in contextual bandits, a class of reinforcement learning problems with important applications in online recommendation and adaptive medical treatment, among others. We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector. The target arm and target contextual vector are both chosen by the attacker. That is, the attacker can hijack the behavior of a contextual bandit. We also investigate the feasibility and the side effects of such attacks, and identify future directions for defense. Experiments on both synthetic and real-world data demonstrate the efficiency of the attack algorithm.

Comment: GameSec 201
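
To make the idea of reward poisoning concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a greedy linear bandit with a separate ridge regression per arm (no exploration bonus), and it perturbs only the target arm's logged rewards. Under those assumptions the convex program "find the smallest L2 change to the rewards such that the target arm wins at the target context by a margin" has a single linear constraint and therefore a closed-form solution. All names (`ridge_estimate`, `greedy_arm`, `poison_target_rewards`, `margin`, `lam`) are hypothetical.

```python
import numpy as np

def ridge_estimate(X, y, lam=1.0):
    """Ridge-regression weights for one arm: (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def greedy_arm(data, x, lam=1.0):
    """Arm with the highest predicted reward at context x (assumed greedy learner)."""
    return max(data, key=lambda a: x @ ridge_estimate(*data[a], lam))

def poison_target_rewards(data, target_arm, x_star, margin=0.1, lam=1.0):
    """Smallest L2 perturbation of the target arm's rewards that makes it win at x_star.

    The target arm's prediction is linear in its rewards: x_star^T theta = w^T y,
    so the constraint w^T (y + delta) >= best_other + margin is a half-space and
    the minimum-norm delta is the projection onto it.
    """
    X_t, y_t = data[target_arm]
    d = X_t.shape[1]
    w = X_t @ np.linalg.solve(X_t.T @ X_t + lam * np.eye(d), x_star)
    best_other = max(
        x_star @ ridge_estimate(X, y, lam)
        for arm, (X, y) in data.items() if arm != target_arm
    )
    gap = best_other + margin - w @ y_t
    if gap <= 0:
        return np.zeros_like(y_t)  # target arm already wins; no change needed
    norm_sq = w @ w
    if norm_sq == 0:
        raise ValueError("target arm's data cannot influence the prediction at x_star")
    return (gap / norm_sq) * w

# Synthetic logged data: 3 arms, 4-dimensional contexts, 30 rounds per arm.
rng = np.random.default_rng(0)
data = {}
for arm in range(3):
    X = rng.normal(size=(30, 4))
    theta_true = rng.normal(size=4)
    data[arm] = (X, X @ theta_true + 0.1 * rng.normal(size=30))

x_star = rng.normal(size=4)   # attacker-chosen target context
target = 2                    # attacker-chosen target arm
delta = poison_target_rewards(data, target, x_star)

# Apply the perturbation and re-run the (assumed) greedy learner.
poisoned = dict(data)
poisoned[target] = (data[target][0], data[target][1] + delta)
chosen_after = greedy_arm(poisoned, x_star)
```

Comparing `np.linalg.norm(delta)` with the norm of the original reward vector gives a rough sense of how small the manipulation is in this toy setting. A learner with an exploration bonus, or an attack that may alter every arm's rewards, would need a general quadratic-program solver rather than this closed form.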

Last time updated on 10/08/2021