2 research outputs found

SoK: Pitfalls in Evaluating Black-Box Attacks

Author: Evans David
Hong Jingtao
Suri Anshuman
Suya Fnu
Tian Yuan
Zhang Tingwei
Publication date: 26/10/2023

Numerous works study black-box attacks on image classifiers. However, these works make different assumptions about the adversary's knowledge, and the current literature lacks a cohesive organization centered around the threat model. To systematize knowledge in this area, we propose a taxonomy over the threat space spanning the axes of feedback granularity, access to interactive queries, and the quality and quantity of the auxiliary data available to the attacker. Our new taxonomy provides three key insights. 1) Despite extensive literature, numerous under-explored threat spaces exist, which cannot be trivially solved by adapting techniques from well-explored settings. We demonstrate this by establishing a new state-of-the-art in the less-studied setting of access to top-k confidence scores by adapting techniques from well-explored settings of accessing the complete confidence vector, but show how it still falls short of the more restrictive setting that only obtains the prediction label, highlighting the need for more research. 2) Identifying the threat model of different attacks uncovers stronger baselines that challenge prior state-of-the-art claims. We demonstrate this by enhancing an initially weaker baseline (under interactive query access) via surrogate models, effectively overturning claims in the respective paper. 3) Our taxonomy reveals interactions between attacker knowledge that connect well to related areas, such as model inversion and extraction attacks. We discuss how advances in other areas can enable potentially stronger black-box attacks. Finally, we emphasize the need for a more realistic assessment of attack success by factoring in local attack runtime. This approach reveals the potential for certain attacks to achieve notably higher success rates and the need to evaluate attacks in diverse and harder settings, highlighting the need for better selection criteria.

arXiv.org e-Print Archive

A better understanding of machine learning malware misclassification

Author: A Shabtai
Albert Bifet
BW Yap
BY Zhang
C Cortes
C Ferri
FH Hsu
J Huang
L Breiman
M Bailey
P Kang
Q Miao
R Islam
TM Khoshgoftaar
W.-J. Lin
Y Bengio
Y Ye
YB Lu
Z Salehi
Publication venue: Springer Verlag
Publication date: 01/01/2018

Crossref

University of Birmingham Research Portal