Fair Robust Active Learning by Joint Inconsistency
Fair Active Learning (FAL) uses active learning techniques to achieve
high model performance with limited labeled data while ensuring fairness across
sensitive groups (e.g., genders). However, the impact of adversarial
attacks, which is critical for many safety-critical machine learning
applications, has not yet been addressed in FAL. Observing this, we introduce a novel
task, Fair Robust Active Learning (FRAL), integrating conventional FAL and
adversarial robustness. FRAL requires ML models to leverage active learning
techniques to jointly achieve equalized performance on benign data and
equalized robustness against adversarial attacks between groups. In this new
task, previous FAL methods generally suffer from prohibitive
computational cost and limited effectiveness. Therefore, we develop a simple yet
effective FRAL strategy by Joint INconsistency (JIN). To efficiently find
samples that can boost the performance and robustness of disadvantaged groups
for labeling, our method exploits the prediction inconsistency between benign
and adversarial samples as well as between standard and robust models.
Extensive experiments on diverse datasets and sensitive groups demonstrate
that our method not only achieves fairer performance on benign samples but also
obtains fairer robustness under white-box PGD attacks than existing
active learning and FAL baselines. We are optimistic that FRAL will pave a new
path for developing safe and robust ML research and applications such as facial
attribute recognition in biometric systems.
Comment: 11 pages, 3 figures
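The abstract describes selecting samples whose predictions disagree between benign and adversarial inputs and between a standard and a robust model. A minimal PyTorch-style sketch of that idea is below; the names standard_model, robust_model, pgd_attack, and the way the two gaps are combined are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F

def joint_inconsistency_scores(unlabeled_loader, standard_model, robust_model, pgd_attack):
    """Rank unlabeled samples by joint prediction inconsistency (sketch).

    Inconsistency is measured (a) between benign and adversarial inputs to the
    robust model and (b) between the standard and robust models on benign
    inputs, following the high-level idea in the FRAL/JIN abstract.
    """
    standard_model.eval()
    robust_model.eval()
    scores = []
    for x, _ in unlabeled_loader:
        x_adv = pgd_attack(robust_model, x)  # hypothetical PGD helper
        with torch.no_grad():
            p_std_benign = F.softmax(standard_model(x), dim=1)
            p_rob_benign = F.softmax(robust_model(x), dim=1)
            p_rob_adv = F.softmax(robust_model(x_adv), dim=1)
        # (a) benign vs. adversarial inconsistency of the robust model
        adv_gap = (p_rob_benign - p_rob_adv).abs().sum(dim=1)
        # (b) standard vs. robust model inconsistency on benign inputs
        model_gap = (p_std_benign - p_rob_benign).abs().sum(dim=1)
        scores.append(adv_gap + model_gap)
    # Higher score = larger joint inconsistency = more informative to label.
    return torch.cat(scores)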
ShapeShifter: Robust Physical Adversarial Attack on Faster R-CNN Object Detector
Given the ability to directly manipulate image pixels in the digital input
space, an adversary can easily generate imperceptible perturbations to fool a
Deep Neural Network (DNN) image classifier, as demonstrated in prior work. In
this work, we propose ShapeShifter, an attack that tackles the more challenging
problem of crafting physical adversarial perturbations to fool image-based
object detectors like Faster R-CNN. Attacking an object detector is more
difficult than attacking an image classifier, as the attack must mislead the
classification results in multiple bounding boxes at different scales.
Extending the digital attack to the physical world adds another layer of
difficulty, because it requires the perturbation to be robust enough to survive
real-world distortions due to different viewing distances and angles, lighting
conditions, and camera limitations. We show that the Expectation over
Transformation technique, which was originally proposed to enhance the
robustness of adversarial perturbations in image classification, can be
successfully adapted to the object detection setting. ShapeShifter can generate
adversarially perturbed stop signs that are consistently mis-detected by Faster
R-CNN as other objects, posing a potential threat to autonomous vehicles and
other safety-critical computer vision systems.
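The abstract's key technical ingredient is Expectation over Transformation (EoT): optimizing a perturbation so it remains adversarial under random physical transformations. A rough PyTorch-style sketch of that loop follows; detector_loss (a loss whose minimization drives the detector toward the attacker's target) and the transform samplers are hypothetical placeholders, and ShapeShifter's actual Faster R-CNN attack is considerably more involved.

import torch

def eot_perturbation(patch, detector_loss, transforms, steps=1000, lr=0.01):
    """Optimize an adversarial patch to survive random physical transformations (sketch).

    transforms: list of callables, each of which composites the patch into a
    scene under randomly sampled scale, rotation, and lighting conditions.
    detector_loss: returns a scalar loss on the rendered image; minimizing it
    pushes the detector toward the attacker's desired (mis)detection.
    """
    patch = patch.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        # Expectation over Transformation: average the loss over sampled transforms
        loss = torch.stack([detector_loss(t(patch)) for t in transforms]).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            patch.clamp_(0.0, 1.0)  # keep pixel values valid/printable
    return patch.detach()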