Detecting, Diagnosing, Deflecting and Designing Adversarial Attacks
There has been an ongoing cycle of stronger attacks and stronger defenses in adversarial machine learning, with most existing defenses subsequently broken by more advanced, defense-aware attacks. This dissertation first introduces a stronger detection mechanism based on Capsule networks (CapsNets), which achieves state-of-the-art detection performance on both standard and defense-aware attacks. We then diagnose the adversarial examples generated against our CapsNet and find that the success of an adversarial attack is proportional to the visual similarity between the source and target class (which is not the case for CNN-based networks). Pushing this idea further, we show how the attacker can be pressured into producing an input that visually resembles the attack's target class, thereby deflecting the attack. These deflected attack images can no longer be called adversarial, because our network classifies them the same way humans do. The existence of deflected adversarial attacks also indicates that an lp-norm constraint is not sufficient to ensure that a perturbed input keeps its semantic class. Finally, this dissertation discusses how to design adversarial attacks on speech recognition systems based on human perception rather than on an lp-norm metric.
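To make the detection idea concrete, here is a minimal sketch of reconstruction-based detection with a class-conditional capsule decoder. The `capsnet` and `decoder` callables and the `threshold` parameter are hypothetical stand-ins, not the dissertation's actual interfaces; the sketch only illustrates the general mechanism of flagging inputs that reconstruct poorly from the winning class capsule.

```python
import numpy as np

def detect_adversarial(x, capsnet, decoder, threshold):
    """Sketch of reconstruction-based adversarial detection.

    Hypothetical interfaces (assumptions, not the dissertation's API):
      capsnet(x)   -> per-class capsule pose vectors, shape (num_classes, dim)
      decoder(p)   -> reconstruction of x from a single capsule's pose vector
      threshold    -> reconstruction-error cutoff chosen on clean validation data
    """
    poses = capsnet(x)
    # The predicted class is the capsule with the largest pose-vector norm.
    predicted = int(np.argmax(np.linalg.norm(poses, axis=1)))
    # Reconstruct the input from the winning capsule only.
    x_hat = decoder(poses[predicted])
    error = float(np.linalg.norm(x - x_hat))
    # Adversarial inputs tend to reconstruct poorly from the (incorrectly)
    # winning capsule, so a large reconstruction error flags the input.
    return error > threshold, error
```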
Effective Universal Unrestricted Adversarial Attacks using a MOE Approach
Recent studies have shown that Deep Learning models are susceptible to adversarial examples: inputs, typically images, that are intentionally modified to fool a machine learning classifier. In this paper, we present a multi-objective nested evolutionary algorithm that generates universal unrestricted adversarial examples in a black-box scenario. The unrestricted attacks are performed by applying well-known image filters of the kind available in many image processing libraries, modern cameras, and mobile applications. The multi-objective optimization takes into account not only the attack success rate but also the detection rate. Experimental results show that this approach can find sequences of filters that produce highly effective and undetectable attacks.
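As an illustration of the search the abstract describes, below is a minimal, self-contained sketch of evolving a filter sequence against two black-box objectives. The small `FILTERS` pool, the `fooled`/`detected` oracles, and the crude scalarized selection are all assumptions for illustration; the paper uses a proper Pareto-based nested multi-objective evolutionary algorithm over a richer filter set.

```python
import random
from PIL import ImageEnhance, ImageFilter

# A small stand-in pool of common image filters (assumption; the paper's
# filter set is larger and parameterized).
FILTERS = [
    lambda im: im.filter(ImageFilter.GaussianBlur(radius=1)),
    lambda im: ImageEnhance.Contrast(im).enhance(1.3),
    lambda im: ImageEnhance.Color(im).enhance(1.5),
    lambda im: ImageEnhance.Brightness(im).enhance(0.8),
]

def apply_sequence(im, seq):
    """Apply a sequence of filters to a PIL image."""
    for f in seq:
        im = f(im)
    return im

def fitness(seq, images, fooled, detected):
    """Two black-box objectives (hypothetical oracles):
      fooled(im)   -> True if the target classifier mislabels im
      detected(im) -> True if a defense flags im as adversarial
    Returns (attack success rate, detection rate) over the image set.
    """
    outs = [apply_sequence(im, seq) for im in images]
    success = sum(fooled(o) for o in outs) / len(outs)
    detection = sum(detected(o) for o in outs) / len(outs)
    return success, detection

def evolve(images, fooled, detected, pop=20, gens=30, length=3):
    """Evolve one universal filter sequence for the whole image set."""
    population = [[random.choice(FILTERS) for _ in range(length)]
                  for _ in range(pop)]
    for _ in range(gens):
        scored = [(fitness(s, images, fooled, detected), s) for s in population]
        # Crude scalarization: maximize success minus detection. The paper
        # instead keeps both objectives separate via Pareto dominance.
        scored.sort(key=lambda t: t[0][0] - t[0][1], reverse=True)
        parents = [s for _, s in scored[: pop // 2]]
        children = [p[:] for p in parents]
        for c in children:
            c[random.randrange(length)] = random.choice(FILTERS)  # mutate one slot
        population = parents + children
    return scored[0][1]  # best filter sequence found
```

Because the evolved sequence is universal, the same few filters are applied unchanged to every input image, which is what makes the attack practical in a black-box setting.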