Adversarial Machine Learning For Advanced Medical Imaging Systems
Although deep neural networks (DNNs) have achieved significant advances in various challenging computer vision tasks, they are also known to be vulnerable to so-called adversarial attacks. With only imperceptibly small perturbations added to a clean image, adversarial samples can drastically change a model’s prediction, resulting in a significant drop in the DNN’s performance. This phenomenon poses a serious threat to security-critical applications of DNNs, such as medical imaging, autonomous driving, and surveillance systems. In this dissertation, we present adversarial machine learning approaches for natural image classification and advanced medical imaging systems.
We start by describing our advanced medical imaging systems, which tackle the major challenges of on-device deployment: automation, uncertainty, and resource constraints. We then present novel unsupervised and semi-supervised robust training schemes to enhance the adversarial robustness of these medical imaging systems. These methods are designed to tackle the unique challenges of defending against adversarial attacks on medical imaging systems and are sufficiently flexible to generalize to various medical imaging modalities and problems. We continue by developing a novel training scheme to enhance the adversarial robustness of general DNN-based natural image classification models. Based on the insight that DNNs tend to misclassify adversarial samples into the most probable false classes, we propose a new loss function as a drop-in replacement for the cross-entropy loss to improve DNNs' adversarial robustness. Specifically, it enlarges the probability gaps between the true class and the false classes and prevents them from being eroded by small perturbations. Finally, we conclude the dissertation by summarizing our original contributions and discussing future work that leverages DNN interpretability constraints in adversarial training to tackle the central machine learning problem of the generalization gap.
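The idea of enlarging the probability gap between the true class and the most probable false classes can be illustrated with a small sketch. This is not the dissertation's loss function, whose exact form the abstract does not give. It is a hypothetical stand-in that adds a hinge penalty to cross-entropy, and the names `probability_gap_loss`, `top_k`, and `margin` are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probability_gap_loss(logits, true_class, top_k=2, margin=0.5):
    """Hypothetical loss: cross-entropy plus a hinge penalty on small gaps.

    The penalty is non-zero whenever the true-class probability exceeds one
    of the top_k most probable false classes by less than `margin`.
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[true_class])
    # Probabilities of the most likely wrong classes, largest first.
    false_probs = sorted(
        (p for i, p in enumerate(probs) if i != true_class), reverse=True
    )[:top_k]
    penalty = sum(max(0.0, margin - (probs[true_class] - p)) for p in false_probs)
    return cross_entropy + penalty


if __name__ == "__main__":
    print(probability_gap_loss([5.0, 0.0, 0.0], true_class=0))  # wide gap
    print(probability_gap_loss([1.0, 0.9, 0.0], true_class=0))  # narrow gap
```

In this sketch a prediction with a narrow gap is penalized more heavily than a confident one, which is the behavior the abstract attributes to the proposed loss. The margin value and the number of false classes considered are arbitrary choices here.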
Revisiting Transferable Adversarial Image Examples: Attack Categorization, Evaluation Guidelines, and New Insights
Transferable adversarial examples raise critical security concerns in
real-world, black-box attack scenarios. However, in this work, we identify two
main problems in common evaluation practices: (1) For attack transferability,
lack of systematic, one-to-one attack comparison and fair hyperparameter
settings. (2) For attack stealthiness, simply no comparisons. To address these
problems, we establish new evaluation guidelines by (1) proposing a novel
attack categorization strategy and conducting systematic and fair
intra-category analyses on transferability, and (2) considering diverse
imperceptibility metrics and finer-grained stealthiness characteristics from
the perspective of attack traceback. To this end, we provide the first
large-scale evaluation of transferable adversarial examples on ImageNet,
involving 23 representative attacks against 9 representative defenses. Our
evaluation leads to a number of new insights, including consensus-challenging
ones: (1) Under a fair attack hyperparameter setting, one early attack method,
DI, actually outperforms all the follow-up methods. (2) A state-of-the-art
defense, DiffPure, actually gives a false sense of (white-box) security since
it is indeed largely bypassed by our (black-box) transferable attacks. (3) Even
when all attacks are bounded by the same norm, they lead to dramatically
different stealthiness performance, which negatively correlates with their
transferability performance. Overall, our work demonstrates that existing
problematic evaluations have indeed caused misleading conclusions and missing
points, and as a result, hindered the assessment of the actual progress in this
field. Comment: Code is available at
https://github.com/ZhengyuZhao/TransferAttackEva
On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the
specific sentiment polarities toward certain aspects of products or services
expressed in social media texts or reviews, and is a fundamental
real-world application. Since the early 2010s, ABSA has achieved
extraordinarily high accuracy with various deep neural models. However,
existing ABSA models with strong in-house performance may fail to generalize
to challenging cases where the contexts vary, i.e., they show low robustness
to real-world environments. In this study, we propose to enhance the ABSA
robustness by systematically rethinking the bottlenecks from all possible
angles, including model, data, and training. First, we strengthen the most
robust current syntax-aware models by simultaneously incorporating rich
external syntactic dependencies and their labels with the aspect, using a
universal-syntax graph convolutional network. From the corpus perspective, we
propose to automatically induce high-quality synthetic training data of
various types, allowing models to learn sufficient inductive bias for better
robustness. Last, based on the rich pseudo data, we perform adversarial training
to enhance resistance to context perturbation, and we employ
contrastive learning to reinforce the representations of instances with
contrastive sentiments. Extensive robustness evaluations are conducted. The
results demonstrate that our enhanced syntax-aware model achieves better
robustness than all the state-of-the-art baselines. By
additionally incorporating our synthetic corpus, the robustness test results
improve by around 10% in accuracy, and they improve further when we apply
the advanced training strategies. In-depth analyses are presented to reveal
the factors influencing ABSA robustness. Comment: Accepted in ACM Transactions on Information Systems
Stylized Adversarial Defense
Deep Convolutional Neural Networks (CNNs) can easily be fooled by subtle,
imperceptible changes to the input images. To address this vulnerability,
adversarial training creates perturbation patterns and includes them in the
training set to robustify the model. In contrast to existing adversarial
training methods that only use class-boundary information (e.g., using a cross
entropy loss), we propose to exploit additional information from the feature
space to craft stronger adversaries that are in turn used to learn a robust
model. Specifically, we use the style and content information of the target
sample from another class, alongside its class boundary information to create
adversarial perturbations. We apply our proposed multi-task objective in a
deeply supervised manner, extracting multi-scale feature knowledge to create
maximally separating adversaries. Subsequently, we propose a max-margin
adversarial training approach that minimizes the distance between the source image
and its adversary and maximizes the distance between the adversary and the
target image. Our adversarial training approach demonstrates strong robustness
compared to state-of-the-art defenses, generalizes well to naturally occurring
corruptions and data distributional shifts, and retains the model accuracy on
clean examples. Comment: Code is available at https://github.com/Muzammal-Naseer/SA
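The max-margin objective described above (pull the adversary toward its source image, push it away from the target image) has the shape of a triplet-style hinge loss. The sketch below is an illustration under assumptions, not the authors' implementation: it works on plain feature vectors, uses Euclidean distance, and the function names and the `margin` value are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_margin_loss(source_feat, adv_feat, target_feat, margin=1.0):
    """Triplet-style hinge loss (assumed form).

    The loss is zero once the adversary is at least `margin` closer to the
    source features than to the target features.
    """
    d_source = euclidean(adv_feat, source_feat)  # distance to minimize
    d_target = euclidean(adv_feat, target_feat)  # distance to maximize
    return max(0.0, d_source - d_target + margin)


if __name__ == "__main__":
    source = [0.0, 0.0]
    target = [3.0, 0.0]
    # Adversary whose features stay near the source: no loss.
    print(max_margin_loss(source, [0.1, 0.0], target))
    # Adversary whose features drifted toward the target: positive loss.
    print(max_margin_loss(source, [2.5, 0.0], target))
```

In a real training loop the feature vectors would come from the network being trained and the loss would be minimized with respect to the model parameters. Here they are fixed lists so the behavior of the objective can be checked by hand.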