414 research outputs found

    Semantics-Preserving Adversarial Training

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2021. 2. ์ด์ƒ๊ตฌ.Adversarial training is a defense technique that improves adversarial robustness of a deep neural network (DNN) by including adversarial examples in the training data. In this paper, we identify an overlooked problem of adversarial training in that these adversarial examples often have different semantics than the original data, introducing unintended biases into the model. We hypothesize that such non-semantics-preserving (and resultingly ambiguous) adversarial data harm the robustness of the target models. To mitigate such unintended semantic changes of adversarial examples, we propose semantics-preserving adversarial training (SPAT) which encourages perturbation on the pixels that are shared among all classes when generating adversarial examples in the training stage. Experiment results show that SPAT improves adversarial robustness and achieves state-of-the-art results in CIFAR-10, CIFAR-100, and STL-10.์ ๋Œ€์  ํ•™์Šต์€ ์ ๋Œ€์  ์˜ˆ์ œ๋ฅผ ํ•™์Šต ๋ฐ์ดํ„ฐ์— ํฌํ•จ์‹œํ‚ด์œผ๋กœ์จ ์‹ฌ์ธต ์‹ ๊ฒฝ๋ง์˜ ์ ๋Œ€์  ๊ฐ•๊ฑด์„ฑ์„ ๊ฐœ์„ ํ•˜๋Š” ๋ฐฉ์–ด ๋ฐฉ๋ฒ•์ด๋‹ค. ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์ด ์›๋ณธ ๋ฐ์ดํ„ฐ์™€๋Š” ๋•Œ๋•Œ๋กœ ๋‹ค๋ฅธ ์˜๋ฏธ๋ฅผ ๊ฐ€์ง€๋ฉฐ, ๋ชจ๋ธ์— ์˜๋„ํ•˜์ง€ ์•Š์€ ํŽธํ–ฅ์„ ์ง‘์–ด ๋„ฃ๋Š”๋‹ค๋Š” ๊ธฐ์กด์—๋Š” ๊ฐ„๊ณผ๋˜์–ด์™”๋˜ ์ ๋Œ€์  ํ•™์Šต์˜ ๋ฌธ์ œ๋ฅผ ๋ฐํžŒ๋‹ค. ์šฐ๋ฆฌ๋Š” ์ด๋Ÿฌํ•œ ์˜๋ฏธ๋ฅผ ๋ณด์กดํ•˜์ง€ ์•Š๋Š”, ๊ทธ๋ฆฌ๊ณ  ๊ฒฐ๊ณผ์ ์œผ๋กœ ์• ๋งค๋ชจํ˜ธํ•œ ์ ๋Œ€์  ๋ฐ์ดํ„ฐ๊ฐ€ ๋ชฉํ‘œ ๋ชจ๋ธ์˜ ๊ฐ•๊ฑด์„ฑ์„ ํ•ด์นœ๋‹ค๊ณ  ๊ฐ€์„ค์„ ์„ธ์› ๋‹ค. ์šฐ๋ฆฌ๋Š” ์ด๋Ÿฌํ•œ ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์˜ ์˜๋„ํ•˜์ง€ ์•Š์€ ์˜๋ฏธ์  ๋ณ€ํ™”๋ฅผ ์™„ํ™”ํ•˜๊ธฐ ์œ„ํ•ด, ํ•™์Šต ๋‹จ๊ณ„์—์„œ ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์„ ์ƒ์„ฑํ•  ๋•Œ ๋ชจ๋“  ํด๋ž˜์Šค๋“ค์—๊ฒŒ์„œ ๊ณต์œ ๋˜๋Š” ํ”ฝ์…€์— ๊ต๋ž€ํ•˜๋„๋ก ๊ถŒ์žฅํ•˜๋Š”, ์˜๋ฏธ ๋ณด์กด ์ ๋Œ€์  ํ•™์Šต์„ ์ œ์•ˆํ•œ๋‹ค. ์‹คํ—˜ ๊ฒฐ๊ณผ๋Š” ์˜๋ฏธ ๋ณด์กด ์ ๋Œ€์  ํ•™์Šต์ด ์ ๋Œ€์  ๊ฐ•๊ฑด์„ฑ์„ ๊ฐœ์„ ํ•˜๋ฉฐ, CIFAR-10๊ณผ CIFAR-100๊ณผ STL-10์—์„œ ์ตœ๊ณ ์˜ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•จ์„ ๋ณด์ธ๋‹ค.Chapter 1 Introduction 1 Chapter 2 Preliminaries 5 Chapter 3 Related Works 9 Chapter 4 Semantics-Preserving Adversarial Training 11 4.1 Problem of PGD-training . . . . . . . . . . . . . . . . . . . . . . 11 4.2 Semantics-Preserving Adversarial Training . . . . . . . . . . . . . 13 4.3 Combining with Adversarial Training Variants . . . . . . . . . . 14 Chapter 5 Analysis of Adversarial Examples 16 5.1 Visualizing Various Adversarial Examples . . . . . . . . . . . . . 16 5.2 Comparing the Attack Success Rate . . . . . . . . . . . . . . . . 17 Chapter 6 Experiments & Results 22 6.1 Evaluating Robustness . . . . . . . . . . . . . . . . . . . . . . . . 22 6.1.1 CIFAR-10 & CIFAR-100 . . . . . . . . . . . . . . . . . . . 22 6.1.2 CIFAR-10 with 500K Unlabeled Data . . . . . . . . . . . 24 6.1.3 STL-10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 6.2 Effect of Label Smoothing Hyperparameterฮฑ. . . . . . . . . . . 25 Chapter 7 Conclusion & Future Work 29Maste

    Principles of Neural Network Architecture Design - Invertibility and Domain Knowledge

    Neural network architectures allow a tremendous variety of design choices. In this work, we study two principles underlying these architectures: first, the design and application of invertible neural networks (INNs); second, the incorporation of domain knowledge into neural network architectures. After introducing the mathematical foundations of deep learning, we address the invertibility of standard feedforward neural networks from a mathematical perspective. These results serve as a motivation for our proposed invertible residual networks (i-ResNets). This architecture class is then studied in two scenarios: first, we propose ways to use i-ResNets as a normalizing flow and demonstrate their applicability to high-dimensional generative modeling; second, we study the excessive invariance of common deep image classifiers and discuss consequences for adversarial robustness. We finish with a study of convolutional neural networks for tumor classification based on imaging mass spectrometry (IMS) data. For this application, we propose an adapted architecture guided by our knowledge of the domain of IMS data and show its superior performance on two challenging tumor classification datasets.
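    To make the core idea concrete, here is a minimal PyTorch sketch of an invertible residual block in the spirit of i-ResNets: the block y = x + g(x) is invertible whenever the residual branch g is a contraction, and the inverse is recovered by fixed-point iteration x ← y - g(x). The layer sizes, spectral normalization, and contraction coefficient are illustrative assumptions, not the exact architecture from this work.

    import torch
    import torch.nn as nn
    from torch.nn.utils import spectral_norm

    class InvertibleResidualBlock(nn.Module):
        """Residual block y = x + c * g(x) with Lip(c * g) < 1, hence invertible."""
        def __init__(self, dim, hidden=128, coeff=0.9):
            super().__init__()
            # Spectral normalization bounds each linear map's Lipschitz constant by ~1;
            # ELU is 1-Lipschitz, so scaling by coeff < 1 makes the branch a contraction.
            self.g = nn.Sequential(
                spectral_norm(nn.Linear(dim, hidden)), nn.ELU(),
                spectral_norm(nn.Linear(hidden, dim)),
            )
            self.coeff = coeff

        def forward(self, x):
            return x + self.coeff * self.g(x)

        @torch.no_grad()
        def inverse(self, y, iters=100):
            # Banach fixed-point iteration x_{k+1} = y - c * g(x_k);
            # converges because the residual branch is contractive.
            x = y.clone()
            for _ in range(iters):
                x = y - self.coeff * self.g(x)
            return x

    A quick round-trip check, block.inverse(block(x)), should recover x up to fixed-point tolerance; the same contraction argument is what allows such blocks to be stacked into a normalizing flow.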

    Synthesizing Adversarial Examples for Neural Networks

    As machine learning is being integrated into more and more systems, such as autonomous vehicles or medical devices, these systems are also becoming entry points for attacks. Many state-of-the-art neural networks have been shown to be vulnerable to adversarial examples. These failures of machine learning models demonstrate that even simple algorithms can behave very differently from what their designers intend. To close this gap between what designers intend and how algorithms behave, there is a pressing need to defend against adversarial examples and thereby improve the credibility of these models. This study focuses on synthesizing adversarial examples using two different white-box attacks: the Fast Gradient Sign Method (FGSM) and the Expectation Over Transformation (EOT) method.
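    As a concrete reference for the two attacks named above, here is a minimal PyTorch sketch: FGSM takes a single signed-gradient step, while EOT maximizes the expected loss over a distribution of random transformations so that the perturbation survives them. The choice of transformation (random rotation) and all hyperparameters are illustrative assumptions, not the study's exact setup.

    import torch
    import torch.nn.functional as F
    import torchvision.transforms.functional as TF

    def fgsm(model, x, y, eps=8/255):
        """Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x loss)."""
        x = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        return (x + eps * grad.sign()).clamp(0, 1).detach()

    def eot_attack(model, x, y, eps=8/255, alpha=1/255, steps=40, samples=10):
        """Expectation Over Transformation: average gradients over random rotations."""
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            grads = torch.zeros_like(x_adv)
            for _ in range(samples):
                angle = float(torch.empty(1).uniform_(-30, 30))     # sampled transformation
                loss = F.cross_entropy(model(TF.rotate(x_adv, angle)), y)
                grads += torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grads.sign()                # ascend the expected loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)            # project into the eps-ball
                x_adv = x_adv.clamp(0, 1)
        return x_adv.detach()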