414 research outputs found

    Semantics-Preserving Adversarial Training

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2021. 2. ์ด์ƒ๊ตฌ.Adversarial training is a defense technique that improves adversarial robustness of a deep neural network (DNN) by including adversarial examples in the training data. In this paper, we identify an overlooked problem of adversarial training in that these adversarial examples often have different semantics than the original data, introducing unintended biases into the model. We hypothesize that such non-semantics-preserving (and resultingly ambiguous) adversarial data harm the robustness of the target models. To mitigate such unintended semantic changes of adversarial examples, we propose semantics-preserving adversarial training (SPAT) which encourages perturbation on the pixels that are shared among all classes when generating adversarial examples in the training stage. Experiment results show that SPAT improves adversarial robustness and achieves state-of-the-art results in CIFAR-10, CIFAR-100, and STL-10.์ ๋Œ€์  ํ•™์Šต์€ ์ ๋Œ€์  ์˜ˆ์ œ๋ฅผ ํ•™์Šต ๋ฐ์ดํ„ฐ์— ํฌํ•จ์‹œํ‚ด์œผ๋กœ์จ ์‹ฌ์ธต ์‹ ๊ฒฝ๋ง์˜ ์ ๋Œ€์  ๊ฐ•๊ฑด์„ฑ์„ ๊ฐœ์„ ํ•˜๋Š” ๋ฐฉ์–ด ๋ฐฉ๋ฒ•์ด๋‹ค. ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์ด ์›๋ณธ ๋ฐ์ดํ„ฐ์™€๋Š” ๋•Œ๋•Œ๋กœ ๋‹ค๋ฅธ ์˜๋ฏธ๋ฅผ ๊ฐ€์ง€๋ฉฐ, ๋ชจ๋ธ์— ์˜๋„ํ•˜์ง€ ์•Š์€ ํŽธํ–ฅ์„ ์ง‘์–ด ๋„ฃ๋Š”๋‹ค๋Š” ๊ธฐ์กด์—๋Š” ๊ฐ„๊ณผ๋˜์–ด์™”๋˜ ์ ๋Œ€์  ํ•™์Šต์˜ ๋ฌธ์ œ๋ฅผ ๋ฐํžŒ๋‹ค. ์šฐ๋ฆฌ๋Š” ์ด๋Ÿฌํ•œ ์˜๋ฏธ๋ฅผ ๋ณด์กดํ•˜์ง€ ์•Š๋Š”, ๊ทธ๋ฆฌ๊ณ  ๊ฒฐ๊ณผ์ ์œผ๋กœ ์• ๋งค๋ชจํ˜ธํ•œ ์ ๋Œ€์  ๋ฐ์ดํ„ฐ๊ฐ€ ๋ชฉํ‘œ ๋ชจ๋ธ์˜ ๊ฐ•๊ฑด์„ฑ์„ ํ•ด์นœ๋‹ค๊ณ  ๊ฐ€์„ค์„ ์„ธ์› ๋‹ค. ์šฐ๋ฆฌ๋Š” ์ด๋Ÿฌํ•œ ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์˜ ์˜๋„ํ•˜์ง€ ์•Š์€ ์˜๋ฏธ์  ๋ณ€ํ™”๋ฅผ ์™„ํ™”ํ•˜๊ธฐ ์œ„ํ•ด, ํ•™์Šต ๋‹จ๊ณ„์—์„œ ์ ๋Œ€์  ์˜ˆ์ œ๋“ค์„ ์ƒ์„ฑํ•  ๋•Œ ๋ชจ๋“  ํด๋ž˜์Šค๋“ค์—๊ฒŒ์„œ ๊ณต์œ ๋˜๋Š” ํ”ฝ์…€์— ๊ต๋ž€ํ•˜๋„๋ก ๊ถŒ์žฅํ•˜๋Š”, ์˜๋ฏธ ๋ณด์กด ์ ๋Œ€์  ํ•™์Šต์„ ์ œ์•ˆํ•œ๋‹ค. ์‹คํ—˜ ๊ฒฐ๊ณผ๋Š” ์˜๋ฏธ ๋ณด์กด ์ ๋Œ€์  ํ•™์Šต์ด ์ ๋Œ€์  ๊ฐ•๊ฑด์„ฑ์„ ๊ฐœ์„ ํ•˜๋ฉฐ, CIFAR-10๊ณผ CIFAR-100๊ณผ STL-10์—์„œ ์ตœ๊ณ ์˜ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•จ์„ ๋ณด์ธ๋‹ค.Chapter 1 Introduction 1 Chapter 2 Preliminaries 5 Chapter 3 Related Works 9 Chapter 4 Semantics-Preserving Adversarial Training 11 4.1 Problem of PGD-training . . . . . . . . . . . . . . . . . . . . . . 11 4.2 Semantics-Preserving Adversarial Training . . . . . . . . . . . . . 13 4.3 Combining with Adversarial Training Variants . . . . . . . . . . 14 Chapter 5 Analysis of Adversarial Examples 16 5.1 Visualizing Various Adversarial Examples . . . . . . . . . . . . . 16 5.2 Comparing the Attack Success Rate . . . . . . . . . . . . . . . . 17 Chapter 6 Experiments & Results 22 6.1 Evaluating Robustness . . . . . . . . . . . . . . . . . . . . . . . . 22 6.1.1 CIFAR-10 & CIFAR-100 . . . . . . . . . . . . . . . . . . . 22 6.1.2 CIFAR-10 with 500K Unlabeled Data . . . . . . . . . . . 24 6.1.3 STL-10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 6.2 Effect of Label Smoothing Hyperparameterฮฑ. . . . . . . . . . . 25 Chapter 7 Conclusion & Future Work 29Maste

    Principles of Neural Network Architecture Design - Invertibility and Domain Knowledge

    Neural network architectures allow a tremendous variety of design choices. In this work, we study two principles underlying these architectures: first, the design and application of invertible neural networks (INNs); second, the incorporation of domain knowledge into neural network architectures. After introducing the mathematical foundations of deep learning, we address the invertibility of standard feedforward neural networks from a mathematical perspective. These results serve as a motivation for our proposed invertible residual networks (i-ResNets). This architecture class is then studied in two scenarios: first, we propose ways to use i-ResNets as a normalizing flow and demonstrate their applicability to high-dimensional generative modeling; second, we study the excessive invariance of common deep image classifiers and discuss consequences for adversarial robustness. We finish with a study of convolutional neural networks for tumor classification based on imaging mass spectrometry (IMS) data. For this application, we propose an adapted architecture guided by our knowledge of the domain of IMS data and show its superior performance on two challenging tumor classification datasets.
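    To make the core idea concrete, here is a minimal PyTorch sketch of an invertible residual block in the spirit of i-ResNets: the block y = x + g(x) is invertible whenever the residual branch g is a contraction, and the inverse is recovered by fixed-point iteration x ← y - g(x). The layer sizes, spectral normalization, and contraction coefficient are illustrative assumptions, not the exact architecture from this work.

    import torch
    import torch.nn as nn
    from torch.nn.utils import spectral_norm

    class InvertibleResidualBlock(nn.Module):
        """Residual block y = x + c * g(x) with Lip(c * g) < 1, hence invertible."""
        def __init__(self, dim, hidden=128, coeff=0.9):
            super().__init__()
            # Spectral normalization bounds each linear map's Lipschitz constant by ~1;
            # ELU is 1-Lipschitz, so scaling by coeff < 1 makes the branch a contraction.
            self.g = nn.Sequential(
                spectral_norm(nn.Linear(dim, hidden)), nn.ELU(),
                spectral_norm(nn.Linear(hidden, dim)),
            )
            self.coeff = coeff

        def forward(self, x):
            return x + self.coeff * self.g(x)

        @torch.no_grad()
        def inverse(self, y, iters=100):
            # Banach fixed-point iteration x_{k+1} = y - c * g(x_k);
            # converges because the residual branch is contractive.
            x = y.clone()
            for _ in range(iters):
                x = y - self.coeff * self.g(x)
            return x

    A quick round-trip check, block.inverse(block(x)), should recover x up to fixed-point tolerance; the same contraction argument is what allows such blocks to be stacked into a normalizing flow.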

    Synthesizing Adversarial Examples for Neural Networks

    As machine learning is being integrated into more and more systems, such as autonomous vehicles or medical devices, these systems are also becoming entry points for attacks. Many state-of-the-art neural networks have been shown to be vulnerable to adversarial examples. These failures of machine learning models demonstrate that even simple algorithms can behave very differently from what their designers intend. To close this gap between what designers intend and how algorithms behave, there is a pressing need to defend against adversarial examples and thereby improve the credibility of these models. This study focuses on synthesizing adversarial examples using two different white-box attacks: the Fast Gradient Sign Method (FGSM) and the Expectation Over Transformation (EOT) method.
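    As a concrete reference for the two attacks named above, here is a minimal PyTorch sketch: FGSM takes a single signed-gradient step, while EOT maximizes the expected loss over a distribution of random transformations so that the perturbation survives them. The choice of transformation (random rotation) and all hyperparameters are illustrative assumptions, not the study's exact setup.

    import torch
    import torch.nn.functional as F
    import torchvision.transforms.functional as TF

    def fgsm(model, x, y, eps=8/255):
        """Fast Gradient Sign Method: x_adv = x + eps * sign(grad_x loss)."""
        x = x.clone().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        return (x + eps * grad.sign()).clamp(0, 1).detach()

    def eot_attack(model, x, y, eps=8/255, alpha=1/255, steps=40, samples=10):
        """Expectation Over Transformation: average gradients over random rotations."""
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            grads = torch.zeros_like(x_adv)
            for _ in range(samples):
                angle = float(torch.empty(1).uniform_(-30, 30))     # sampled transformation
                loss = F.cross_entropy(model(TF.rotate(x_adv, angle)), y)
                grads += torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grads.sign()                # ascend the expected loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)            # project into the eps-ball
                x_adv = x_adv.clamp(0, 1)
        return x_adv.detach()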