
    Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning

    Intentionally crafted adversarial samples have effectively exploited weaknesses in deep neural networks. A standard framework in adversarial robustness defends against samples crafted by minimally perturbing an input so that the corresponding model output changes. These sensitivity attacks exploit the model's sensitivity toward task-irrelevant features. Another form of adversarial sample can be crafted via invariance attacks, which exploit the model underestimating the importance of relevant features. Previous literature has indicated a tradeoff in defending against both attack types within a strictly L_p-bounded defense. To promote robustness toward both types of attacks beyond Euclidean distance metrics, we use metric learning to frame adversarial regularization as an optimal transport problem. Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
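
    As a point of reference for the attack taxonomy above, a sensitivity attack is usually instantiated as a small, norm-bounded perturbation search such as projected gradient descent (PGD). The sketch below is a minimal, generic PGD attack in PyTorch, not the paper's optimal-transport regularizer; the function name and the eps, alpha, and steps values are illustrative assumptions.

        import torch
        import torch.nn.functional as F

        def pgd_sensitivity_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
            # Sensitivity attack: find a small L_inf-bounded perturbation of x
            # that changes the model's output away from the true label y.
            x_adv = x.clone().detach()
            for _ in range(steps):
                x_adv.requires_grad_(True)
                loss = F.cross_entropy(model(x_adv), y)
                grad = torch.autograd.grad(loss, x_adv)[0]
                # Take a signed-gradient ascent step, then project back into
                # the eps-ball around the clean input and the valid pixel range.
                x_adv = x_adv.detach() + alpha * grad.sign()
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
            return x_adv.detach()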

    Event sequence metric learning

    In this paper we consider the challenging problem of learning discriminative vector representations for event sequences generated by real-world users. Vector representations map raw behavioral client data to low-dimensional, fixed-length vectors in a latent space. We propose a novel method for learning these embeddings based on a metric learning approach, together with a strategy for generating subsequences of the raw data that allows the metric learning approach to be applied in a fully self-supervised way. We evaluated the method on several public bank-transaction datasets and show that the self-supervised embeddings outperform other methods when applied to downstream classification tasks. Moreover, the embeddings are compact and provide additional protection of user privacy.
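
    The core recipe described above (treating random subsequences of one user's event stream as positive pairs and learning embeddings with a metric-learning objective) can be sketched as follows. This is a simplified PyTorch illustration under assumed names (sample_subsequences, SeqEncoder, an in-batch contrastive loss), not the paper's exact architecture or loss.

        import random
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        def sample_subsequences(events, n_views=2, min_len=8):
            # Self-supervised view generation: random subsequences of the same
            # user's event sequence are treated as positives for each other.
            views = []
            for _ in range(n_views):
                length = random.randint(min(min_len, len(events)), len(events))
                start = random.randint(0, len(events) - length)
                views.append(events[start:start + length])
            return views

        class SeqEncoder(nn.Module):
            # Maps a variable-length sequence of event-type ids to a
            # fixed-length, unit-norm embedding vector.
            def __init__(self, n_event_types, emb_dim=32, hidden=64):
                super().__init__()
                self.emb = nn.Embedding(n_event_types, emb_dim)
                self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)

            def forward(self, seq):              # seq: (batch, time) int64 event ids
                _, h = self.rnn(self.emb(seq))
                return F.normalize(h[-1], dim=-1)

        def in_batch_contrastive_loss(z1, z2, temperature=0.1):
            # Embeddings of two views of the same user attract; embeddings of
            # other users in the batch act as negatives.
            logits = z1 @ z2.t() / temperature
            targets = torch.arange(z1.size(0), device=z1.device)
            return F.cross_entropy(logits, targets)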

    Adversarial Feature Stacking for Accurate and Robust Predictions

    Deep Neural Networks (DNNs) have achieved remarkable performance on a variety of applications but are extremely vulnerable to adversarial perturbations. To address this issue, various defense methods have been proposed to enhance model robustness. Unfortunately, the most representative and promising methods, such as adversarial training and its variants, usually degrade model accuracy on benign samples, limiting their practical utility. This indicates that it is difficult to extract features that are both robust and accurate with a single network under certain conditions, such as limited training data, resulting in a trade-off between accuracy and robustness. To tackle this problem, we propose an Adversarial Feature Stacking (AFS) model that jointly exploits features with varied levels of robustness and accuracy, significantly alleviating the aforementioned trade-off. Specifically, we adopt multiple networks adversarially trained with different perturbation budgets to extract either more robust or more accurate features. These features are then fused by a learnable merger to produce the final predictions. We evaluate the AFS model on the CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods, and it significantly advances the state of the art in terms of the trade-off. Without extra training data, the AFS model achieves a benign accuracy improvement of 6% on CIFAR-10 and 9% on CIFAR-100 with comparable or even stronger robustness than state-of-the-art adversarial training methods. This work demonstrates the feasibility of obtaining both accurate and robust models under limited training data.
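
    A minimal sketch of the stacking idea, in which several encoders adversarially trained with different perturbation budgets are frozen and fused by a small learnable merger, is given below. The class name, the linear merger, and the example budgets are assumptions for illustration rather than the paper's exact configuration.

        import torch
        import torch.nn as nn

        class AdversarialFeatureStacking(nn.Module):
            # Fuses features from encoders trained with different perturbation
            # budgets (e.g. eps = 0, 2/255, 8/255): small-budget encoders tend to
            # give more accurate features, large-budget encoders more robust ones.
            def __init__(self, encoders, feat_dim, n_classes):
                super().__init__()
                self.encoders = nn.ModuleList(encoders)
                for enc in self.encoders:        # pre-trained encoders stay frozen
                    for p in enc.parameters():
                        p.requires_grad_(False)
                self.merger = nn.Linear(feat_dim * len(encoders), n_classes)

            def forward(self, x):
                feats = [enc(x) for enc in self.encoders]
                return self.merger(torch.cat(feats, dim=1))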

    Semantics-Preserving Adversarial Training

    Master's thesis (M.S.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, February 2021. Advisor: Sang-goo Lee.
    Adversarial training is a defense technique that improves the adversarial robustness of a deep neural network (DNN) by including adversarial examples in the training data. In this paper, we identify an overlooked problem of adversarial training: the adversarial examples it generates often have different semantics than the original data, introducing unintended biases into the model. We hypothesize that such non-semantics-preserving (and consequently ambiguous) adversarial data harm the robustness of the target models. To mitigate such unintended semantic changes of adversarial examples, we propose semantics-preserving adversarial training (SPAT), which encourages perturbation of the pixels that are shared among all classes when generating adversarial examples in the training stage. Experimental results show that SPAT improves adversarial robustness and achieves state-of-the-art results on CIFAR-10, CIFAR-100, and STL-10.
    Table of contents: Chapter 1 Introduction; Chapter 2 Preliminaries; Chapter 3 Related Works; Chapter 4 Semantics-Preserving Adversarial Training (4.1 Problem of PGD-Training; 4.2 Semantics-Preserving Adversarial Training; 4.3 Combining with Adversarial Training Variants); Chapter 5 Analysis of Adversarial Examples (5.1 Visualizing Various Adversarial Examples; 5.2 Comparing the Attack Success Rate); Chapter 6 Experiments & Results (6.1 Evaluating Robustness: CIFAR-10 & CIFAR-100, CIFAR-10 with 500K Unlabeled Data, STL-10; 6.2 Effect of Label Smoothing Hyperparameter α); Chapter 7 Conclusion & Future Work.
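
    The key step of SPAT, restricting the training-time perturbation to pixels that are shared among all classes, could be prototyped as in the sketch below. How the shared-pixel mask is computed here (an intersection of low-saliency regions across classes) is an assumption made for illustration; the thesis defines its own criterion for shared pixels.

        import torch
        import torch.nn.functional as F

        def shared_pixel_mask(model, x, n_classes, thresh=0.2):
            # Illustrative mask: a pixel counts as "shared" if its input-gradient
            # saliency is low for every class, i.e. no class relies on it strongly.
            masks = []
            for c in range(n_classes):
                x_req = x.clone().requires_grad_(True)
                score = model(x_req)[:, c].sum()
                sal = torch.autograd.grad(score, x_req)[0].abs().amax(dim=1, keepdim=True)
                sal = sal / (sal.amax(dim=(2, 3), keepdim=True) + 1e-12)
                masks.append(sal < thresh)
            return torch.stack(masks).all(dim=0).float()   # (batch, 1, H, W)

        def spat_adversarial_example(model, x, y, mask, eps=8/255, alpha=2/255, steps=10):
            # PGD restricted to the shared-pixel mask, so the perturbation is less
            # likely to change the semantics (class-specific content) of the image.
            x_adv = x.clone().detach()
            for _ in range(steps):
                x_adv.requires_grad_(True)
                loss = F.cross_entropy(model(x_adv), y)
                grad = torch.autograd.grad(loss, x_adv)[0]
                x_adv = x_adv.detach() + alpha * grad.sign() * mask
                x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
            return x_adv.detach()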