
    Analyzing and Improving Representations with the Soft Nearest Neighbor Loss

    We explore and expand the Soft Nearest Neighbor Loss to measure the entanglement of class manifolds in representation space: i.e., how close pairs of points from the same class are relative to pairs of points from different classes. We demonstrate several use cases of the loss. As an analytical tool, it provides insights into the evolution of class similarity structures during learning. Surprisingly, we find that maximizing the entanglement of representations of different classes in the hidden layers is beneficial for discrimination in the final layer, possibly because it encourages representations to identify class-independent similarity structures. Maximizing the soft nearest neighbor loss in the hidden layers leads not only to improved generalization but also to better-calibrated estimates of uncertainty on outlier data. Data that is not from the training distribution can be recognized by observing that, in the hidden layers, it has fewer than the normal number of neighbors from the predicted class.
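    The loss itself is not reproduced in this listing. As a point of reference, the sketch below is a minimal, unofficial PyTorch rendering of a soft-nearest-neighbor-style loss; the function name, the use of squared Euclidean distances, and the temperature parameter are illustrative assumptions rather than details confirmed by the abstract.

```python
import torch

def soft_nearest_neighbor_loss(features, labels, temperature=1.0):
    """Illustrative soft-nearest-neighbor-style loss over a batch.

    Low values indicate that points sit closer to same-class points than to
    points of other classes (low entanglement); high values indicate high
    entanglement of class manifolds.
    """
    # Pairwise squared Euclidean distances between all points in the batch.
    dists = torch.cdist(features, features, p=2).pow(2)

    # Similarity kernel; zero out self-similarity on the diagonal.
    sims = torch.exp(-dists / temperature)
    sims = sims - torch.diag(torch.diag(sims))

    # Boolean mask selecting pairs of points that share a label.
    same_class = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()

    eps = 1e-8
    numerator = (sims * same_class).sum(dim=1)   # weight on same-class neighbors
    denominator = sims.sum(dim=1)                # weight on all neighbors
    return -torch.log(eps + numerator / (eps + denominator)).mean()
```

    Under this reading, minimizing the value pulls same-class points together, while maximizing it at hidden layers, as the abstract suggests, increases entanglement there.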

    MetaAdvDet: Towards Robust Detection of Evolving Adversarial Attacks

    Deep neural networks (DNNs) are vulnerable to adversarial attacks, in which human-imperceptible perturbations are maliciously added to images to cause incorrect predictions. Existing studies have proposed various methods to detect adversarial attacks. However, new attack methods evolve constantly and yield new adversarial examples that bypass existing detectors. Training a detector typically requires tens of thousands of samples, yet new attacks evolve much faster than such high-cost data collection, so samples of newly evolved attacks remain available only at small scale. To solve this few-shot problem with evolving attacks, we propose a meta-learning based robust detection method that detects new adversarial attacks from limited examples. Specifically, learning follows a double-network framework: a task-dedicated network and a master network alternately learn the detection capability for either a seen attack or a new attack. To validate the effectiveness of our approach, we construct benchmarks with few-shot protocols based on three conventional datasets, i.e., CIFAR-10, MNIST and Fashion-MNIST. Comprehensive experiments on them verify the superiority of our approach over traditional adversarial attack detection methods. Comment: 10 pages, 2 figures, accepted as a conference paper in the Proceedings of the 27th ACM International Conference on Multimedia (MM'19).
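    The abstract only outlines the double-network idea. The sketch below is a hypothetical, first-order MAML/Reptile-style rendering of a task-dedicated network adapted from a master network on few-shot attack-detection tasks; the function name, optimizers, learning rates, and the first-order gradient shortcut are assumptions for illustration, not the paper's actual training procedure.

```python
import copy
import torch

def meta_train_step(master_net, meta_opt, tasks, loss_fn,
                    inner_lr=0.01, inner_steps=1):
    """One meta-update in a double-network setup (illustrative sketch).

    For each few-shot detection task, a task-dedicated copy of the master
    network is adapted on the support set; the master network is then
    updated from the adapted copies' query-set losses.
    """
    meta_opt.zero_grad()

    for support_x, support_y, query_x, query_y in tasks:
        # Task-dedicated network starts as a clone of the master network.
        task_net = copy.deepcopy(master_net)
        inner_opt = torch.optim.SGD(task_net.parameters(), lr=inner_lr)

        # Inner loop: adapt to the small support set of a seen or new attack.
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            loss_fn(task_net(support_x), support_y).backward()
            inner_opt.step()

        # Outer loop: evaluate the adapted network on the query set and
        # accumulate first-order gradients into the master network.
        query_loss = loss_fn(task_net(query_x), query_y)
        grads = torch.autograd.grad(query_loss, task_net.parameters())
        for p, g in zip(master_net.parameters(), grads):
            p.grad = g.clone() if p.grad is None else p.grad + g

    meta_opt.step()
```

    The key design point this sketch tries to capture is that each attack type is treated as a separate few-shot task, so the master network accumulates detection ability that transfers to a new attack from only a handful of its examples.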