47,876 research outputs found
Exploring the Space of Adversarial Images
Adversarial examples have raised questions regarding the robustness and
security of deep neural networks. In this work we formalize the problem of
adversarial images given a pretrained classifier, showing that even in the
linear case the resulting optimization problem is nonconvex. We generate
adversarial images using shallow and deep classifiers on the MNIST and ImageNet
datasets. We probe the pixel space of adversarial images using noise of varying
intensity and distribution. We bring novel visualizations that showcase the
phenomenon and its high variability. We show that adversarial images appear in
large regions in the pixel space, but that, for the same task, a shallow
classifier seems more robust to adversarial images than a deep convolutional
network.Comment: Copyright 2016 IEEE. This manuscript was accepted at the IEEE
International Joint Conference on Neural Networks (IJCNN) 2016. We will link
the published version as soon as the DOI is availabl
The Devil of Face Recognition is in the Noise
The growing scale of face recognition datasets empowers us to train strong
convolutional networks for face recognition. While a variety of architectures
and loss functions have been devised, we still have a limited understanding of
the source and consequence of label noise inherent in existing datasets. We
make the following contributions: 1) We contribute cleaned subsets of popular
face databases, i.e., MegaFace and MS-Celeb-1M datasets, and build a new
large-scale noise-controlled IMDb-Face dataset. 2) With the original datasets
and cleaned subsets, we profile and analyze label noise properties of MegaFace
and MS-Celeb-1M. We show that a few orders more samples are needed to achieve
the same accuracy yielded by a clean subset. 3) We study the association
between different types of noise, i.e., label flips and outliers, with the
accuracy of face recognition models. 4) We investigate ways to improve data
cleanliness, including a comprehensive user study on the influence of data
labeling strategies to annotation accuracy. The IMDb-Face dataset has been
released on https://github.com/fwang91/IMDb-Face.Comment: accepted to ECCV'1
Robust Decision Trees Against Adversarial Examples
Although adversarial examples and model robustness have been extensively
studied in the context of linear models and neural networks, research on this
issue in tree-based models and how to make tree-based models robust against
adversarial examples is still limited. In this paper, we show that tree based
models are also vulnerable to adversarial examples and develop a novel
algorithm to learn robust trees. At its core, our method aims to optimize the
performance under the worst-case perturbation of input features, which leads to
a max-min saddle point problem. Incorporating this saddle point objective into
the decision tree building procedure is non-trivial due to the discrete nature
of trees --- a naive approach to finding the best split according to this
saddle point objective will take exponential time. To make our approach
practical and scalable, we propose efficient tree building algorithms by
approximating the inner minimizer in this saddle point problem, and present
efficient implementations for classical information gain based trees as well as
state-of-the-art tree boosting models such as XGBoost. Experimental results on
real world datasets demonstrate that the proposed algorithms can substantially
improve the robustness of tree-based models against adversarial examples
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
In this paper, we study the problem of learning image classification models
with label noise. Existing approaches depending on human supervision are
generally not scalable as manually identifying correct or incorrect labels is
time-consuming, whereas approaches not relying on human supervision are
scalable but less effective. To reduce the amount of human supervision for
label noise cleaning, we introduce CleanNet, a joint neural embedding network,
which only requires a fraction of the classes being manually verified to
provide the knowledge of label noise that can be transferred to other classes.
We further integrate CleanNet and conventional convolutional neural network
classifier into one framework for image classification learning. We demonstrate
the effectiveness of the proposed algorithm on both of the label noise
detection task and the image classification on noisy data task on several
large-scale datasets. Experimental results show that CleanNet can reduce label
noise detection error rate on held-out classes where no human supervision
available by 41.5% compared to current weakly supervised methods. It also
achieves 47% of the performance gain of verifying all images with only 3.2%
images verified on an image classification task. Source code and dataset will
be available at kuanghuei.github.io/CleanNetProject.Comment: Accepted to CVPR 201
Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/ licenses/by/4.0
- …