20 research outputs found
Human uncertainty makes classification more robust
The classification performance of deep neural networks has begun to asymptote
at near-perfect levels. However, their ability to generalize outside the
training set and their robustness to adversarial attacks have not. In this
paper, we make progress on this problem by training with full label
distributions that reflect human perceptual uncertainty. We first present a new
benchmark dataset which we call CIFAR10H, containing a full distribution of
human labels for each image of the CIFAR10 test set. We then show that, while
contemporary classifiers fail to exhibit human-like uncertainty on their own,
explicit training on our dataset closes this gap, supports improved
generalization to increasingly out-of-training-distribution test datasets, and
confers robustness to adversarial attacks.
Comment: In Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV)
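The core idea of training on full human label distributions can be sketched as a soft-label cross-entropy loss. The function and example values below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def soft_label_cross_entropy(probs, label_dist):
    """Cross-entropy against a full (soft) human label distribution
    instead of a one-hot target."""
    eps = 1e-12  # guard against log(0)
    return -np.sum(label_dist * np.log(probs + eps), axis=-1)

# Hypothetical 3-class example: annotators split 60/30/10 on an image.
human_dist = np.array([0.6, 0.3, 0.1])
confident_pred = np.array([0.9, 0.05, 0.05])  # overconfident, near one-hot
matched_pred = np.array([0.6, 0.3, 0.1])      # mirrors human uncertainty

# By Gibbs' inequality, the loss is minimised when the prediction
# matches the human label distribution exactly.
loss_matched = soft_label_cross_entropy(matched_pred, human_dist)
loss_confident = soft_label_cross_entropy(confident_pred, human_dist)
```

In a training loop this loss would simply replace the usual one-hot cross-entropy; the rest of the pipeline is unchanged.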
S2C2 -- An orthogonal method for Semi-Supervised Learning on ambiguous labels
Semi-Supervised Learning (SSL) can decrease the required amount of labeled
image data and thus the cost for deep learning. Most SSL methods assume a clear
distinction between classes, but class boundaries are often ambiguous in
real-world datasets due to intra- or interobserver variability. This ambiguity
of annotations must be addressed as it will otherwise limit the performance of
SSL and deep learning in general due to inconsistent label information. We
propose Semi-Supervised Classification & Clustering (S2C2) which can extend
many deep SSL algorithms. S2C2 automatically estimates the ambiguity of an
image, applies the respective SSL algorithm as a classifier to confidently
labeled data, and partitions the ambiguous data into clusters of visually
similar images. We show that S2C2 yields a 7.6% better F1-score for
classification and a 7.9% lower inner distance of clusters on average across
multiple SSL algorithms and datasets. Moreover, the output of S2C2 can be used
to decrease the ambiguity of labels with the help of human experts. Overall, a
combination of Semi-Supervised Learning with our method S2C2 leads to better
handling of ambiguous labels and thus of real-world datasets.
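The per-image ambiguity estimate that routes samples between classification and clustering could, for instance, be based on prediction entropy. This is a minimal sketch under that assumption; the entropy-based estimator and the threshold are illustrative, not the paper's exact method:

```python
import numpy as np

def estimate_ambiguity(probs):
    """Normalised entropy of the predicted class distribution:
    0 = certain, 1 = maximally ambiguous."""
    eps = 1e-12  # guard against log(0)
    ent = -np.sum(probs * np.log(probs + eps), axis=-1)
    return ent / np.log(probs.shape[-1])

def route(probs, threshold=0.5):
    """Send certain samples to classification, ambiguous ones to clustering
    (hypothetical threshold)."""
    amb = estimate_ambiguity(probs)
    return np.where(amb < threshold, "classify", "cluster")

# Invented predictions: one confident sample, one ambiguous one.
preds = np.array([
    [0.95, 0.03, 0.02],  # low entropy  -> classification branch
    [0.40, 0.35, 0.25],  # high entropy -> clustering branch
])
routing = route(preds)
```

The clustering branch would then group the ambiguous images by visual similarity, and its output could be shown to human experts to reduce label ambiguity.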
On the Dark Side of Calibration for Modern Neural Networks
Modern neural networks are highly miscalibrated. This poses a significant
challenge to using deep neural networks (DNNs) reliably in safety-critical
systems. Many recently proposed approaches have demonstrated substantial
progress in improving DNN calibration. However, they hardly touch upon
refinement, which historically has been an essential aspect of calibration.
Refinement indicates separability of a network's correct and incorrect
predictions. This paper presents a theoretically and empirically supported
exposition for reviewing a model's calibration and refinement. First, we show
the breakdown of expected calibration error (ECE) into predicted confidence
and refinement. Building on this result, we highlight that regularisation-based
calibration focuses only on naively reducing a model's confidence, which comes
at a severe cost to its refinement. We support our claims
through rigorous empirical evaluations of many state-of-the-art calibration
approaches on standard datasets. We find that many calibration approaches,
such as label smoothing and mixup, lower the utility of a DNN by
degrading its refinement. Even under natural data shift, this
calibration-refinement trade-off holds for the majority of calibration methods.
These findings call for an urgent retrospective into some popular pathways
taken for modern DNN calibration.
Comment: 15 pages including references and supplementary material
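The two quantities this abstract contrasts can be sketched numerically: a binned ECE for calibration, and an AUROC-style separability score for refinement (how well confidence ranks correct above incorrect predictions). The binning scheme and toy data below are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """Binned gap between mean confidence and accuracy (a common ECE form)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return ece

def refinement_auroc(conf, correct):
    """Separability of correct vs. incorrect predictions by confidence,
    as AUROC via the rank-sum formulation (assumes no tied confidences)."""
    ranks = np.empty(len(conf))
    ranks[np.argsort(conf)] = np.arange(1, len(conf) + 1)
    pos, neg = correct.sum(), (1 - correct).sum()
    return (ranks[correct == 1].sum() - pos * (pos + 1) / 2) / (pos * neg)

# Invented toy predictions: perfectly refined (correct ones are more
# confident) yet still miscalibrated.
conf = np.array([0.9, 0.8, 0.3, 0.2])
correct = np.array([1, 1, 0, 0])
ece = expected_calibration_error(conf, correct)
refinement = refinement_auroc(conf, correct)
```

The trade-off the paper describes would show up here as calibration methods shrinking `ece` while pushing `refinement` toward the chance level of 0.5.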
Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings
Advances in object recognition flourished in part because of the availability
of high-quality datasets and associated benchmarks. However, these
benchmarks---such as ILSVRC---are relatively task-specific, focusing
predominantly on predicting class labels. We introduce a publicly available
dataset that embodies the task-general capabilities of human perception and
reasoning. The Human Similarity Judgments extension to ImageNet (ImageNet-HSJ)
is composed of human similarity judgments that supplement the ILSVRC validation
set. The new dataset supports a range of task and performance metrics,
including the evaluation of unsupervised learning algorithms. We demonstrate
two methods of assessment: using the similarity judgments directly and using a
psychological embedding trained on the similarity judgments. This embedding
space contains an order of magnitude more points (i.e., images) than previous
efforts based on human judgments. Scaling to the full 50,000 image set was made
possible through a selective sampling process that used variational Bayesian
inference and model ensembles to sample aspects of the embedding space that
were most uncertain. This methodological innovation not only enables scaling,
but should also improve the quality of solutions by focusing sampling where it
is needed. To demonstrate the utility of ImageNet-HSJ, we used the similarity
ratings and the embedding space to evaluate how well several popular models
conform to human similarity judgments. One finding is that more complex models
that perform better on task-specific benchmarks do not better conform to human
semantic judgments. In addition to the human similarity judgments, pre-trained
psychological embeddings and code for inferring variational embeddings are made
publicly available. Collectively, ImageNet-HSJ assets support the appraisal of
internal representations and the development of more human-like models.
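One way to assess how well a model conforms to human similarity judgments, as the evaluation above does, is to correlate the model's pairwise embedding similarities with human ratings. The embeddings, image pairs, and ratings below are invented for illustration; real evaluations would use ImageNet-HSJ's judgments:

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation (toy version; assumes no tied values)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

# Hypothetical embeddings for four images and human ratings for three pairs.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]])
pairs = [(0, 1), (0, 2), (2, 3)]
human = np.array([0.95, 0.10, 0.90])  # invented human similarity judgments

# Cosine similarity of the model's embeddings for each judged pair.
e = emb / np.linalg.norm(emb, axis=1, keepdims=True)
model_sims = np.array([e[i] @ e[j] for i, j in pairs])
agreement = spearman(model_sims, human)
```

A rank correlation is a natural choice here because human ratings and model similarities live on different scales; only their orderings are compared.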
Computational scientific discovery in psychology
Scientific discovery is a driving force for progress, involving creative problem-solving processes to further our understanding of the world. Historically, the process of scientific discovery has been intensive and time-consuming; however, advances in computational power and algorithms have provided an efficient route to making new discoveries. Complex tools using artificial intelligence (AI) can efficiently analyse data as well as generate new hypotheses and theories. Along with AI becoming increasingly prevalent in our daily lives and the services we access, its application to different scientific domains is becoming more widespread. For example, AI has been used for early detection of medical conditions, identifying treatments and vaccines (e.g., against COVID-19), and predicting protein structure. The application of AI in psychological science has started to become popular. AI can assist in new discoveries both as a tool that gives scientists more freedom to generate new theories, and by making creative discoveries autonomously. Conversely, psychological concepts such as heuristics have refined and improved artificial systems. With such powerful systems, however, there are key ethical and practical issues to consider. This review addresses the current and future directions of computational scientific discovery generally and its applications in psychological science more specifically.