Unsupervised Learning via Total Correlation Explanation
Learning by children and animals occurs effortlessly and largely without
obvious supervision. Successes in automating supervised learning have not
translated to the more ambiguous realm of unsupervised learning where goals and
labels are not provided. Barlow (1961) suggested that the signal that brains
leverage for unsupervised learning is dependence, or redundancy, in the sensory
environment. Dependence can be characterized using the information-theoretic
multivariate mutual information measure called total correlation. The principle
of Total Cor-relation Ex-planation (CorEx) is to learn representations of data
that "explain" as much dependence in the data as possible. We review some
manifestations of this principle along with successes in unsupervised learning
problems across diverse domains including human behavior, biology, and
language.
Comment: Invited contribution for IJCAI 2017 Early Career Spotlight. 5 pages, 1 figure.
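To make the abstract's central quantity concrete: total correlation is TC(X) = H(X_1) + ... + H(X_n) - H(X_1, ..., X_n), the KL divergence between the joint distribution and the product of its marginals, and it is zero exactly when the variables are independent. Below is a minimal numpy sketch (our illustration, not the CorEx algorithm itself; all function names are ours) that estimates TC from discrete samples:

```python
import numpy as np

def entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def total_correlation(X):
    """Estimate TC(X) = sum_i H(X_i) - H(X_1, ..., X_n) from discrete samples.

    X is a (n_samples, n_variables) array of discrete values. TC is zero
    iff the variables are independent, so it measures total dependence.
    """
    marginals = sum(entropy(np.unique(X[:, i], return_counts=True)[1])
                    for i in range(X.shape[1]))
    # Joint entropy: treat each full row as one symbol of the joint variable.
    joint_counts = np.unique(X, axis=0, return_counts=True)[1]
    return marginals - entropy(joint_counts)

# Two identical bits plus one independent bit: TC should be ~log 2 (0.693 nats).
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=10_000)
b = rng.integers(0, 2, size=10_000)
print(total_correlation(np.column_stack([a, a, b])))
```

CorEx itself goes further: rather than merely measuring TC, it searches for latent factors Y that maximize the dependence they explain, TC(X) - TC(X | Y).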
Invariant Information Clustering for Unsupervised Image Classification and Segmentation
We present a novel clustering objective that learns a neural network
classifier from scratch, given only unlabelled data samples. The model
discovers clusters that accurately match semantic classes, achieving
state-of-the-art results in eight unsupervised clustering benchmarks spanning
image classification and segmentation. These include STL10, an unsupervised
variant of ImageNet, and CIFAR10, where we significantly beat the accuracy of
our closest competitors by 6.6 and 9.5 absolute percentage points respectively.
The method is not specialised to computer vision and operates on any paired
dataset samples; in our experiments we use random transforms to obtain a pair
from each image. The trained network directly outputs semantic labels, rather
than high dimensional representations that need external processing to be
usable for semantic clustering. The objective is simply to maximise mutual
information between the class assignments of each pair. It is easy to implement
and rigorously grounded in information theory, meaning we effortlessly avoid
degenerate solutions that other clustering methods are susceptible to. In
addition to the fully unsupervised mode, we also test two semi-supervised
settings. The first achieves 88.8% accuracy on STL10 classification, setting a
new global state-of-the-art over all existing methods (whether supervised,
semi-supervised or unsupervised). The second shows robustness to 90% reductions
in label coverage, of relevance to applications that wish to make use of small
amounts of labels. github.com/xu-ji/IIC
Comment: International Conference on Computer Vision 2019
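The objective the abstract names is compact enough to state directly: estimate the joint distribution over the pair of class assignments and maximize its mutual information. A PyTorch sketch of that objective (our rendering, written from the abstract's description; the authors' implementation is at the repo linked above):

```python
import torch

def iic_objective(p, p_prime, eps=1e-8):
    """Negative mutual information between paired soft class assignments.

    p and p_prime are (batch, n_classes) softmax outputs for an image and
    its randomly transformed partner. Minimizing the returned value
    maximizes I(z; z') under the empirical joint over class assignments.
    """
    # Empirical joint P(z, z'), averaged over the batch and symmetrized.
    joint = p.t() @ p_prime / p.shape[0]      # (n_classes, n_classes)
    joint = ((joint + joint.t()) / 2).clamp(min=eps)
    pi = joint.sum(dim=1, keepdim=True)       # marginal P(z)
    pj = joint.sum(dim=0, keepdim=True)       # marginal P(z')
    # I(z; z') = sum_ij P(i, j) * log( P(i, j) / (P(i) P(j)) )
    return -(joint * (joint.log() - pi.log() - pj.log())).sum()
```

Because I(z; z') = H(z) - H(z | z'), maximizing it rewards assignments that are both confident and consistent across the pair, while the marginal-entropy term discourages collapsing every sample into a single cluster, which is the degenerate solution the abstract alludes to.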
To Compress or Not to Compress -- Self-Supervised Learning and Information Theory: A Review
Deep neural networks have demonstrated remarkable performance in supervised
learning tasks but require large amounts of labeled data. Self-supervised
learning offers an alternative paradigm, enabling the model to learn from data
without explicit labels. Information theory has been instrumental in
understanding and optimizing deep neural networks. Specifically, the
information bottleneck principle has been applied to optimize the trade-off
between compression and relevant information preservation in supervised
settings. However, the optimal information objective in self-supervised
learning remains unclear. In this paper, we review various approaches to
self-supervised learning from an information-theoretic standpoint and present a
unified framework that formalizes the "self-supervised information-theoretic
learning problem". We integrate existing research into a
coherent framework, examine recent self-supervised methods, and identify
research opportunities and challenges. Moreover, we discuss empirical
measurement of information-theoretic quantities and their estimators. This
paper offers a comprehensive review of the intersection between information
theory, self-supervised learning, and deep neural networks.
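On the empirical-measurement point: the self-supervised methods such a review covers typically optimize variational bounds rather than exact mutual information. One widely used example is the InfoNCE bound; the PyTorch sketch below is our illustration of it, not code from the paper (the function name and arguments are hypothetical):

```python
import math
import torch
import torch.nn.functional as F

def infonce_bound(z1, z2, temperature=0.1):
    """InfoNCE lower bound on I(z1; z2) from paired embeddings.

    z1 and z2 are (batch, dim) embeddings of two views of the same samples.
    The bound is log(batch) - cross_entropy, where the classification task
    is to pick each sample's true partner out of the batch.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature    # pairwise similarities
    labels = torch.arange(z1.shape[0])    # positives lie on the diagonal
    ce = F.cross_entropy(logits, labels)
    return math.log(z1.shape[0]) - ce     # saturates at log(batch)
```

The bound cannot exceed log(batch size), which is one reason the empirical estimation of large mutual-information quantities is delicate.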