Information Bottleneck
The celebrated information bottleneck (IB) principle of Tishby et al. has recently enjoyed renewed attention due to its application in the area of deep learning. This collection investigates the IB principle in this new context. The individual chapters in this collection:
• provide novel insights into the functional properties of the IB;
• discuss the IB principle (and its derivatives) as an objective for training multi-layer machine learning structures such as neural networks and decision trees; and
• offer a new perspective on neural network learning via the lens of the IB framework.
Our collection thus contributes to a better understanding of the IB principle specifically for deep learning and, more generally, of information-theoretic cost functions in machine learning. This paves the way toward explainable artificial intelligence.
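For orientation, the IB principle referred to above is usually stated as a Lagrangian over a stochastic encoding of the input. The symbols below (input X, target Y, representation T, trade-off parameter β) are conventional notation assumed here; the abstract itself does not fix any notation.

```latex
% Standard IB Lagrangian (notation assumed, not taken from the abstract).
% T is a compressed representation of X obeying the Markov chain Y - X - T.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y), \qquad \beta > 0
```

A small β favours compression (low I(X;T)), while a large β favours keeping information about Y (high I(T;Y)).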
To Compress or Not to Compress -- Self-Supervised Learning and Information Theory: A Review
Deep neural networks have demonstrated remarkable performance in supervised
learning tasks but require large amounts of labeled data. Self-supervised
learning offers an alternative paradigm, enabling the model to learn from data
without explicit labels. Information theory has been instrumental in
understanding and optimizing deep neural networks. Specifically, the
information bottleneck principle has been applied to optimize the trade-off
between compression and relevant information preservation in supervised
settings. However, the optimal information objective in self-supervised
learning remains unclear. In this paper, we review various approaches to
self-supervised learning from an information-theoretic standpoint and present a
unified framework that formalizes the self-supervised
information-theoretic learning problem. We integrate existing research into a
coherent framework, examine recent self-supervised methods, and identify
research opportunities and challenges. Moreover, we discuss empirical
measurement of information-theoretic quantities and their estimators. This
paper offers a comprehensive review of the intersection between information
theory, self-supervised learning, and deep neural networks.
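As a small illustration of what "empirical measurement of information-theoretic quantities" can mean in the simplest case, the sketch below computes a plug-in (histogram) estimate of mutual information from paired discrete samples. It is not an estimator taken from the review; the function name `plugin_mutual_information` is hypothetical, and plug-in estimates are known to be biased for small samples and unsuitable for high-dimensional continuous representations.

```python
import numpy as np


def plugin_mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples.

    Hypothetical helper for illustration only: it counts co-occurrences,
    normalises them into an empirical joint distribution, and evaluates
    sum p(x,y) * log(p(x,y) / (p(x) p(y))) over the observed cells.
    """
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    # Map arbitrary labels to integer indices 0..K-1.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    # Empirical joint distribution from co-occurrence counts.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    # Marginals, kept 2-D so their product has the joint's shape.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0  # skip empty cells to avoid log(0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))


# Identical variables share all their information; independent ones share none.
mi_identical = plugin_mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_independent = plugin_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

For continuous neural representations, estimators of this kind generally need binning or are replaced by variational bounds, which is the sort of trade-off the review discusses.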
Reducing Spurious Correlations for Aspect-Based Sentiment Analysis with Variational Information Bottleneck and Contrastive Learning
Deep learning techniques have dominated the literature on aspect-based
sentiment analysis (ABSA), yielding state-of-the-art results. However, these
deep models generally suffer from spurious correlation problems between input
features and output labels, which creates significant barriers to robustness
and generalization capability. In this paper, we propose a novel Contrastive
Variational Information Bottleneck framework (called CVIB) to reduce spurious
correlations for ABSA. The proposed CVIB framework is composed of an original
network and a self-pruned network, and these two networks are optimized
simultaneously via contrastive learning. Concretely, we employ the Variational
Information Bottleneck (VIB) principle to learn an informative and compressed
network (self-pruned network) from the original network, which discards the
superfluous patterns or spurious correlations between input features and
prediction labels. Then, self-pruning contrastive learning is devised to pull
together semantically similar positive pairs and push away dissimilar pairs,
where the representations of the anchor learned by the original and self-pruned
networks respectively are regarded as a positive pair while the representations
of two different sentences within a mini-batch are treated as a negative pair.
To verify the effectiveness of our CVIB method, we conduct extensive
experiments on five benchmark ABSA datasets and the experimental results show
that our approach achieves better performance than the strong competitors in
terms of overall prediction performance, robustness, and generalization
- …
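The two ingredients named in the CVIB abstract above, a VIB compression term and a contrastive loss over original/self-pruned pairs, can be sketched generically as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the Gaussian encoder with a standard-normal prior, the InfoNCE form of the contrastive loss, the cosine similarity, the temperature value, and both function names are choices made here. As a simplification, negatives are taken only across the two networks within the mini-batch.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, logvar):
    """Per-example KL( N(mu, diag(exp(logvar))) || N(0, I) ).

    This is the usual closed-form compression term in VIB-style
    objectives (assumed form; the abstract does not spell it out).
    """
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)


def info_nce(z_original, z_pruned, temperature=0.1):
    """InfoNCE loss where row i of each matrix forms the positive pair.

    Rows with different indices (different sentences in the batch)
    act as negatives.
    """
    a = z_original / np.linalg.norm(z_original, axis=1, keepdims=True)
    b = z_pruned / np.linalg.norm(z_pruned, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # cosine similarities, scaled
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))      # positives sit on the diagonal


# Toy batch: 4 "sentences" with 8-dimensional representations.
rng = np.random.default_rng(0)
z_o = rng.normal(size=(4, 8))               # stand-in for the original network
z_p = z_o + 0.01 * rng.normal(size=(4, 8))  # stand-in for the self-pruned network

kl = gaussian_kl_to_standard_normal(np.zeros((4, 8)), np.zeros((4, 8)))
loss_aligned = info_nce(z_o, z_p)           # correct pairing
loss_shuffled = info_nce(z_o, z_p[::-1])    # deliberately mismatched pairing
```

In a training loop the two terms would typically be added to the task loss with weighting coefficients; how CVIB weights them is not stated in the abstract.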