392,254 research outputs found
On Network Science and Mutual Information for Explaining Deep Neural Networks
In this paper, we present a new approach to interpreting deep learning models.
By coupling mutual information with network science, we explore how information
flows through feedforward networks. We show that efficiently approximating
mutual information allows us to create an information measure that quantifies
how much information flows between any two neurons of a deep learning model. To
that end, we propose NIF, Neural Information Flow, a technique for codifying
information flow that exposes deep learning model internals and provides
feature attributions.
Comment: ICASSP 2020 (shorter version appeared at AAAI-19 Workshop on Network Interpretability for Deep Learning)
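
Below is a minimal sketch of the core measurement the abstract describes: a binned estimate of the mutual information between the activations of two neurons recorded over a batch of inputs. The histogram estimator, bin count, and variable names are illustrative assumptions, not the exact approximation used by NIF.

# Illustrative sketch: histogram-based mutual information between the
# recorded activations of two neurons. The binning estimator is an
# assumption for illustration, not the approximation NIF itself uses.
import numpy as np

def mutual_information(a, b, bins=16):
    """Estimate I(A; B) in nats from paired activation samples a and b."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()                   # joint distribution
    px = pxy.sum(axis=1, keepdims=True)         # marginal of A
    py = pxy.sum(axis=0, keepdims=True)         # marginal of B
    nz = pxy > 0                                # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Example: a strongly coupled pair of neurons versus an independent pair.
rng = np.random.default_rng(0)
neuron_a = rng.normal(size=1000)
neuron_b = neuron_a + 0.5 * rng.normal(size=1000)
print(mutual_information(neuron_a, neuron_b))               # large value
print(mutual_information(neuron_a, rng.normal(size=1000)))  # close to 0

In a full information-flow analysis, such pairwise estimates would be computed between neurons in adjacent layers and aggregated along paths through the network.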
Mutual Exclusivity Loss for Semi-Supervised Deep Learning
In this paper, we consider the problem of semi-supervised learning with deep
Convolutional Neural Networks (ConvNets). Semi-supervised learning is motivated
by the observation that unlabeled data is cheap and can be used to improve the
accuracy of classifiers. We propose an unsupervised regularization term that
explicitly forces the classifier's predictions for multiple classes to be
mutually exclusive and effectively guides the decision boundary to lie in the
low-density region between the manifolds corresponding to different classes of
data. Our proposed approach is general and can be used with any
backpropagation-based learning method. We show through different experiments
that our method can improve the object recognition performance of ConvNets
using unlabeled data.
Comment: 5 pages, 1 figure, ICIP 201
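
As a rough illustration of the kind of regularizer described above, the sketch below penalizes unlabeled predictions that spread probability mass over several classes. This specific formulation (rewarding prediction vectors close to one-hot) and the helper name are assumptions for illustration, not necessarily the paper's exact loss.

# Illustrative mutual-exclusivity penalty on unlabeled data: it is zero
# for one-hot predictions and grows as probability mass spreads over
# several classes, pushing the decision boundary into low-density regions.
# This formulation is an assumption, not necessarily the paper's exact term.
import torch
import torch.nn.functional as F

def mutual_exclusivity_loss(probs, eps=1e-8):
    """probs: (batch, num_classes) predicted class probabilities."""
    log_one_minus = torch.log(1.0 - probs + eps)
    total = log_one_minus.sum(dim=1, keepdim=True)
    # prod_{l != k} (1 - p_l), computed stably in log space.
    prod_others = torch.exp(total - log_one_minus)
    # sum_k p_k * prod_{l != k} (1 - p_l) equals 1 only for one-hot vectors.
    exclusivity = (probs * prod_others).sum(dim=1)
    return (1.0 - exclusivity).mean()

# Usage: add the penalty on an unlabeled batch to the supervised loss.
logits = torch.randn(8, 10)
unsup_loss = mutual_exclusivity_loss(F.softmax(logits, dim=1))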
Deep Mutual Learning
Model distillation is an effective and widely used technique for transferring
knowledge from a teacher network to a student network. The typical application
is to transfer from a powerful large network or ensemble to a small network
that is better suited to low-memory or fast-execution requirements. In this
paper, we present a deep mutual learning (DML) strategy in which, rather than
one-way transfer from a static, pre-defined teacher to a student, an ensemble
of students learns collaboratively, with the students teaching each other
throughout the training process. Our experiments show that a variety of network
architectures benefit from mutual learning and achieve compelling results on
the CIFAR-100 recognition and Market-1501 person re-identification benchmarks.
Surprisingly, no prior powerful teacher network is necessary: mutual learning
of a collection of simple student networks works, and it even outperforms
distillation from a more powerful yet static teacher.
Comment: 10 pages, 4 figures
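
A minimal two-student sketch of such a mutual-learning update follows: each student minimizes its own supervised loss plus a KL term that pulls it toward its peers' current predictions. The toy models, equal loss weighting, and function names are simplifying assumptions for illustration rather than the paper's exact training recipe.

# Illustrative Deep Mutual Learning step for a cohort of student networks:
# each student is updated on its supervised loss plus a KL term matching
# the peers' current (detached) predictions. Models, data, and the unit
# weighting of the KL term are simplifications for this sketch.
import torch
import torch.nn as nn
import torch.nn.functional as F

def dml_step(students, optimizers, x, y):
    for i, (model, opt) in enumerate(zip(students, optimizers)):
        logits = model(x)
        loss = F.cross_entropy(logits, y)        # supervised term
        log_p = F.log_softmax(logits, dim=1)
        with torch.no_grad():                    # peers act as soft targets
            peers = [F.softmax(m(x), dim=1)
                     for j, m in enumerate(students) if j != i]
        for peer_p in peers:                     # mimicry term: KL(peer || self)
            loss = loss + F.kl_div(log_p, peer_p,
                                   reduction="batchmean") / len(peers)
        opt.zero_grad()
        loss.backward()
        opt.step()

# Toy usage: two small classifiers learning from each other on random data.
students = [nn.Linear(32, 10), nn.Linear(32, 10)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.1) for m in students]
dml_step(students, optimizers, torch.randn(16, 32), torch.randint(0, 10, (16,)))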
- …