Knowledge Transfer with Jacobian Matching
Classical distillation methods transfer representations from a "teacher"
neural network to a "student" network by matching their output activations.
Recent methods also match the Jacobians, i.e., the gradients of the output
activations with respect to the input. However, this involves making some ad hoc
decisions, in particular the choice of the loss function.
In this paper, we first establish an equivalence between Jacobian matching
and distillation with input noise, from which we derive appropriate loss
functions for Jacobian matching. We then rely on this analysis to apply
Jacobian matching to transfer learning by establishing equivalence of a recent
transfer learning procedure to distillation.
We then show experimentally on standard image datasets that Jacobian-based
penalties improve distillation, robustness to noisy inputs, and transfer
learning.
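The abstract does not include code, but the following is a minimal PyTorch sketch of the general idea: an activation-matching distillation term combined with a Jacobian-matching penalty. All names (student, teacher, alpha, beta) are illustrative, and matching the gradient of the summed outputs is a simplification rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def distillation_with_jacobian(student, teacher, x, alpha=1.0, beta=1.0):
    """Return a combined soft-target + Jacobian-matching loss for one batch (sketch)."""
    x = x.clone().requires_grad_(True)

    t_logits = teacher(x)
    s_logits = student(x)

    # Classical distillation term: match the output activations (soft targets).
    kd = F.kl_div(
        F.log_softmax(s_logits, dim=1),
        F.softmax(t_logits, dim=1).detach(),
        reduction="batchmean",
    )

    # Jacobian term: match gradients of the (summed) outputs w.r.t. the input.
    # create_graph=True on the student side keeps this term differentiable.
    t_grad, = torch.autograd.grad(t_logits.sum(), x)
    s_grad, = torch.autograd.grad(s_logits.sum(), x, create_graph=True)
    jac = F.mse_loss(s_grad, t_grad.detach())

    return alpha * kd + beta * jac
```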
Full-Gradient Representation for Neural Network Visualization
We introduce a new tool for interpreting neural net responses, namely
full-gradients, which decomposes the neural net response into input sensitivity
and per-neuron sensitivity components. This is the first proposed
representation that satisfies two key properties, completeness and weak
dependence, which provably cannot both be satisfied by any saliency map-based
interpretability method. For convolutional nets, we also propose an approximate
saliency map representation, called FullGrad, obtained by aggregating the
full-gradient components.
We experimentally evaluate the usefulness of FullGrad in explaining model
behaviour with two quantitative tests: pixel perturbation and
remove-and-retrain. Our experiments reveal that our method explains model
behaviour correctly, and more comprehensively than other methods in the
literature. Visual inspection also reveals that our saliency maps are sharper
and more tightly confined to object regions than other methods.
Comment: NeurIPS 2019
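As a rough illustration of the aggregation idea, the sketch below combines an input-gradient term with per-layer bias-gradient terms for a convolutional net. It assumes the biased layers are nn.Conv2d / nn.BatchNorm2d, treats m.bias as the per-neuron bias (a simplification for batch norm), and is not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def full_grad_saliency(model, x, target):
    """Aggregate an input-gradient map with per-layer bias-gradient maps (sketch).
    target is a LongTensor of class indices, one per sample."""
    model.eval()
    grads = {}

    def save_grad(name):
        def hook(module, grad_input, grad_output):
            # Gradient of the target score w.r.t. this layer's (biased) output.
            grads[name] = grad_output[0].detach()
        return hook

    handles, biased_layers = [], []
    for name, m in model.named_modules():
        if isinstance(m, (nn.Conv2d, nn.BatchNorm2d)) and m.bias is not None:
            handles.append(m.register_full_backward_hook(save_grad(name)))
            biased_layers.append((name, m))

    x = x.clone().requires_grad_(True)
    score = model(x).gather(1, target.view(-1, 1)).sum()
    input_grad, = torch.autograd.grad(score, x)
    for h in handles:
        h.remove()

    def psi(t):
        # Post-processing: absolute value, then rescale each map to [0, 1].
        t = t.abs()
        t = t - t.amin(dim=(1, 2, 3), keepdim=True)
        return t / (t.amax(dim=(1, 2, 3), keepdim=True) + 1e-8)

    H, W = x.shape[-2:]
    saliency = psi(input_grad * x.detach()).sum(dim=1, keepdim=True)
    for name, m in biased_layers:
        contrib = grads[name] * m.bias.detach().view(1, -1, 1, 1)
        contrib = psi(contrib).sum(dim=1, keepdim=True)
        saliency = saliency + F.interpolate(
            contrib, size=(H, W), mode="bilinear", align_corners=False
        )
    return saliency  # [B, 1, H, W]
```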
Fair Latency-Aware Metric for real-time video segmentation networks
As supervised semantic segmentation is reaching satisfying results, many
recent papers have focused on making segmentation network architectures faster,
smaller, and more efficient. In particular, studies often aim to reach the stage
at which they can claim to be "real-time". Achieving this goal is especially
relevant in the context of real-time video operations for autonomous vehicles
and robots, or medical imaging during surgery.
The common metric used for assessing these methods has so far been the same as
the one used for image segmentation without a time constraint: mean Intersection
over Union (mIoU). In this paper, we argue that this metric is not relevant
enough for real-time video, as it does not take into account the processing time
(latency) of the network. We propose a similar but more relevant metric for
video segmentation networks, called FLAME, which compares the output segmentation
of the network with the ground-truth segmentation of the video frame that is
current at the time the network finishes processing.
We perform experiments to compare a few networks using this metric and
propose a simple addition to network training to enhance results according to
that metric.
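A minimal sketch of a latency-aware evaluation in the spirit described above, assuming frame-indexed predictions and ground-truth masks, and a measured latency expressed in frames; the exact definition of FLAME may differ in its details.

```python
import numpy as np

def latency_aware_miou(preds, gts, latency_frames, num_classes):
    """preds[t] is computed from frame t but only becomes available
    latency_frames later; score it against the ground truth of that
    later frame, as a deployed system would have to."""
    inter = np.zeros(num_classes)
    union = np.zeros(num_classes)
    for t, pred in enumerate(preds):
        t_eval = t + latency_frames
        if t_eval >= len(gts):
            break
        gt = gts[t_eval]
        for c in range(num_classes):
            p, g = (pred == c), (gt == c)
            inter[c] += np.logical_and(p, g).sum()
            union[c] += np.logical_or(p, g).sum()
    valid = union > 0
    return float(np.mean(inter[valid] / union[valid]))
```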
SequeL: A Continual Learning Library in PyTorch and JAX
Continual Learning is an important and challenging problem in machine
learning, where models must adapt to a continuous stream of new data without
forgetting previously acquired knowledge. While existing frameworks are built
on PyTorch, the rising popularity of JAX might lead to divergent codebases,
ultimately hindering reproducibility and progress. To address this problem, we
introduce SequeL, a flexible and extensible library for Continual Learning that
supports both PyTorch and JAX frameworks. SequeL provides a unified interface
for a wide range of Continual Learning algorithms, including
regularization-based approaches, replay-based approaches, and hybrid
approaches. The library is designed for modularity and simplicity, making
the API suitable for both researchers and practitioners. We release
SequeL\footnote{\url{https://github.com/nik-dim/sequel}} as an open-source
library, enabling researchers and developers to easily experiment with and extend
the library for their own purposes.
Comment: 7 pages, 1 figure, 4 code listings
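The snippet below is not SequeL's API; it is only a minimal PyTorch sketch of the kind of regularization-based update (an EWC-style quadratic penalty) that such a library wraps behind a unified interface. The fisher and old_params dictionaries are assumed to have been estimated after the previous task.

```python
import torch

def regularized_loss(model, task_loss, old_params, fisher, lam=1.0):
    """Add an EWC-style quadratic penalty that discourages drifting away
    from parameters that were important for previously learned tasks."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return task_loss + lam * penalty
```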
Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching
End-to-end deep-learning networks have recently demonstrated extremely good
performance for stereo matching. However, existing networks are difficult to
use for practical applications since (1) they are memory-hungry and unable to
process even modest-size images, and (2) they have to be trained for a given
disparity range. The Practical Deep Stereo (PDS) network that we propose
addresses both issues: First, its architecture relies on novel bottleneck
modules that drastically reduce the memory footprint at inference, and
additional design choices allow it to handle larger images during training.
This results in a model that leverages large image context to resolve matching
ambiguities. Second, a novel sub-pixel cross-entropy loss combined with a MAP
estimator makes this network less sensitive to ambiguous matches, and applicable
to any disparity range without re-training. We compare PDS to state-of-the-art
methods published over the recent months, and demonstrate its superior
performance on the FlyingThings3D and KITTI data sets.
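As a hedged illustration of a sub-pixel cross-entropy over a discretized disparity distribution, the sketch below trains against a unimodal (Laplace-shaped) soft target centred on the possibly fractional ground-truth disparity. The temperature b and all names are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def subpixel_cross_entropy(logits, gt_disp, b=2.0):
    """logits: [B, D, H, W] scores over D integer disparity hypotheses.
    gt_disp: [B, H, W] ground-truth disparity, possibly fractional."""
    B, D, H, W = logits.shape
    disp_values = torch.arange(D, device=logits.device, dtype=gt_disp.dtype)
    # Unimodal (Laplace-shaped) soft target centred on the ground truth,
    # so sub-pixel disparities still produce a meaningful training signal.
    dist = (disp_values.view(1, D, 1, 1) - gt_disp.unsqueeze(1)).abs()
    target = F.softmax(-dist / b, dim=1)
    log_probs = F.log_softmax(logits, dim=1)
    return -(target * log_probs).sum(dim=1).mean()
```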
Taming GANs with Lookahead
Generative Adversarial Networks are notoriously challenging to train. The
underlying minimax optimization is highly susceptible to the variance of the
stochastic gradient and the rotational component of the associated game vector
field. We empirically demonstrate the effectiveness of the Lookahead
meta-optimization method for optimizing games, originally proposed for standard
minimization. The backtracking step of Lookahead naturally handles the
rotational game dynamics, which in turn enables the gradient ascent descent
method to converge on challenging toy games often analyzed in the literature.
Moreover, it implicitly handles high variance without using large mini-batches,
which are known to be essential for reaching state-of-the-art performance.
Experimental results on MNIST, SVHN, and CIFAR-10 demonstrate a clear advantage
of combining Lookahead with Adam or extragradient, in terms of performance,
memory footprint, and improved stability. Using 30-fold fewer parameters and
16-fold smaller minibatches, we outperform the reported performance of the
class-dependent BigGAN on CIFAR-10 by obtaining a lower FID \emph{without}
using the class labels, bringing state-of-the-art GAN training within reach of
common computational resources.
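A minimal sketch of applying Lookahead on top of an existing GAN optimizer: every k inner steps, the slow weights move toward the fast weights with rate alpha, and the fast weights restart from the slow ones. Names (G, D, k, alpha) are illustrative, and the inner updates (Adam or extragradient) are assumed to happen elsewhere in the loop.

```python
import torch

def make_slow_state(module):
    """Snapshot the current ('slow') weights of a network."""
    return {n: p.detach().clone() for n, p in module.named_parameters()}

@torch.no_grad()
def lookahead_sync(fast_module, slow_state, alpha=0.5):
    """slow <- slow + alpha * (fast - slow), then restart fast from slow."""
    for name, p in fast_module.named_parameters():
        slow = slow_state[name]
        slow.add_(alpha * (p.detach() - slow))
        p.copy_(slow)

# Sketch of usage inside a GAN training loop:
# slow_G, slow_D = make_slow_state(G), make_slow_state(D)
# for step, batch in enumerate(loader):
#     ...usual Adam / extragradient updates of D and G...
#     if (step + 1) % k == 0:
#         lookahead_sync(G, slow_G, alpha)
#         lookahead_sync(D, slow_D, alpha)
```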
Geometric calibration of Colour and Stereo Surface Imaging System of ESA's Trace Gas Orbiter
There are many geometric calibration methods for "standard" cameras. These
methods, however, cannot be used for the calibration of telescopes with large
focal lengths and complex off-axis optics. Moreover, specialized calibration
methods for such telescopes are scarce in the literature. We describe the
calibration method that we developed for the Colour and Stereo Surface Imaging
System (CaSSIS) telescope, on board the ExoMars Trace Gas Orbiter (TGO).
Although our method is described in the context of CaSSIS, with camera-specific
experiments, it is general and can be applied to other telescopes. We further
encourage re-use of the proposed method by making our calibration code and data
available online.
Comment: Submitted to Advances in Space Research