Information flows of diverse autoencoders
The outstanding performance of deep learning in various fields raises a fundamental question, one that can potentially be examined using information theory, which interprets the learning process as the transmission and compression of information. Information plane analyses of the mutual information between the input, hidden, and output layers have demonstrated two distinct learning phases: fitting and compression. It is debatable whether the compression phase is necessary to generalize the input-output relations extracted from training data. In this study, we investigated this question through experiments with various species of autoencoders and evaluated their information processing phases with an accurate kernel-based estimator of mutual information. Given sufficient training data, vanilla autoencoders demonstrated the compression phase, which was amplified after imposing sparsity regularization on hidden activities. However, we found that the compression phase is not universally observed across different species of autoencoders, including variational autoencoders, which have special constraints on the network weights or on the manifold of the hidden space. These types of autoencoders exhibited perfect generalization to test data without requiring the compression phase. Thus, we conclude that the compression phase is not necessary for generalization in representation learning.
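As a rough illustration of how such an information plane coordinate can be computed, the sketch below estimates I(X; T) from a layer's activations with a pairwise-distance kernel estimator in the style of Kolchinsky and Tracey. The Gaussian noise variance `noise_var` is an assumed parameter of this sketch, not a value from the paper, and the paper's exact estimator may differ.

```python
import numpy as np
from scipy.special import logsumexp

def kernel_mi_estimate(acts, noise_var=0.1):
    """Pairwise-distance (kernel) estimate of I(X; T) in nats.

    Models the hidden activity T as f(X) plus isotropic Gaussian noise
    with variance `noise_var` (an assumption of this sketch), so the
    marginal of T is a Gaussian mixture whose entropy admits the
    pairwise-distance bound:
        I(X; T) ~= -(1/n) sum_i log[(1/n) sum_j exp(-||t_i - t_j||^2 / (2 s^2))]
    """
    n = acts.shape[0]
    sq = np.sum(acts ** 2, axis=1)
    # squared Euclidean distances between all pairs of activity vectors
    d2 = sq[:, None] + sq[None, :] - 2.0 * acts @ acts.T
    log_kernel = -np.maximum(d2, 0.0) / (2.0 * noise_var)
    return float(-np.mean(logsumexp(log_kernel, axis=1) - np.log(n)))

# Example: track a layer's information plane trajectory by calling
# kernel_mi_estimate(activations) on held-out data after each epoch.
```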
Reduced Order Modeling for Parameterized Time-Dependent PDEs using Spatially and Memory Aware Deep Learning
We present a novel reduced order model (ROM) approach for parameterized time-dependent PDEs based on modern deep learning. The ROM is nonintrusive and suitable for multi-query problems. It is divided into two distinct stages: a nonlinear dimensionality reduction stage that handles the spatially distributed degrees of freedom using convolutional autoencoders, and a parameterized time-stepping stage based on memory-aware neural networks (NNs), specifically causal convolutional and long short-term memory NNs. Strategies to ensure generalization and stability are discussed. The methodology is tested on the heat equation, the advection equation, and the incompressible Navier-Stokes equations to show the variety of problems the ROM can handle.
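A minimal PyTorch sketch of the two-stage design described above follows, under illustrative assumptions: a 1-D field on 64 grid points, a 16-dimensional latent space, a 2-dimensional parameter vector, and only the LSTM variant of the memory-aware time stepper (the paper also uses causal convolutional NNs). It is not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Stage 1: nonlinear reduction of the spatial degrees of freedom."""
    def __init__(self, latent_dim=16):
        super().__init__()
        # assumes a 1-D field sampled on 64 grid points (illustrative)
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, 5, stride=2, padding=2), nn.GELU(),   # 64 -> 32
            nn.Conv1d(16, 32, 5, stride=2, padding=2), nn.GELU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 16), nn.GELU(),
            nn.Unflatten(1, (32, 16)),
            nn.ConvTranspose1d(32, 16, 4, stride=2, padding=1), nn.GELU(),  # 16 -> 32
            nn.ConvTranspose1d(16, 1, 4, stride=2, padding=1),              # 32 -> 64
        )

    def forward(self, u):                 # u: (batch, 1, 64)
        return self.decoder(self.encoder(u))

class LatentTimeStepper(nn.Module):
    """Stage 2: parameterized, memory-aware time stepping in latent space."""
    def __init__(self, latent_dim=16, param_dim=2, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim + param_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, latent_dim)

    def forward(self, z_hist, mu):
        # z_hist: (batch, steps, latent_dim); mu: (batch, param_dim)
        mu_seq = mu[:, None, :].expand(-1, z_hist.shape[1], -1)
        h, _ = self.lstm(torch.cat([z_hist, mu_seq], dim=-1))
        return self.head(h[:, -1])        # predicted next latent state
```

At prediction time, a new parameter value is fed to the time stepper to advance the latent state, and the decoder maps each latent state back to the full spatial field, which keeps the online stage nonintrusive.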
On Information Plane Analyses of Neural Network Classifiers -- A Review
We review the current literature concerned with information plane analyses of
neural network classifiers. While the underlying information bottleneck theory
and the claim that information-theoretic compression is causally linked to
generalization are plausible, the empirical evidence is mixed, with findings both supporting and conflicting with the theory. We review this evidence together with a detailed
analysis of how the respective information quantities were estimated. Our
survey suggests that compression visualized in information planes is not
necessarily information-theoretic, but is rather often compatible with
geometric compression of the latent representations. This insight gives the
information plane a renewed justification.
Aside from this, we shed light on the problem of estimating mutual
information in deterministic neural networks and its consequences.
Specifically, we argue that even in feed-forward neural networks the data
processing inequality need not hold for estimates of mutual information.
Similarly, while a fitting phase, in which the mutual information between the latent representation and the target increases, is necessary (but not sufficient) for good classification performance, such a fitting phase need not be visible in the information plane, depending on the specifics of mutual information estimation.
Comment: 12 pages, 3 figures; accepted for publication in IEEE Transactions on Neural Networks and Learning Systems. (c) 2021 IEEE
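To make the estimator-dependence argument concrete, the following sketch (an illustration inspired by the review, not code from it) shows how a histogram estimate of mutual information in a deterministic network is governed largely by the chosen bin count: coarse bins suggest "compression", while fine bins recover nearly all of H(X).

```python
import numpy as np
from collections import Counter

def entropy_bits(symbols):
    """Plug-in entropy (bits) of a list of hashable symbols."""
    counts = np.array(list(Counter(symbols).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def binned_mi(t, labels, n_bins):
    """Histogram estimate of I(T; Y) via H(T) + H(Y) - H(T, Y)."""
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    ids = np.clip(np.digitize(t, edges) - 1, 0, n_bins - 1)
    t_sym = [tuple(row) for row in ids]   # each activation -> discrete cell
    joint = list(zip(t_sym, labels))
    return entropy_bits(t_sym) + entropy_bits(labels) - entropy_bits(joint)

# Deterministic "layer": every input produces a distinct activation,
# so the true I(X; T) equals H(X) = log2(1000) ~= 9.97 bits.
rng = np.random.default_rng(0)
acts = np.tanh(rng.standard_normal((1000, 2)))
x_ids = list(range(len(acts)))            # identify X with the sample index
for n_bins in (4, 32, 256):
    print(n_bins, binned_mi(acts, x_ids, n_bins))
# Coarse bins show apparent "compression"; fine bins approach H(X),
# so the information plane trajectory depends on the binning choice.
```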