
    Information flows of diverse autoencoders

    The outstanding performance of deep learning in various fields raises a fundamental question, which can potentially be examined using information theory, interpreting the learning process as the transmission and compression of information. Information plane analyses of the mutual information between the input, hidden, and output layers have demonstrated two distinct learning phases: fitting and compression. It is debatable whether the compression phase is necessary to generalize the input-output relations extracted from training data. In this study, we investigated this question through experiments with various species of autoencoders, evaluating their information processing phases with an accurate kernel-based estimator of mutual information. Given sufficient training data, vanilla autoencoders demonstrated the compression phase, which was amplified after imposing sparsity regularization on hidden activities. However, we found that the compression phase is not universally observed across different species of autoencoders, including variational autoencoders, which have special constraints on the network weights or on the manifold of the hidden space. These types of autoencoders exhibited perfect generalization to test data without requiring the compression phase. Thus, we conclude that the compression phase is not necessary for generalization in representation learning.
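    As a hedged illustration of the kind of analysis described above (not the authors' exact estimator or architectures), the sketch below trains a toy PyTorch autoencoder and tracks the information-plane coordinate I(X;T) with a simple Gaussian-kernel (KDE-style) mutual information estimate; the layer sizes, bandwidth, and toy data are illustrative assumptions.

# Minimal sketch (illustrative, not the paper's estimator): track I(X;T) for a
# small autoencoder with a Gaussian-kernel (KDE-style) mutual information estimate.
import numpy as np
import torch
import torch.nn as nn

def kde_mutual_information(x, t, sigma=0.5):
    """Rough KDE estimate of I(x;t) in nats; normalization constants cancel in the sum."""
    def entropy(z):
        d2 = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
        k = np.exp(-d2 / (2 * sigma ** 2))
        return -np.mean(np.log(np.mean(k, axis=1)))
    xt = np.concatenate([x, t], axis=1)
    return entropy(x) + entropy(t) - entropy(xt)

class VanillaAE(nn.Module):
    def __init__(self, d_in=20, d_hid=5):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, d_hid), nn.Tanh())
        self.dec = nn.Linear(d_hid, d_in)
    def forward(self, x):
        h = self.enc(x)
        return self.dec(h), h

x = torch.randn(256, 20)            # toy data; real runs would use a dataset
model = VanillaAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(50):
    recon, h = model(x)
    loss = ((recon - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    if epoch % 10 == 0:             # one information-plane coordinate, I(X;T)
        mi = kde_mutual_information(x.numpy(), h.detach().numpy())
        print(f"epoch {epoch:3d}  loss {loss.item():.4f}  I(X;T) ~ {mi:.3f}")

    Repeating such measurements at every epoch, for several autoencoder variants, is what produces the fitting and compression trajectories the abstract refers to.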

    Reduced Order Modeling for Parameterized Time-Dependent PDEs using Spatially and Memory Aware Deep Learning

    We present a novel reduced order model (ROM) approach for parameterized time-dependent PDEs based on modern deep learning. The ROM is nonintrusive and suitable for multi-query problems. It is divided into two distinct stages: a nonlinear dimensionality reduction stage that handles the spatially distributed degrees of freedom using convolutional autoencoders, and a parameterized time-stepping stage based on memory-aware neural networks (NNs), specifically causal convolutional and long short-term memory (LSTM) NNs. Strategies to ensure generalization and stability are discussed. The methodology is tested on the heat equation, the advection equation, and the incompressible Navier-Stokes equations to show the variety of problems the ROM can handle.
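    The following hedged sketch, assuming PyTorch and illustrative layer sizes (not the authors' implementation), shows the two-stage structure described above: a convolutional autoencoder compresses the spatial degrees of freedom, and an LSTM (one of the memory-aware options mentioned; the causal-convolution variant is analogous) advances the latent state in time conditioned on the PDE parameters.

# Hedged sketch of the two-stage ROM idea; names and sizes are illustrative.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Stage 1: nonlinear dimensionality reduction of a 1D spatial field."""
    def __init__(self, n_x=64, d_latent=8):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv1d(1, 8, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv1d(8, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * (n_x // 4), d_latent))
        self.dec = nn.Sequential(
            nn.Linear(d_latent, 16 * (n_x // 4)), nn.ReLU(),
            nn.Unflatten(1, (16, n_x // 4)),
            nn.ConvTranspose1d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 1, 4, stride=2, padding=1))
    def forward(self, u):                     # u: (batch, 1, n_x)
        z = self.enc(u)
        return self.dec(z), z

class LatentStepper(nn.Module):
    """Stage 2: memory-aware time stepping in latent space, conditioned on PDE parameters."""
    def __init__(self, d_latent=8, d_param=2, d_hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(d_latent + d_param, d_hidden, batch_first=True)
        self.head = nn.Linear(d_hidden, d_latent)
    def forward(self, z_seq, params):         # z_seq: (batch, T, d_latent), params: (batch, d_param)
        p = params.unsqueeze(1).expand(-1, z_seq.shape[1], -1)
        out, _ = self.lstm(torch.cat([z_seq, p], dim=-1))
        return self.head(out)                 # predicted latent states for the next time steps

# Shape check on random data; training on PDE snapshots would replace this.
ae, stepper = ConvAE(), LatentStepper()
u = torch.randn(4, 1, 64)
u_hat, z = ae(u)
z_next = stepper(z.unsqueeze(1).repeat(1, 10, 1), torch.randn(4, 2))
print(u_hat.shape, z.shape, z_next.shape)     # torch.Size([4, 1, 64]) torch.Size([4, 8]) torch.Size([4, 10, 8])

    In this split, the autoencoder is trained once on spatial snapshots, and the time stepper is trained on the resulting latent trajectories, which is what makes the approach nonintrusive.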

    On Information Plane Analyses of Neural Network Classifiers -- A Review

    We review the current literature concerned with information plane analyses of neural network classifiers. While the underlying information bottleneck theory and the claim that information-theoretic compression is causally linked to generalization are plausible, the empirical evidence is both supporting and conflicting. We review this evidence together with a detailed analysis of how the respective information quantities were estimated. Our survey suggests that compression visualized in information planes is not necessarily information-theoretic, but is often compatible with geometric compression of the latent representations. This insight gives the information plane a renewed justification. Aside from this, we shed light on the problem of estimating mutual information in deterministic neural networks and its consequences. Specifically, we argue that even in feed-forward neural networks the data processing inequality need not hold for estimates of mutual information. Similarly, while a fitting phase, in which the mutual information between the latent representation and the target increases, is necessary (but not sufficient) for good classification performance, such a fitting phase need not be visible in the information plane, depending on the specifics of mutual information estimation.
    Comment: 12 pages, 3 figures; accepted for publication in IEEE Transactions on Neural Networks and Learning Systems. (c) 2021 IEEE
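    Because the review's argument turns on how mutual information is estimated, the hedged sketch below shows one common binning-based recipe for information-plane coordinates; the toy data, bin counts, and helper names are illustrative assumptions, and changing the bin count alone changes the estimated I(X;T).

# Hedged sketch of binning-based information-plane estimation (illustrative only).
import numpy as np

def discrete_mi(a, b):
    """Mutual information (nats) between two arrays of non-negative integer labels."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for ai, bi in zip(a, b):
        joint[ai, bi] += 1
    joint /= joint.sum()
    pa, pb = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))

def information_plane_point(x_ids, t_activations, y_labels, n_bins=30):
    """Bin hidden activations, then estimate the pair (I(X;T), I(T;Y))."""
    edges = np.linspace(t_activations.min(), t_activations.max(), n_bins + 1)
    binned = np.digitize(t_activations, edges[1:-1])
    # Collapse each sample's binned activation vector to a single discrete id.
    t_ids = np.unique(binned, axis=0, return_inverse=True)[1].reshape(-1)
    return discrete_mi(x_ids, t_ids), discrete_mi(t_ids, y_labels)

# Toy usage: every input is unique (x_ids enumerate samples), labels are binary.
rng = np.random.default_rng(0)
t = rng.normal(size=(1000, 4))                # stand-in for hidden-layer activations
y = (t[:, 0] > 0).astype(int)
x_ids = np.arange(1000)
print(information_plane_point(x_ids, t, y, n_bins=30))
print(information_plane_point(x_ids, t, y, n_bins=4))   # coarser bins -> smaller I(X;T)

    With unique inputs and a deterministic map to binned activations, the I(X;T) estimate reduces to the entropy of the discretized representation, which is one way to see why such estimates can reflect geometric rather than strictly information-theoretic compression.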