169 research outputs found

    Self Normalizing Flows

    Efficient gradient computation of the Jacobian determinant term is a core problem in many machine learning settings, and especially so in the normalizing flow framework. Most proposed flow models therefore either restrict to a function class with easy evaluation of the Jacobian determinant, or use an efficient estimator thereof. However, these restrictions limit the performance of such density models, frequently requiring significant depth to reach desired performance levels. In this work, we propose Self Normalizing Flows, a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer. This reduces the computational complexity of each layer's exact update from O(D^3) to O(D^2), allowing for the training of flow architectures which were otherwise computationally infeasible, while also providing efficient sampling. We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts, while training more quickly and surpassing the performance of functionally constrained counterparts.
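    A rough sketch of the core idea (the names, shapes, and reconstruction penalty below are illustrative simplifications, not the paper's implementation): for a linear flow layer z = Wx, the gradient of log|det W| is the inverse-transpose of W, which costs O(D^3) to form; a learned approximate inverse R ≈ W^{-1} can supply its transpose in place of that term at O(D^2) per update.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # forward weights
R = np.linalg.inv(W).copy()                        # learned approximate inverse (here initialized exactly)

x = rng.standard_normal(D)

# Exact gradient of log|det W| w.r.t. W is inv(W).T -- an O(D^3) operation.
exact_term = np.linalg.inv(W).T

# Self-normalizing approximation: substitute the learned inverse's transpose,
# avoiding any explicit matrix inversion in the update itself.
approx_term = R.T

# R is kept close to W^{-1} during training by a reconstruction penalty, e.g.:
recon_loss = np.sum((R @ (W @ x) - x) ** 2)
```

    In training, R drifts from the exact inverse and the reconstruction penalty pulls it back, trading a small bias in the gradient for the large reduction in per-layer cost.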

    Scaling up learning with GAIT-prop

    Backpropagation of error (BP) is a widely used and highly successful learning algorithm. However, its reliance on non-local information in propagating error gradients makes it seem an unlikely candidate for learning in the brain. In the last decade, a number of investigations have focused on determining whether alternative, more biologically plausible computations can be used to approximate BP. This work builds on one such local learning algorithm - Gradient Adjusted Incremental Target Propagation (GAIT-prop) - which has recently been shown to approximate BP in a manner which appears biologically plausible. This method constructs local, layer-wise weight update targets in order to enable plausible credit assignment. However, in deep networks, the local weight updates computed by GAIT-prop can deviate from BP for a number of reasons. Here, we provide and test methods to overcome such sources of error. In particular, we adaptively rescale the locally computed errors and show that this significantly increases the performance and stability of the GAIT-prop algorithm when applied to the CIFAR-10 dataset.
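    A minimal sketch of the rescaling idea (the function and the choice of a single shared reference norm are illustrative simplifications, not the exact GAIT-prop procedure): locally computed layer errors can shrink or blow up with depth, so each one is rescaled to a common magnitude before being used in a weight update.

```python
import numpy as np

def rescale_local_errors(local_errors, target_norm):
    """Rescale each layer's locally computed error vector so its norm
    matches a shared reference norm (illustrative stand-in for
    adaptive error rescaling)."""
    out = []
    for e in local_errors:
        n = np.linalg.norm(e)
        out.append(e * (target_norm / n) if n > 0 else e)
    return out

# Toy layer-wise errors whose magnitudes have drifted with depth.
errors = [np.array([0.5, -0.5]), np.array([0.01, 0.02]), np.array([3.0, 4.0])]
rescaled = rescale_local_errors(errors, target_norm=1.0)
```

    The rescaling changes only the step size per layer, not the update direction, which is why it can stabilize training without altering what each layer learns.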

    Minimizing Control for Credit Assignment with Strong Feedback

    The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
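    The control-minimization view can be caricatured in the simplest possible setting (a deliberately drastic simplification: a single linear layer and a controller that injects exactly the feedback needed to reach the label; all names are illustrative, not DFC's actual dynamics): the weight update shrinks the feedback the controller must supply, so "no feedback needed" coincides with "output matches the label".

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((1, 2)) * 0.1
x = np.array([1.0, -1.0])
y_target = np.array([0.5])

lr = 0.1
for _ in range(200):
    y_free = W @ x                 # network output without feedback
    u = y_target - y_free          # feedback the controller must inject
    # Weight update reduces the required feedback (control-minimization view).
    W += lr * np.outer(u, x)

# After training, almost no feedback is needed to reach the label.
residual = float(np.abs(y_target - W @ x))
```

    In this linear caricature the rule reduces to the delta rule; the point is only the reinterpretation: the quantity being driven to zero is the controller's feedback signal, not an explicit output loss.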

    What's in a Prior? Learned Proximal Networks for Inverse Problems

    Proximal operators are ubiquitous in inverse problems, commonly appearing as part of algorithmic strategies to regularize problems that are otherwise ill-posed. Modern deep learning models have been brought to bear for these tasks too, as in the framework of plug-and-play or deep unrolling, where they loosely resemble proximal operators. Yet, something essential is lost in employing these purely data-driven approaches: there is no guarantee that a general deep network represents the proximal operator of any function, nor is there any characterization of the function for which the network might provide some approximate proximal. This not only makes guaranteeing convergence of iterative schemes challenging but, more fundamentally, complicates the analysis of what has been learned by these networks about their training data. Herein we provide a framework to develop learned proximal networks (LPN), prove that they provide exact proximal operators for a data-driven nonconvex regularizer, and show how a new training strategy, dubbed proximal matching, provably promotes the recovery of the log-prior of the true data distribution. Such LPNs provide general, unsupervised, expressive proximal operators that can be used for general inverse problems with convergence guarantees. We illustrate our results in a series of cases of increasing complexity, demonstrating that these models not only result in state-of-the-art performance, but provide a window into the resulting priors learned from data.
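    For intuition about what an "exact proximal operator" means, here is the classical closed-form pair that learned-proximal frameworks generalize (a textbook example, not the paper's learned operator): soft-thresholding is the exact proximal operator of the scaled ℓ1 norm.

```python
import numpy as np

def prox_l1(v, lam):
    """Exact proximal operator of lam*||x||_1 (soft-thresholding):
    argmin_x 0.5*||x - v||^2 + lam*||x||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def prox_objective(x, v, lam):
    # The objective that any proximal operator minimizes by definition.
    return 0.5 * np.sum((x - v) ** 2) + lam * np.sum(np.abs(x))

v = np.array([1.5, -0.2, 0.7])
x_star = prox_l1(v, lam=0.5)
```

    The LPN question is the converse: given a trained network, is it the proximal operator of *some* regularizer, and if so, which one? For soft-thresholding the answer is known in closed form; LPN constructs networks for which the answer is guaranteed by design.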

    System Security Metrics via Power Simulation for VLSI Designs

    Power side-channel attacks are a growing concern as they allow attackers to extract sensitive information from digital systems with low-cost equipment and minimal knowledge about a device’s inner functions. Though countermeasures are available to ASIC designers, these do not completely guarantee side-channel security, and therefore must be validated in the lab post-fabrication. The goal of this project is to verify the efficacy of the simulation tools PSCARE & GLIFT in performing simulated power side-channel attacks on such designs. Verification will be done by comparing simulations of the Advanced Encryption Standard against corresponding measurements of physical implementations on a SASEBO board. Successful verification will allow for simulation of power side-channel information leakage at design time.
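    A toy correlation power analysis on simulated Hamming-weight leakage illustrates the kind of attack such design-time simulation enables (the leakage model, noise level, and single-byte XOR target are illustrative; real attacks target AES intermediates such as S-box outputs):

```python
import numpy as np

rng = np.random.default_rng(2)
HW = np.array([bin(i).count("1") for i in range(256)])  # Hamming-weight table

true_key = 0x5A
plaintexts = rng.integers(0, 256, size=2000)

# Simulated power traces: Hamming-weight leakage of (plaintext XOR key)
# plus Gaussian noise, standing in for simulated or measured power.
traces = HW[plaintexts ^ true_key] + 0.5 * rng.standard_normal(plaintexts.size)

# Correlation power analysis: the key guess whose predicted leakage
# correlates best with the traces reveals the key byte.
corrs = [np.corrcoef(HW[plaintexts ^ g], traces)[0, 1] for g in range(256)]
recovered = int(np.argmax(corrs))
```

    The same statistical test applies whether the traces come from a SASEBO board or from a power simulator, which is precisely why simulated attacks can stand in for lab validation if their traces are faithful.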

    Laser diffraction particle sizing: sampling and inversion

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, August 13, 1987. The inverse problem of obtaining particle size distributions from observations of the angular distribution of near-forward scattered light is reexamined. Asymptotic analysis of the forward problem reveals the information content of the observations, and the sources of non-uniqueness and instability in inverting them. A sampling criterion such that the observations uniquely specify the size distribution is derived, in terms of the largest particle size and an angle above which the intensity is indistinguishable from an asymptote. The instability of inverting unevenly spaced data is compared to that of super-resolving Fourier spectra. Resolution is shown to be inversely proportional to the angular range of observations. The problem is rephrased so that the size-weighted number density is sought from the intensity weighted by the scattering angle cubed. Algorithms which impose positivity and bounds on particle size improve the stability of inversions. The forward problem can be represented by an over-determined matrix equation by choosing a large integration increment in size, dependent on the frequency content of the angular intensity, further improving stability. Experimental data obtained using a linear CCD array illustrate the theory, with standard polystyrene spheres as scatterers. The scattering from single and tri-modal distributions is successfully inverted. I was supported by a NASA Technology Transfer Traineeship grant (NGT-014-800, Supplement 5), and by the Joint Program in Oceanographic Engineering between the Woods Hole Oceanographic Institution and the Massachusetts Institute of Technology. The experimental work was funded by the Coastal Research Laboratory of the Woods Hole Oceanographic Institution, through the generosity of the Mellon Foundation.
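    The positivity constraint can be illustrated with a non-negative least-squares inversion of a toy forward model (the sinc-squared kernel and all dimensions below are crude illustrative stand-ins, not the actual Fraunhofer/Mie scattering kernel): the observed angular intensity is a kernel matrix acting on the size distribution, and constraining the estimate to be non-negative stabilizes the otherwise ill-posed inversion.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(3)
angles = np.linspace(0.01, 0.2, 40)   # scattering angles (rad)
sizes = np.linspace(1.0, 50.0, 20)    # particle radii (um)

# Crude Fraunhofer-like kernel: (sin(x)/x)^2 falloff, scattering power ~ size^2.
K = np.sinc(np.outer(angles, sizes) / np.pi) ** 2 * sizes**2

n_true = np.zeros(sizes.size)
n_true[5] = 1.0                        # a single-mode size distribution
intensity = K @ n_true                 # noiseless forward problem

# Positivity-constrained inversion (non-negative least squares).
n_est, resid = nnls(K, intensity)
```

    An unconstrained pseudo-inverse of the same system can return oscillating, partly negative size distributions; the non-negativity constraint rules those out, which is one of the stabilizing mechanisms the abstract describes.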

    State-Space Approaches to Ultra-Wideband Doppler Processing

    National security needs dictate the development of new radar systems capable of identifying and tracking exoatmospheric threats to aid our defense. These new radar systems feature reduced noise floors, electronic beam steering, and ultra-wide bandwidths, all of which facilitate threat discrimination. However, in order to identify missile attributes such as RF reflectivity, distance, and velocity, many existing processing algorithms rely upon narrow-bandwidth assumptions that break down with increased signal bandwidth. We present a fresh investigation into these algorithms for removing bandwidth limitations and propose novel state-space and direct-data factoring formulations, including:
    * the multidimensional extension of the Eigensystem Realization Algorithm,
    * employing state-space models in place of interpolation to obtain a form which admits a separation and isolation of solution components,
    * side-stepping the joint diagonalization of state transition matrices, which commonly plagues methods like multidimensional ESPRIT.
    We then benchmark our approaches and relate the outcomes to the Cramér-Rao bound for the case of one and two adjacent reflectors to validate their conceptual design and identify those techniques that compare favorably to or improve upon existing practices.
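    A one-dimensional shift-invariance (ESPRIT-flavored) estimate conveys the flavor of these state-space methods, here recovering two noiseless "reflector" frequencies from a Hankel data matrix (dimensions and parameters are illustrative; this is the 1-D caricature, not the multidimensional formulations proposed above):

```python
import numpy as np

freqs_true = np.array([0.12, 0.20])   # two mode frequencies (cycles/sample)
n = np.arange(64)
y = sum(np.exp(2j * np.pi * f * n) for f in freqs_true)

# Hankel data matrix; its column space is spanned by the two signal modes.
L = 8
H = np.array([y[i:i + L] for i in range(len(y) - L)]).T   # L x (N-L)

U, s, Vh = np.linalg.svd(H, full_matrices=False)
Us = U[:, :2]                          # signal subspace (rank 2)

# Shift invariance: Us[1:] = Us[:-1] @ Phi; eigenvalues of Phi are the poles.
Phi = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)[0]
poles = np.linalg.eigvals(Phi)
freqs_est = np.sort(np.angle(poles) / (2 * np.pi))
```

    The joint-diagonalization difficulty the abstract mentions arises when this trick is extended to several dimensions at once, since each dimension yields its own state transition matrix and their eigen-structures must be paired consistently.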