Self Normalizing Flows
Efficient gradient computation of the Jacobian determinant term is a core
problem in many machine learning settings, and especially so in the normalizing
flow framework. Most proposed flow models therefore either restrict to a
function class with easy evaluation of the Jacobian determinant, or to an
efficient estimator thereof. However, these restrictions limit the performance
of such density models, frequently requiring significant depth to reach desired
performance levels. In this work, we propose Self Normalizing Flows, a flexible
framework for training normalizing flows by replacing expensive terms in the
gradient by learned approximate inverses at each layer. This reduces the
computational complexity of each layer's exact update from O(D^3)
to O(D^2), where D is the data dimensionality, allowing for the training of flow architectures which
were otherwise computationally infeasible, while also providing efficient
sampling. We show experimentally that such models are remarkably stable and
optimize to similar data likelihood values as their exact gradient
counterparts, while training more quickly and surpassing the performance of
functionally constrained counterparts.
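The core trick can be sketched for a single linear flow layer: the expensive term in the exact gradient of the log-likelihood is inv(W).T (from the log-determinant), and it is replaced by the transpose of a learned approximate inverse trained with a local reconstruction loss. A minimal NumPy illustration, in which the "learned" inverse is faked by perturbing the true one (a simplification, not the paper's full model):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4

# A single linear flow layer z = W x.  The exact gradient of log|det W| with
# respect to W is inv(W).T, which costs O(D^3) to form.
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))

# Self Normalizing Flows maintain a learned approximate inverse R ~ inv(W);
# here we fake the learning by perturbing the true inverse.
R = np.linalg.inv(W) + 0.01 * rng.standard_normal((D, D))

exact_term = np.linalg.inv(W).T   # O(D^3): inversion required
approx_term = R.T                 # O(D^2): no inverse, no determinant

# The local reconstruction loss that would train R in practice:
x = rng.standard_normal(D)
recon_loss = float(np.sum((R @ (W @ x) - x) ** 2))

print(np.max(np.abs(exact_term - approx_term)))  # small when R ~ inv(W)
```

The point of the substitution is that both terms agree up to the quality of the learned inverse, so gradient updates stay cheap while remaining close to the exact ones.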
Scaling up learning with GAIT-prop
Backpropagation of error (BP) is a widely used and highly successful learning
algorithm. However, its reliance on non-local information in propagating error
gradients makes it seem an unlikely candidate for learning in the brain. In the
last decade, a number of investigations have been carried out focused upon
determining whether alternative, more biologically plausible computations can be
used to approximate BP. This work builds on one such local learning algorithm,
Gradient Adjusted Incremental Target Propagation (GAIT-prop), which has
recently been shown to approximate BP in a manner which appears biologically
plausible. This method constructs local, layer-wise weight update targets in
order to enable plausible credit assignment. However, in deep networks, the
local weight updates computed by GAIT-prop can deviate from BP for a number of
reasons. Here, we provide and test methods to overcome such sources of error.
In particular, we adaptively rescale the locally-computed errors and show that
this significantly increases the performance and stability of the GAIT-prop
algorithm when applied to the CIFAR-10 dataset.
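A rough NumPy sketch of the idea: layer-wise targets are formed from the output error, each weight update uses only that layer's local quantities, and the locally computed error is adaptively rescaled so its magnitude tracks the output error. The target construction and rescaling rule below are illustrative simplifications, not the exact GAIT-prop equations:

```python
import numpy as np

rng = np.random.default_rng(1)

def layer(W, h):
    return np.tanh(W @ h)

# Tiny two-layer network (sizes chosen arbitrarily for illustration).
W1 = 0.5 * rng.standard_normal((3, 4))
W2 = 0.5 * rng.standard_normal((2, 3))

x = rng.standard_normal(4)
y = np.array([0.5, -0.5])            # supervised target

h1 = layer(W1, x)
h2 = layer(W2, h1)

# Layer-wise errors: the output error, and a hidden error obtained by
# propagating it back (here simply through the transpose).
e2 = y - h2
e1 = W2.T @ e2

# Adaptive rescaling: normalize the locally computed error so its magnitude
# matches the output error -- the stabilizing fix the abstract describes.
e1 = e1 * np.linalg.norm(e2) / (np.linalg.norm(e1) + 1e-8)

# Purely local updates: each layer uses only its own error and its own input.
lr = 0.1
W2 = W2 + lr * np.outer(e2, h1)
W1 = W1 + lr * np.outer(e1, x)
```

In deep networks the unscaled local errors can shrink or blow up layer by layer; pinning their magnitude to the output error is what keeps the updates aligned with BP.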
Minimizing Control for Credit Assignment with Strong Feedback
The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing performance competitive with backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
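The "learning as control minimization" view can be caricatured in a few lines: a controller injects strong feedback that drives the network output to the label, and the weight update is proportional to the injected feedback times the presynaptic activity, so over training the required feedback shrinks toward zero. The gains, dynamics, and single-layer linear setup here are illustrative stand-ins, not the actual DFC dynamics:

```python
import numpy as np

# Toy linear network y = W x.  An integral controller adds feedback u until
# the controlled output reaches the label; the weight update then absorbs u
# into W, so less feedback is needed on the next presentation.
W = np.zeros((2, 3))
x = np.array([1.0, 0.5, -0.5])
y_target = np.array([1.0, -1.0])

k_i = 0.5    # controller gain (illustrative)
lr = 0.005   # learning rate (illustrative)
u = np.zeros(2)

for _ in range(1000):
    y = W @ x + u                 # strong feedback directly shifts activity
    u += k_i * (y_target - y)     # controller integrates the output error
    W += lr * np.outer(u, x)      # local rule: weights absorb the feedback

# Learning has minimized the control signal: the feedforward pass alone now
# (nearly) produces the target, and the required feedback u has shrunk to ~0.
print(np.linalg.norm(u), np.linalg.norm(y_target - W @ x))
```

Note that the loss is never differentiated here: the weights simply chase configurations that make the controller's job unnecessary, which is the reframing the abstract describes.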
What's in a Prior? Learned Proximal Networks for Inverse Problems
Proximal operators are ubiquitous in inverse problems, commonly appearing as
part of algorithmic strategies to regularize problems that are otherwise
ill-posed. Modern deep learning models have been brought to bear for these
tasks too, as in the framework of plug-and-play or deep unrolling, where they
loosely resemble proximal operators. Yet, something essential is lost in
employing these purely data-driven approaches: there is no guarantee that a
general deep network represents the proximal operator of any function, nor is
there any characterization of the function for which the network might provide
some approximate proximal. This not only makes guaranteeing convergence of
iterative schemes challenging but, more fundamentally, complicates the analysis
of what has been learned by these networks about their training data. Herein we
provide a framework to develop learned proximal networks (LPN), prove that they
provide exact proximal operators for a data-driven nonconvex regularizer, and
show how a new training strategy, dubbed proximal matching, provably promotes
the recovery of the log-prior of the true data distribution. Such LPNs provide
general, unsupervised, expressive proximal operators that can be used for
general inverse problems with convergence guarantees. We illustrate our results
in a series of cases of increasing complexity, demonstrating that these models
not only result in state-of-the-art performance, but provide a window into the
resulting priors learned from data.
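As background on the object LPN are built to represent: the proximal operator of a function f is prox_f(y) = argmin_x 0.5*||x - y||^2 + f(x). A classical closed-form example is soft-thresholding, the exact proximal operator of the scaled l1 norm, sketched here in NumPy with a numerical optimality check (this is illustration of the definition, not the LPN architecture):

```python
import numpy as np

# prox_f(y) = argmin_x  0.5 * ||x - y||^2 + f(x).
# For f(x) = lam * ||x||_1 the minimizer has the closed form below.

def soft_threshold(y, lam):
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

lam = 0.3
y = np.array([1.0, -0.1, 0.5, -2.0])
x_star = soft_threshold(y, lam)

# The prox objective is strictly convex, so x_star is its unique minimizer;
# verify numerically that nearby points never do better.
def prox_objective(x):
    return 0.5 * np.sum((x - y) ** 2) + lam * np.sum(np.abs(x))

rng = np.random.default_rng(0)
for _ in range(100):
    x_perturbed = x_star + 0.01 * rng.standard_normal(4)
    assert prox_objective(x_star) <= prox_objective(x_perturbed)
```

The paper's point is the converse direction: a generic denoising network need not be the prox of *any* function, whereas an LPN is guaranteed to be the exact prox of some (nonconvex) regularizer.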
System Security Metrics via Power Simulation for VLSI Designs
Power side-channel attacks are a growing concern as they allow attackers to extract sensitive information from digital systems with low-cost equipment and minimal knowledge about a device's inner functions. Though countermeasures are available to ASIC designers, these do not completely guarantee side-channel security, and therefore must be validated in the lab post-fabrication. The goal of this project is to verify the efficacy of the simulation tools PSCARE and GLIFT in performing simulated power side-channel attacks on such designs. Verification will be done by comparing simulations of the Advanced Encryption Standard with corresponding measurements of physical implementations on a SASEBO (Side-channel Attack Standard Evaluation Board). Successful verification will allow for simulation of power side-channel information leakage at design time.
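As background on what such a simulated attack computes: correlation power analysis predicts, for every key-byte guess, a leakage value (commonly the Hamming weight of a key-dependent intermediate) and correlates it with the measured or simulated traces; the correct guess maximizes the correlation. A minimal self-contained sketch with synthetic traces (a real attack on AES would target the S-box output; the XOR-only intermediate and noise model here are simplifications):

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming_weight(v):
    v = np.asarray(v, dtype=np.uint8)
    return np.unpackbits(v[..., None], axis=-1).sum(axis=-1)

# Synthetic traces: each "measurement" leaks the Hamming weight of a secret
# key byte XORed with a known plaintext byte, plus Gaussian noise.
k_true = 0x5A
n_traces = 2000
plaintexts = rng.integers(0, 256, size=n_traces)
traces = hamming_weight(plaintexts ^ k_true) + 0.5 * rng.standard_normal(n_traces)

# CPA: correlate the predicted leakage with the traces for every key guess;
# the correct key byte yields the highest correlation.
scores = np.empty(256)
for guess in range(256):
    predicted = hamming_weight(plaintexts ^ guess)
    scores[guess] = np.corrcoef(predicted, traces)[0, 1]

recovered = int(np.argmax(scores))
print(hex(recovered))
```

A power simulator plays the role of the oscilloscope here: if the simulated traces leak the same intermediates as silicon, the same statistic recovers the key at design time.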
Laser diffraction particle sizing: sampling and inversion
Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, August 13, 1987.
The inverse problem of obtaining particle size distributions from observations
of the angular distribution of near forward scattered light is reexamined.
Asymptotic analysis of the forward problem reveals the information
content of the observations, and the sources of non-uniqueness and
instability in inverting them. A sampling criterion under which the observations
uniquely specify the size distribution is derived, in terms of the
largest particle size and an angle above which the intensity is indistinguishable
from its asymptote. The instability of inverting unevenly spaced
data is compared to that of super-resolving Fourier spectra. Resolution is
shown to be inversely proportional to the angular range of observations.
The problem is rephrased so that the size-weighted number density is
sought from the intensity weighted by the scattering angle cubed. Algorithms
which impose positivity and bounds on particle size improve the
stability of inversions. The forward problem can be represented by an
over-determined matrix equation by choosing a large integration increment
in size dependent on the frequency content of the angular intensity, further
improving stability.
Experimental data obtained using a linear CCD array illustrates the theory, with standard polystyrene spheres as scatterers. The scattering
from single and tri-modal distributions is successfully inverted.
I was supported by a NASA Technology
Transfer Traineeship grant (NGT-014-800, Supplement 5), and by
the Joint Program in Oceanographic Engineering between the Woods Hole
Oceanographic Institution and the Massachusetts Institute of Technology.
The experimental work was funded by the Coastal Research Laboratory of
the Woods Hole Oceanographic Institution, through the generosity of the
Mellon Foundation.
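The positivity-constrained inversion the abstract describes can be sketched numerically: discretize the forward problem as I = K n, then recover the size distribution with non-negative least squares. The Airy-pattern-style kernel below is a crude stand-in for the full near-forward scattering solution, and all sizes and angles are illustrative:

```python
import numpy as np
from scipy.optimize import nnls
from scipy.special import j1

# Discretized forward problem: intensity at each angle is a nonnegative mix of
# per-size scattering patterns, I = K @ n.  The kernel is an Airy-like
# stand-in, |2 J1(x)/x|^2 weighted by cross-section, not the exact solution.
angles = np.linspace(0.01, 0.2, 40)        # scattering angles (rad)
radii = np.linspace(5.0, 50.0, 10)         # particle radii (um)
k_wave = 2 * np.pi / 0.5                   # wavenumber for 0.5 um light

X = k_wave * np.outer(angles, radii)
K = (radii ** 2) * (2 * j1(X) / X) ** 2

# A bimodal "true" size distribution, observed with a little noise.
n_true = np.zeros(len(radii))
n_true[2], n_true[7] = 1.0, 0.5
I_obs = K @ n_true + 1e-8 * np.random.default_rng(0).standard_normal(len(angles))

# Imposing positivity (NNLS) stabilizes the inversion, as the thesis notes.
n_est, _ = nnls(K, I_obs)
print(np.round(n_est, 2))
```

Unconstrained least squares would happily produce oscillating, partly negative size distributions on noisier data; restricting the solution to the physically meaningful nonnegative cone is the simplest of the stabilizing constraints discussed above.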
State-Space Approaches to Ultra-Wideband Doppler Processing
National security needs dictate the development of new radar systems capable of identifying and tracking exoatmospheric threats to aid our defense. These new radar systems feature reduced noise floors, electronic beam steering, and ultra-wide bandwidths, all of which facilitate threat discrimination. However, in order to identify missile attributes such as RF reflectivity, distance, and velocity, many existing processing algorithms rely upon narrow-bandwidth assumptions that break down with increased signal bandwidth. We present a fresh investigation into these algorithms for removing bandwidth limitations and propose novel state-space and direct-data factoring formulations such as:
* the multidimensional extension to the Eigensystem Realization Algorithm,
* employing state-space models in place of interpolation to obtain a form which admits a separation and isolation of solution components,
* and side-stepping the joint diagonalization of state transition matrices, which commonly plagues methods like multidimensional ESPRIT.
We then benchmark our approaches and relate the outcomes to the Cramér-Rao bound for the case of one and two adjacent reflectors to validate their conceptual design and identify those techniques that compare favorably to or improve upon existing practices.
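The flavor of these state-space formulations can be shown in one dimension: samples of a sum of complex exponentials are packed into a Hankel matrix, its truncated SVD gives a signal subspace, and the shift-invariance of that subspace yields the poles (a toy analogue of resolving two adjacent reflectors). This is a generic matrix-pencil/ERA-style sketch, not the multidimensional algorithms proposed above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two closely spaced complex exponentials (two "adjacent reflectors") in noise.
N = 64
n = np.arange(N)
z_true = np.exp(1j * 2 * np.pi * np.array([0.10, 0.13]))
x = 0.9 * z_true[0] ** n + 1.1 * z_true[1] ** n
x = x + 0.01 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Hankel data matrix and its rank-2 signal subspace.
L = 20
H = np.array([x[i:i + N - L] for i in range(L)])   # H[i, j] = x[i + j]
U, s, Vh = np.linalg.svd(H, full_matrices=False)
Us = U[:, :2]

# Shift invariance: the poles are the eigenvalues of pinv(Us[:-1]) @ Us[1:].
Phi = np.linalg.pinv(Us[:-1]) @ Us[1:]
z_est = np.linalg.eigvals(Phi)
freqs = np.sort(np.angle(z_est) / (2 * np.pi))
print(freqs)  # close to the true normalized frequencies 0.10 and 0.13
```

No joint diagonalization is needed in one dimension; the multidimensional case, where several state transition matrices must share eigenvectors, is exactly the complication the abstract's methods aim to side-step.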