Adaptive Estimators Show Information Compression in Deep Neural Networks
To improve how neural networks function it is crucial to understand their
learning process. The information bottleneck theory of deep learning proposes
that neural networks achieve good generalization by compressing their
representations to disregard information that is not relevant to the task.
However, empirical evidence for this theory is conflicting, as compression was
only observed when networks used saturating activation functions. In contrast,
networks with non-saturating activation functions achieved comparable levels of
task performance but did not show compression. In this paper we developed more
robust mutual information estimation techniques that adapt to the hidden activity
of neural networks and produce more sensitive measurements of activations from
all functions, especially unbounded ones. Using these adaptive estimation
techniques, we explored compression in networks with a range of different
activation functions. With two improved methods of estimation, we show, first,
that saturation of the activation function is not required for compression, and
that the amount of compression varies between different activation functions. We
also find that there is a large amount of variation in compression between
different network initializations. Second, we see that L2 regularization
leads to significantly increased compression while also preventing overfitting.
Finally, we show that only compression of the last layer is positively
correlated with generalization.
Comment: Accepted as a poster presentation at ICLR 2019 and reviewed on OpenReview (available at https://openreview.net/forum?id=SkeZisA5t7). Pages: 11. Figures:
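The core idea of an adaptive estimator can be sketched as follows. This is a minimal illustration, not the paper's method: it discretizes hidden activity with quantile (equal-mass) bins, so unbounded activations such as ReLU are resolved as finely as saturating ones, and then measures the entropy of the binned hidden states (for a deterministic network, the binned I(T; X) reduces to H(T)). The function name and bin count are illustrative choices.

```python
import numpy as np

def binned_entropy(activations, n_bins=30):
    """Entropy (bits) of hidden states discretized with adaptive bins.

    activations: array of shape (n_samples, n_units).
    Quantile bin edges adapt to the observed activity distribution,
    unlike fixed-range bins, which under-resolve unbounded activations.
    """
    # Equal-mass bin edges over all observed activity values
    edges = np.quantile(activations, np.linspace(0, 1, n_bins + 1))
    # Per-unit discrete codes in 0..n_bins-1
    codes = np.digitize(activations, edges[1:-1])
    # One discrete state per sample: the tuple of per-unit codes
    _, counts = np.unique(codes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))
```

For a deterministic network, tracking this quantity per layer over training epochs is one way to look for the compression phase the abstract discusses.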
Electronic structure tuning via surface modification in semimetallic nanowires
Electronic structure properties of nanowires (NWs) with diameters of 1.5 and 3 nm based on semimetallic α-Sn are investigated by employing density functional theory and perturbative GW methods. We explore the dependence of electron affinity, band structure, and band-gap values on crystallographic orientation, NW cross-sectional size, and surface passivants of varying electronegativity. We consider four chemical terminations in our study: methyl (CH3), hydrogen (H), hydroxyl (OH), and fluorine (F). Results suggest a high degree of elasticity of Sn-Sn bonds within the Sn NWs' cores, with no significant structural variations for nanowires with different surface passivants. Direct band gaps at Brillouin-zone centers are found for most studied structures, with quasiparticle-corrected band-gap magnitudes ranging from 0.25 to 3.54 eV in 1.5-nm-diameter structures, indicating an exceptional range of properties for semimetal NWs below the semimetal-to-semiconductor transition. Band-gap variations induced by changes in surface passivants indicate the possibility of realizing semimetal-semiconductor interfaces in NWs with constant cross-section and crystallographic orientation, allowing the design of novel dopant-free NW-based electronic devices.
Signatures of Bayesian inference emerge from energy efficient synapses
Biological synaptic transmission is unreliable, and this unreliability likely
degrades neural circuit performance. While there are biophysical mechanisms
that can increase reliability, for instance by increasing vesicle release
probability, these mechanisms cost energy. We examined four such mechanisms
along with the associated scaling of the energetic costs. We then embedded
these energetic costs for reliability in artificial neural networks (ANNs) with
trainable stochastic synapses, and trained these networks on standard image
classification tasks. The resulting networks revealed a tradeoff between
circuit performance and the energetic cost of synaptic reliability.
Additionally, the optimised networks exhibited two testable predictions
consistent with pre-existing experimental data. Specifically, synapses with
lower variability tended to have 1) higher input firing rates and 2) lower
learning rates. Surprisingly, these predictions also arise when synapse
statistics are inferred through Bayesian inference. Indeed, we were able to
find a formal, theoretical link between the performance-reliability cost
tradeoff and Bayesian inference. This connection suggests two incompatible
possibilities: evolution may have chanced upon a scheme for implementing
Bayesian inference by optimising energy efficiency, or alternatively, energy
efficient synapses may display signatures of Bayesian inference without
actually using Bayes to reason about uncertainty.
Comment: 29 pages, 11 figures
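A trainable stochastic synapse of the kind the abstract describes can be sketched in a few lines. This is an illustrative toy, not the paper's model: each synapse has a mean weight and a variability parameter, every forward pass draws fresh weight samples (unreliable transmission), and the loss adds a hypothetical energetic cost that grows as variability shrinks. The 1/sigma cost scaling, the function names, and the regression loss are all assumptions for illustration; the paper derives several specific cost scalings.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_forward(x, mu, sigma):
    """Linear layer with stochastic synapses.

    Each synapse draws an independent weight sample per forward pass,
    modelling unreliable synaptic transmission.
    """
    w = mu + sigma * rng.standard_normal(mu.shape)
    return x @ w

def loss_with_reliability_cost(y_pred, y_true, sigma, lam=1e-3):
    """Task loss plus an energetic cost for synaptic reliability.

    Illustrative cost only: energy grows as variability shrinks
    (more reliable synapses cost more energy), here taken as sum(1/sigma).
    """
    task = np.mean((y_pred - y_true) ** 2)
    energy = np.sum(1.0 / sigma)
    return task + lam * energy
```

Training both mu and sigma under such a loss is what produces the performance-reliability tradeoff the abstract reports: the network can buy accuracy only by paying for lower-variability synapses.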
Topological and simplicial features in reservoir computing networks
Reservoir computing is a framework which uses the nonlinear internal dynamics of a recurrent neural network to perform complex non-linear transformations of the input. This enables reservoirs to carry out a variety of tasks involving the processing of time-dependent or sequential signals. Reservoirs are particularly suited for tasks that require memory or the handling of temporal sequences, common in areas such as speech recognition, time series prediction, and signal processing. Learning is restricted to the output layer and can be thought of as "reading out" or "selecting from" the states of the reservoir. With all but the output weights fixed, they do not have the costly and difficult training associated with deep neural networks. However, while the reservoir computing framework shows a lot of promise in terms of efficiency and capability, it can be unreliable. Existing studies show that small changes in hyperparameters can markedly affect the network's performance. Here we studied the role of network topology in reservoir computing in the carrying out of three conceptually different tasks: working memory, perceptual decision making, and chaotic time-series prediction. We implemented three different network topologies (ring, lattice, and random) and tested reservoir network performance on the tasks. We then used the algebraic topological tools of directed simplicial cliques to study deeper connections between network topology and function, making comparisons across performance and linking with existing reservoir research.
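The framework described above can be sketched as a minimal echo state network: a fixed random recurrent network provides the nonlinear dynamics, and only a linear readout is trained (here by ridge regression). This is a generic illustration under standard assumptions, not the specific networks studied in the paper; the reservoir size, spectral radius, and ridge parameter are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_reservoir(inputs, n_res=100, spectral_radius=0.9):
    """Drive a fixed random reservoir with an input sequence.

    inputs: array of shape (T, n_in). Returns states of shape (T, n_res).
    """
    # Fixed random recurrent weights, rescaled so the spectral radius < 1
    # (a common heuristic for stable reservoir dynamics)
    W = rng.standard_normal((n_res, n_res))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    W_in = rng.standard_normal((n_res, inputs.shape[1]))
    states = np.zeros((len(inputs), n_res))
    x = np.zeros(n_res)
    for t, u in enumerate(inputs):
        x = np.tanh(W @ x + W_in @ u)  # nonlinear internal dynamics
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-6):
    """Train only the output layer ("reading out" the reservoir states)."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + ridge * np.eye(n),
                           states.T @ targets)
```

Swapping the dense random matrix W for a ring or lattice adjacency structure is all that changes when comparing the topologies the abstract lists; the readout training is identical.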
Random and biological network connectivity for reservoir computing: Random Reservoirs Rule! (at Remembering)
Reservoir computing is a framework where a fixed recurrent neural network (RNN) is used to process input signals and perform computations. Reservoirs are typically randomly initialised, but it is not fully known how connectivity affects performance, and whether particular structures might yield advantages on specific or generic tasks. Simpler topologies often perform as well as more complex networks on prediction tasks. We check performance differences of reservoirs on four task types using the connectomes of C. elegans and the Drosophila larval mushroom body, in comparison with varying degrees of randomisation.