Training Neural Networks with Stochastic Hessian-Free Optimization
Hessian-free (HF) optimization has been successfully used for training deep
autoencoders and recurrent networks. HF uses the conjugate gradient algorithm
to construct update directions through curvature-vector products that can be
computed in time on the same order as gradients. In this paper we exploit this
property and study stochastic HF with gradient and curvature mini-batches
independent of the dataset size. We modify Martens' HF for these settings and
integrate dropout, a method for preventing co-adaptation of feature detectors,
to guard against overfitting. Stochastic Hessian-free optimization gives an
intermediary between SGD and HF that achieves competitive performance on both
classification and deep autoencoder experiments.
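As context for how HF avoids ever forming the curvature matrix, here is a minimal sketch of the conjugate-gradient core together with a finite-difference curvature-vector product. The names (`conjugate_gradient`, `curvature_vector_product`, `grad_fn`) are illustrative assumptions, not the paper's code; Martens' HF additionally uses Gauss-Newton products, damping, and preconditioning.

```python
import numpy as np

def conjugate_gradient(Av, b, max_iters=50, tol=1e-6):
    """Solve A x = b given only a matrix-vector product Av(v);
    the curvature matrix A is never formed explicitly."""
    x = np.zeros_like(b)
    r = b - Av(x)            # residual
    p = r.copy()             # search direction
    rs = r @ r
    for _ in range(max_iters):
        Ap = Av(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def curvature_vector_product(grad_fn, w, v, eps=1e-4):
    """Finite-difference approximation H v ~= (grad(w + eps v) - grad(w)) / eps.
    It costs roughly one extra gradient evaluation, i.e. the same
    order of time as a gradient."""
    return (grad_fn(w + eps * v) - grad_fn(w)) / eps

# Toy usage on a quadratic loss f(w) = 0.5 w^T A w - b^T w:
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_fn = lambda w: A @ w - b
w = np.zeros(2)
step = conjugate_gradient(lambda v: curvature_vector_product(grad_fn, w, v),
                          -grad_fn(w))
print(w + step)  # ~= the Newton step; here, the exact minimizer of f
```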
Reservoir Computing via Quantum Recurrent Neural Networks
Recent developments in quantum computing and machine learning have propelled
the interdisciplinary study of quantum machine learning. Sequential modeling is
an important task with high scientific and commercial value. Existing methods
based on variational quantum circuits (VQCs) or quantum neural networks (QNNs)
require significant computational resources to perform gradient-based
optimization over a large number of quantum circuit parameters. The major
drawback is that such quantum gradient calculation requires many circuit
evaluations, posing challenges for current near-term quantum
hardware and simulation software. In this work, we approach sequential modeling
by applying a reservoir computing (RC) framework to quantum recurrent neural
networks (QRNN-RC) based on classical RNN, LSTM, and GRU architectures. The
main idea of this RC approach is that the QRNN with randomly initialized weights is
treated as a dynamical system and only the final classical linear layer is
trained. Our numerical simulations show that the QRNN-RC can reach results
comparable to fully trained QRNN models for several function approximation and
time series prediction tasks. Since the QRNN training complexity is
significantly reduced, the proposed model trains notably faster. In this work
we also compare to corresponding classical RNN-based RC implementations and
show that the quantum version learns faster by requiring fewer training epochs
in most cases. Our results demonstrate a new possibility of utilizing quantum
neural networks for sequential modeling with greater quantum hardware
efficiency, an important design consideration for noisy intermediate-scale
quantum (NISQ) computers.
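To illustrate the reservoir-computing training scheme described above, the sketch below uses a classical tanh RNN as the fixed reservoir and trains only a linear readout by ridge regression. In the paper the reservoir is a quantum RNN/LSTM/GRU circuit; the classical stand-in and all names and hyperparameters here (n_hidden, spectral_radius, ridge) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_hidden=100, spectral_radius=0.9):
    """Randomly initialized, *fixed* recurrent weights (never trained)."""
    W_in = rng.normal(0.0, 1.0, (n_hidden, n_in))
    W_rec = rng.normal(0.0, 1.0, (n_hidden, n_hidden))
    # Rescale so the recurrent dynamics neither vanish nor explode.
    W_rec *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W_rec)))
    return W_in, W_rec

def run_reservoir(W_in, W_rec, inputs):
    """Collect hidden states of the fixed tanh RNN over a sequence."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in inputs:                      # inputs: (T, n_in)
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h.copy())
    return np.array(states)               # (T, n_hidden)

def train_readout(states, targets, ridge=1e-6):
    """The only trained component: a linear readout fit in closed form
    by ridge regression -- no gradient-based optimization of the reservoir."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)

# Toy usage: predict the next value of a sine wave.
t = np.linspace(0, 8 * np.pi, 400)
u, y = np.sin(t)[:-1, None], np.sin(t)[1:]
W_in, W_rec = make_reservoir(n_in=1)
H = run_reservoir(W_in, W_rec, u)
w_out = train_readout(H, y)
print(np.mean((H @ w_out - y) ** 2))      # small training MSE
```

The closed-form readout fit is what removes the expensive circuit-evaluation loop: the reservoir (quantum or classical) is run once per sequence, and no gradients ever flow through it.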
Bayesian polynomial neural networks and polynomial neural ordinary differential equations
Polynomial neural networks and polynomial neural ordinary differential
equations (ODEs) are two recent and powerful symbolic regression approaches
for equation recovery in many science and engineering problems. However, these
methods provide only point estimates for the model parameters and currently
cannot accommodate noisy data. We address this challenge by developing and
validating the following Bayesian inference methods: the Laplace approximation,
Markov Chain Monte Carlo (MCMC) sampling methods, and variational inference. We
have found the Laplace approximation to be the best method for this class of
problems. Our work can be easily extended to the broader class of symbolic
neural networks to which polynomial neural networks belong.
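As a sketch of the Laplace approximation the abstract singles out, the example below applies it to plain polynomial regression. For a model that is linear in its coefficients the resulting Gaussian posterior is exact; for the paper's polynomial neural networks and neural ODEs it is only a local Gaussian approximation around the MAP estimate. The function name and the prior/noise variances are assumptions for illustration.

```python
import numpy as np

def laplace_posterior(x, y, degree=3, prior_var=10.0, noise_var=0.1):
    """Laplace approximation for Bayesian polynomial regression:
    a Gaussian N(mean, cov) centered at the MAP estimate, with
    covariance the inverse Hessian of the negative log posterior."""
    # Design matrix of monomials 1, x, x^2, ..., x^degree.
    Phi = np.vander(x, degree + 1, increasing=True)
    # Hessian of the negative log posterior (the posterior precision).
    precision = Phi.T @ Phi / noise_var + np.eye(degree + 1) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ (Phi.T @ y) / noise_var   # MAP estimate of coefficients
    return mean, cov

# Toy usage: recover y = x^2 - x from noisy samples, with uncertainty.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 50)
y = x**2 - x + rng.normal(0.0, 0.3, x.shape)
mean, cov = laplace_posterior(x, y, degree=2, noise_var=0.3**2)
print(mean)                    # ~ [0, -1, 1]: coefficients of 1, x, x^2
print(np.sqrt(np.diag(cov)))   # per-coefficient posterior std. dev.
```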