25 research outputs found
Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods
Training neural networks is a challenging non-convex optimization problem,
and backpropagation or gradient descent can get stuck in spurious local optima.
We propose a novel algorithm based on tensor decomposition for guaranteed
training of two-layer neural networks. We provide risk bounds for our proposed
method, with a polynomial sample complexity in the relevant parameters, such as
input dimension and number of neurons. While learning arbitrary target
functions is NP-hard, we provide transparent conditions on the function and the
input for learnability. Our training method is based on tensor decomposition,
which provably converges to the global optimum, under a set of mild
non-degeneracy conditions. It consists of simple embarrassingly parallel linear
and multi-linear operations, and is competitive with standard stochastic
gradient descent (SGD), in terms of computational complexity. Thus, we propose
a computationally efficient method with guaranteed risk bounds for training
neural networks with one hidden layer.Comment: The tensor decomposition analysis is expanded, and the analysis of
ridge regression is added for recovering the parameters of last layer of
neural networ
Tensor networks for quantum machine learning
Once developed for quantum theory, tensor networks have been established as a
successful machine learning paradigm. Now, they have been ported back to the
quantum realm in the emerging field of quantum machine learning to assess
problems that classical computers are unable to solve efficiently. Their nature
at the interface between physics and machine learning makes tensor networks
easily deployable on quantum computers. In this review article, we shed light
on one of the major architectures considered to be predestined for variational
quantum machine learning. In particular, we discuss how layouts like MPS, PEPS,
TTNs and MERA can be mapped to a quantum computer, how they can be used for
machine learning and data encoding and which implementation techniques improve
their performance