27 research outputs found
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
We provide theoretical convergence guarantees for score-based generative
models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which
constitute the backbone of large-scale real-world generative models such as
DALL·E 2. Our main result is that, assuming accurate score estimates,
such SGMs can efficiently sample from essentially any realistic data
distribution. In contrast to prior works, our results (1) hold for an
$L^2$-accurate score estimate (rather than $L^\infty$-accurate); (2) do not
require restrictive functional inequality conditions that preclude substantial
non-log-concavity; (3) scale polynomially in all relevant problem parameters;
and (4) match state-of-the-art complexity guarantees for discretization of the
Langevin diffusion, provided that the score error is sufficiently small. We
view this as strong theoretical justification for the empirical success of
SGMs. We also examine SGMs based on the critically damped Langevin diffusion
(CLD). Contrary to conventional wisdom, we provide evidence that the use of the
CLD does not reduce the complexity of SGMs.
Comment: 30 pages
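The kind of sampler these guarantees cover can be illustrated in the simplest possible setting. Below is a minimal NumPy sketch (our own illustration, not the paper's code) of a discretized reverse SDE for a 1-D Gaussian target, where the score of every noised marginal of the forward Ornstein–Uhlenbeck process is available in closed form; with the exact score, the reverse diffusion approximately recovers the target's mean and variance.

```python
import numpy as np

# Toy reverse-SDE sampler with an exactly known score. The data distribution
# is N(mu, sigma2); under the forward process dX = -X dt + sqrt(2) dB, the
# noised marginal at time t is Gaussian, so its score is available in closed
# form. All names and parameter values here are our own.

rng = np.random.default_rng(0)
mu, sigma2 = 2.0, 0.25              # data distribution N(mu, sigma2)
T, n_steps, n_samples = 5.0, 500, 20_000
h = T / n_steps

def score(x, t):
    """Exact score d/dx log p_t(x) of the noised marginal at time t."""
    m = np.exp(-t) * mu
    v = np.exp(-2 * t) * sigma2 + 1.0 - np.exp(-2 * t)
    return -(x - m) / v

# Initialize from the stationary N(0, 1) (close to p_T for large T), then
# apply Euler-Maruyama to the reverse SDE dY = [Y + 2 * score] ds + sqrt(2) dB.
y = rng.standard_normal(n_samples)
for i in range(n_steps):
    t = T - i * h
    y = y + h * (y + 2 * score(y, t)) + np.sqrt(2 * h) * rng.standard_normal(n_samples)

print(y.mean(), y.var())  # should be close to (2.0, 0.25)
```

Replacing the exact `score` with a learned estimate gives the setting the theorem addresses: the sampler's error then depends on how accurate that estimate is in $L^2$.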
Quantum advantage in learning from experiments
Quantum technology has the potential to revolutionize how we acquire and
process experimental data to learn about the physical world. An experimental
setup that transduces data from a physical system to a stable quantum memory,
and processes that data using a quantum computer, could have significant
advantages over conventional experiments in which the physical system is
measured and the outcomes are processed using a classical computer. We prove
that, in various tasks, quantum machines can learn from exponentially fewer
experiments than those required in conventional experiments. The exponential
advantage holds in predicting properties of physical systems, performing
quantum principal component analysis on noisy states, and learning approximate
models of physical dynamics. In some tasks, the quantum processing needed to
achieve the exponential advantage can be modest; for example, one can
simultaneously learn about many noncommuting observables by processing only two
copies of the system. Conducting experiments with up to 40 superconducting
qubits and 1300 quantum gates, we demonstrate that a substantial quantum
advantage can be realized using today's relatively noisy quantum processors.
Our results highlight how quantum technology can enable powerful new strategies
to learn about nature.
Comment: 6 pages, 17 figures + 46-page appendix; open-source code available at
https://github.com/quantumlib/ReCirq/tree/master/recirq/qml_lf
Learning (Very) Simple Generative Models Is Hard
Motivated by the recent empirical successes of deep generative models, we
study the computational complexity of the following unsupervised learning
problem. For an unknown neural network $F: \mathbb{R}^d \to \mathbb{R}^{d'}$, let
$D$ be the distribution over $\mathbb{R}^{d'}$ given by pushing the standard
Gaussian $\mathcal{N}(0, \mathrm{Id}_d)$ through $F$. Given i.i.d. samples from
$D$, the goal is to output any distribution close to $D$ in statistical
distance. We show under the statistical query (SQ) model that no
polynomial-time algorithm can solve this problem even when the output
coordinates of $F$ are one-hidden-layer ReLU networks with $\log(d)$ neurons.
Previously, the best lower bounds for this problem simply followed from lower
bounds for supervised learning and required at least two hidden layers and
$\mathrm{poly}(d)$ neurons [Daniely-Vardi '21, Chen-Gollakota-Klivans-Meka
'22]. The key ingredient in our proof is an ODE-based construction of a
compactly supported, piecewise-linear function $f$ with polynomially-bounded
slopes such that the pushforward of $\mathcal{N}(0,1)$ under $f$ matches all
low-degree moments of $\mathcal{N}(0,1)$.
Comment: 24 pages, 2 figures
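To make the problem statement concrete, here is a small NumPy sketch (our own, with toy dimensions) of the pushforward distribution: each output coordinate of a hypothetical network $F$ is a one-hidden-layer ReLU network, and samples are obtained by pushing standard Gaussian inputs through $F$.

```python
import numpy as np

# Sampling from the pushforward D of a standard Gaussian under a neural
# network F whose output coordinates are one-hidden-layer ReLU networks.
# Weights and sizes below are arbitrary illustrative choices.

rng = np.random.default_rng(1)
d, d_out, k = 8, 3, 4                    # input dim, output dim, hidden neurons
W = rng.standard_normal((d_out, k, d))   # hidden-layer weights, per coordinate
a = rng.standard_normal((d_out, k))      # output weights, per coordinate

def F(z):
    """F: R^d -> R^{d_out}; coordinate j maps z to a_j . relu(W_j @ z)."""
    hidden = np.maximum(np.einsum('jkd,nd->njk', W, z), 0.0)  # ReLU layer
    return np.einsum('jk,njk->nj', a, hidden)

# i.i.d. samples from D = pushforward of N(0, Id_d) under F
z = rng.standard_normal((10_000, d))
samples = F(z)
print(samples.shape)  # (10000, 3)
```

Generating such samples is trivial when $F$ is known; the hardness result says that recovering *any* distribution close to $D$ from samples alone is intractable for SQ algorithms, even in this very simple architecture.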
Transfer learning algorithm for image classification task and its convergence analysis
Theoretical analysis of transfer learning for deep
neural networks (DNNs) is crucial for ensuring stability or
convergence and for gaining a better understanding of the networks
for further development. However, most current transfer learning
methods are black-box approaches focused primarily on empirical
studies. This paper develops a transfer learning
algorithm for deep convolutional neural networks (CNN) with
batch normalization layers. A convergence-guaranteed transfer
learning algorithm is proposed to train the classifier of a deep
CNN with pretrained convolutional layers. Two classification case
studies based on VGG11 with the MNIST dataset and CIFAR10
dataset are presented to demonstrate the performance of the
proposed approach and explore the effect of batch normalization
layers on transfer learning.
Submitted/Accepted version. This work was supported by the Ministry of Education (MOE) Singapore, Academic Research Fund (AcRF) Tier 1, under Grant RG65/22.
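The transfer setup studied here can be sketched in a few lines: the pretrained layers are frozen and only the classifier head is trained. Below is a minimal NumPy illustration (our own; a fixed random feature map stands in for the pretrained VGG11 convolutional layers, and the head is trained by plain gradient descent on the cross-entropy loss, which is convex in the head weights).

```python
import numpy as np

# Transfer-learning sketch: frozen "pretrained" feature extractor, trainable
# linear classifier head. Sizes, data, and the feature map are illustrative
# stand-ins, not the paper's VGG11/MNIST setup.

rng = np.random.default_rng(0)
n, d_in, d_feat, n_cls = 256, 64, 32, 10

W_frozen = rng.standard_normal((d_in, d_feat)) / np.sqrt(d_in)  # not updated
X = rng.standard_normal((n, d_in))
y = rng.integers(0, n_cls, size=n)

feats = np.maximum(X @ W_frozen, 0.0)    # frozen feature extractor (ReLU)

W_head = np.zeros((d_feat, n_cls))       # trainable classifier head
def loss_and_grad(W):
    """Softmax cross-entropy loss over the frozen features, and its gradient."""
    logits = feats @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(n), y]).mean()
    p[np.arange(n), y] -= 1.0
    return loss, feats.T @ p / n

loss0, _ = loss_and_grad(W_head)         # log(10) at zero initialization
for _ in range(200):                     # plain gradient descent on the head
    loss, g = loss_and_grad(W_head)
    W_head -= 0.2 * g

print(loss0, loss)  # training loss should decrease from log(10)
```

Because only the head is trained, the optimization is convex, which is what makes convergence guarantees of the kind proposed in the paper tractable to state.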