
    Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

    We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL·E 2. Our main result is that, assuming accurate score estimates, such SGMs can efficiently sample from essentially any realistic data distribution. In contrast to prior works, our results (1) hold for an $L^2$-accurate score estimate (rather than $L^\infty$-accurate); (2) do not require restrictive functional inequality conditions that preclude substantial non-log-concavity; (3) scale polynomially in all relevant problem parameters; and (4) match state-of-the-art complexity guarantees for discretization of the Langevin diffusion, provided that the score error is sufficiently small. We view this as strong theoretical justification for the empirical success of SGMs. We also examine SGMs based on the critically damped Langevin diffusion (CLD). Contrary to conventional wisdom, we provide evidence that the use of the CLD does not reduce the complexity of SGMs.
    Comment: 30 pages
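    As a toy illustration of the kind of sampler these guarantees cover, here is a minimal one-dimensional sketch (not the paper's algorithm): the forward noising process is an Ornstein-Uhlenbeck SDE, the score of each marginal is known in closed form for a Gaussian target (standing in for a learned, $L^2$-accurate score estimate), and samples are drawn by Euler-Maruyama discretization of the reverse-time SDE. All constants and names are illustrative.

    ```python
    import math, random

    random.seed(0)

    # Forward (noising) SDE: dx = -x dt + sqrt(2) dW, which carries the target
    # N(MU, 1) to N(MU * e^{-t}, 1). Its score is available exactly here,
    # standing in for a learned score estimate.
    MU = 2.0          # mean of the (hypothetical) data distribution N(MU, 1)
    T, H = 5.0, 0.01  # time horizon and Euler-Maruyama step size

    def score(x, t):
        # Exact score of p_t = N(MU * e^{-t}, 1): grad_x log p_t(x)
        return MU * math.exp(-t) - x

    def sample():
        x = random.gauss(0.0, 1.0)  # start from the N(0, 1) prior (close to p_T)
        t = T
        while t > 0:
            z = random.gauss(0.0, 1.0)
            # Reverse-time SDE, run forward in tau = T - t:
            # dx = [x + 2 * score(x, t)] dtau + sqrt(2) dW
            x += H * (x + 2.0 * score(x, t)) + math.sqrt(2.0 * H) * z
            t -= H
        return x

    xs = [sample() for _ in range(2000)]
    mean = sum(xs) / len(xs)
    var = sum((v - mean) ** 2 for v in xs) / len(xs)
    print(round(mean, 2), round(var, 2))  # should land near (MU, 1.0)
    ```

    With an accurate score, the empirical mean and variance of the samples approach those of the target N(MU, 1); the paper's point is that this continues to hold, with polynomial complexity, when the score is only approximately known.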

    Quantum advantage in learning from experiments

    Quantum technology has the potential to revolutionize how we acquire and process experimental data to learn about the physical world. An experimental setup that transduces data from a physical system to a stable quantum memory, and processes that data using a quantum computer, could have significant advantages over conventional experiments in which the physical system is measured and the outcomes are processed using a classical computer. We prove that, in various tasks, quantum machines can learn from exponentially fewer experiments than those required in conventional experiments. The exponential advantage holds in predicting properties of physical systems, performing quantum principal component analysis on noisy states, and learning approximate models of physical dynamics. In some tasks, the quantum processing needed to achieve the exponential advantage can be modest; for example, one can simultaneously learn about many noncommuting observables by processing only two copies of the system. Conducting experiments with up to 40 superconducting qubits and 1300 quantum gates, we demonstrate that a substantial quantum advantage can be realized using today's relatively noisy quantum processors. Our results highlight how quantum technology can enable powerful new strategies to learn about nature.
    Comment: 6 pages, 17 figures + 46-page appendix; open-source code available at https://github.com/quantumlib/ReCirq/tree/master/recirq/qml_lf

    Learning (Very) Simple Generative Models Is Hard

    Motivated by the recent empirical successes of deep generative models, we study the computational complexity of the following unsupervised learning problem. For an unknown neural network $F:\mathbb{R}^d\to\mathbb{R}^{d'}$, let $D$ be the distribution over $\mathbb{R}^{d'}$ given by pushing the standard Gaussian $\mathcal{N}(0,\mathrm{Id}_d)$ through $F$. Given i.i.d. samples from $D$, the goal is to output any distribution close to $D$ in statistical distance. We show under the statistical query (SQ) model that no polynomial-time algorithm can solve this problem even when the output coordinates of $F$ are one-hidden-layer ReLU networks with $\log(d)$ neurons. Previously, the best lower bounds for this problem simply followed from lower bounds for supervised learning and required at least two hidden layers and $\mathrm{poly}(d)$ neurons [Daniely-Vardi '21, Chen-Gollakota-Klivans-Meka '22]. The key ingredient in our proof is an ODE-based construction of a compactly supported, piecewise-linear function $f$ with polynomially bounded slopes such that the pushforward of $\mathcal{N}(0,1)$ under $f$ matches all low-degree moments of $\mathcal{N}(0,1)$.
    Comment: 24 pages, 2 figures
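    The hard instances in this abstract are pushforwards of a standard Gaussian through shallow ReLU networks. A minimal sketch of the generative model being learned, with arbitrary illustrative weights rather than the paper's moment-matching construction:

    ```python
    import math, random

    random.seed(1)

    # Toy instance of the family: each output coordinate of F is a
    # one-hidden-layer ReLU network applied to a standard Gaussian seed.
    # Dimensions and weights below are arbitrary illustrative choices.
    D_IN, D_OUT, WIDTH = 3, 2, 4

    W1 = [[random.uniform(-1, 1) for _ in range(D_IN)] for _ in range(WIDTH)]
    B1 = [random.uniform(-1, 1) for _ in range(WIDTH)]
    W2 = [[random.uniform(-1, 1) for _ in range(WIDTH)] for _ in range(D_OUT)]

    def relu(v):
        return max(0.0, v)

    def F(x):
        # One hidden ReLU layer, then a linear output layer.
        h = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(W1, B1)]
        return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

    def draw():
        # One i.i.d. sample from D: push z ~ N(0, Id_d) through F.
        z = [random.gauss(0.0, 1.0) for _ in range(D_IN)]
        return F(z)

    samples = [draw() for _ in range(5)]
    ```

    The learner sees only such samples and must output a distribution close to $D$; the SQ lower bound says this is already intractable for networks of this shallow form.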

    Transfer learning algorithm for image classification task and its convergence analysis

    Theoretical analysis of transfer learning of deep neural networks (DNNs) is crucial for ensuring stability and convergence and for gaining a better understanding of the networks for further development. However, most current transfer learning methods are black-box approaches focused on empirical studies. This paper develops a transfer learning algorithm for deep convolutional neural networks (CNNs) with batch normalization layers. A convergence-guaranteed transfer learning algorithm is proposed to train the classifier of a deep CNN with pretrained convolutional layers. Two classification case studies, based on VGG11 with the MNIST and CIFAR10 datasets, are presented to demonstrate the performance of the proposed approach and to explore the effect of batch normalization layers on transfer learning.
    This work was supported by the Ministry of Education (MOE) Singapore, Academic Research Fund (AcRF) Tier 1, under Grant RG65/22.
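    The setup studied here, a frozen pretrained feature extractor feeding a trainable classifier head, can be sketched in miniature. The extractor, synthetic data, and plain full-batch gradient descent below are illustrative stand-ins, not the paper's VGG11/batch-normalization algorithm:

    ```python
    import math, random

    random.seed(2)

    # Frozen "pretrained" extractor: a fixed ReLU layer that is never updated.
    def features(x):
        return [max(0.0, 0.9 * x[0] - 0.4 * x[1]),
                max(0.0, 0.3 * x[0] + 0.8 * x[1])]

    # Synthetic binary data: label 1 iff x0 + x1 > 0.
    data = []
    for _ in range(200):
        x = [random.gauss(0, 1), random.gauss(0, 1)]
        data.append((features(x), 1 if x[0] + x[1] > 0 else 0))

    w, b, lr = [0.0, 0.0], 0.0, 0.5

    def loss():
        # Average logistic (cross-entropy) loss of the classifier head.
        total = 0.0
        for f, y in data:
            p = 1.0 / (1.0 + math.exp(-(w[0] * f[0] + w[1] * f[1] + b)))
            total -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
        return total / len(data)

    start = loss()
    for _ in range(200):  # train only the head; the extractor stays frozen
        g0 = g1 = gb = 0.0
        for f, y in data:
            p = 1.0 / (1.0 + math.exp(-(w[0] * f[0] + w[1] * f[1] + b)))
            e = p - y
            g0 += e * f[0]; g1 += e * f[1]; gb += e
        n = len(data)
        w[0] -= lr * g0 / n; w[1] -= lr * g1 / n; b -= lr * gb / n
    end = loss()
    print(round(start, 3), round(end, 3))  # training loss decreases
    ```

    Because only the convex head is trained while the extractor is fixed, convergence of this reduced problem is far easier to analyze than end-to-end training, which is what makes convergence guarantees for transfer learning tractable.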

    Learning Mixtures of Linear Regressions in Subexponential Time via Fourier Moments
