Orthogonally Decoupled Variational Gaussian Processes
Gaussian processes (GPs) provide a powerful non-parametric framework for
reasoning over functions. Despite their appealing theory, their superlinear
computational and memory complexities have presented a long-standing
challenge. State-of-the-art sparse variational inference methods trade
modeling accuracy against complexity. However, the complexities of these
methods still scale superlinearly in the number of basis functions, implying
that sparse GP methods can learn from large datasets only when a small model
is used.
Recently, a decoupled approach was proposed that removes the unnecessary
coupling between the complexities of modeling the mean and the covariance
functions of a GP. It achieves a linear complexity in the number of mean
parameters, so an expressive posterior mean function can be modeled. While
promising, this approach suffers from optimization difficulties due to
ill-conditioning and non-convexity. In this work, we propose an alternative
decoupled parametrization. It adopts an orthogonal basis in the mean function
to model the residues that cannot be learned by the standard coupled approach.
Therefore, our method extends, rather than replaces, the coupled approach to
achieve strictly better performance. This construction admits a straightforward
natural gradient update rule, so the structure of the information manifold that
is lost during decoupling can be leveraged to speed up learning. Empirically,
our algorithm demonstrates significantly faster convergence in multiple
experiments.
Comment: Appearing in NIPS 2018
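To make the decoupled parametrization concrete, the following is a minimal
sketch of the orthogonally decoupled posterior mean, assuming an RBF kernel
and hypothetical inducing sets Z_cov (the standard coupled basis, gamma) and
Z_mean (the extra mean-only basis, beta); the names and shapes are
illustrative assumptions, not the authors' reference implementation.

import numpy as np

def rbf(X, Y, lengthscale=1.0):
    # Squared-exponential kernel matrix k(X, Y).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def decoupled_mean(x, Z_cov, Z_mean, m_gamma, a_beta, jitter=1e-6):
    # Posterior mean with an extra basis orthogonalized against the
    # span of the coupled basis Z_cov, so it models only the residues
    # that the coupled approach cannot express.
    K_xg = rbf(x, Z_cov)
    K_gg = rbf(Z_cov, Z_cov) + jitter * np.eye(len(Z_cov))
    K_xb = rbf(x, Z_mean)
    K_gb = rbf(Z_cov, Z_mean)
    K_perp = K_xb - K_xg @ np.linalg.solve(K_gg, K_gb)
    return K_xg @ m_gamma + K_perp @ a_beta

With a_beta set to zero this reduces to the standard coupled mean, which is
why the construction extends, rather than replaces, the coupled approach;
the cost of the extra term is linear in the number of mean parameters.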
Multi-component optical solitary waves
We discuss several novel types of multi-component (temporal and spatial)
envelope solitary waves that appear in fiber and waveguide nonlinear optics. In
particular, we describe multi-channel solitary waves in bit-parallel-wavelength
fiber transmission systems for high performance computer networks, multi-colour
parametric spatial solitary waves due to cascaded nonlinearities of quadratic
materials, and quasiperiodic envelope solitons due to quasi-phase-matching in
Fibonacci optical superlattices.
Comment: 12 pages, 11 figures; To be published in: Proceedings of the
Dynamics Days Asia-Pacific: First International Conference on Nonlinear
Science (Hong Kong, 13-16 July, 1999), Editor: Bambi Hu (Elsevier
Publishers, 2000)
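For context, multi-channel solitary waves of this kind (e.g. in
bit-parallel-wavelength systems) are commonly described by incoherently
coupled nonlinear Schrödinger equations; the normalization below is a
standard illustrative form, not necessarily the exact model used here:

\[
  i\,\frac{\partial u_j}{\partial z}
  + \frac{1}{2}\,\frac{\partial^2 u_j}{\partial t^2}
  + \Big( |u_j|^2 + \sigma \sum_{k \neq j} |u_k|^2 \Big) u_j = 0,
  \qquad j = 1, \dots, N,
\]

where $u_j$ is the envelope in channel $j$ and $\sigma$ is the
cross-phase-modulation coefficient.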
Large-Scale Gaussian Processes via Alternating Projection
Gaussian process (GP) hyperparameter optimization requires repeatedly solving
linear systems with $n \times n$ kernel matrices. To address the prohibitive
$\mathcal{O}(n^3)$
time complexity, recent work has employed fast iterative
numerical methods, like conjugate gradients (CG). However, as datasets increase
in magnitude, the corresponding kernel matrices become increasingly
ill-conditioned and still require $\mathcal{O}(n^2)$ space without
partitioning. Thus, while CG increases the size of datasets GPs can be trained
on, modern datasets reach scales beyond its applicability. In this work, we
propose an iterative method which only accesses subblocks of the kernel matrix,
effectively enabling \emph{mini-batching}. Our algorithm, based on alternating
projection, has $\mathcal{O}(n)$ per-iteration time and space complexity,
solving many of the practical challenges of scaling GPs to very large datasets.
Theoretically, we prove our method enjoys linear convergence and empirically we
demonstrate its robustness to ill-conditioning. On large-scale benchmark
datasets with up to four million datapoints, our approach accelerates
training by a factor of 2 to 27 compared to CG.
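To illustrate the block-access idea, here is a minimal sketch of the solver
viewed as block Gauss-Seidel, which is equivalent to alternately projecting
onto the subspaces defined by each block of training points; the kernel,
block size, and function names are illustrative assumptions, not the paper's
exact implementation.

import numpy as np

def solve_blockwise(K_fn, X, y, block_size=512, num_epochs=20, jitter=1e-6):
    # Approximately solve (K + jitter * I) w = y while materializing
    # only one block_size-by-n slab of the kernel matrix at a time.
    n = len(X)
    w = np.zeros(n)
    blocks = [np.arange(i, min(i + block_size, n))
              for i in range(0, n, block_size)]
    for _ in range(num_epochs):
        for b in blocks:
            K_bn = K_fn(X[b], X)                   # one block row of K
            r_b = y[b] - K_bn @ w - jitter * w[b]  # residual on block b
            K_bb = K_bn[:, b] + jitter * np.eye(len(b))
            w[b] += np.linalg.solve(K_bb, r_b)     # project onto block
    return w

# Example usage with a squared-exponential kernel on X of shape (n, d):
# k = lambda A, B: np.exp(-0.5 * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
# w = solve_blockwise(k, X_train, y_train)

Because each step touches only a block_size-by-n slab of the kernel matrix,
per-iteration time and memory stay linear in n for a fixed block size, which
is what makes the mini-batch behaviour described above possible.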