315 research outputs found

Fraud on transfer and on insolvency: ta... ta... tantum et tale?

Author: Gretton G L
Kelbrick R
Paisley R R M
Ross Gilbert Anderson
Publication venue: 'Edinburgh University Press'
Publication date: 01/05/2007

Crossref

Enlighten

Scottish share pledges in the Supreme Court

Author: Aluminum Company of America v Essex Group Inc
Blackshaw A
Combe M M
Daube D
Duff P
Gretton G L
Leverick F
Orr N W
Plaxton M C
Russell E J
Stark F
Publication venue: 'Edinburgh University Press'
Publication date: 01/01/2012

Crossref

Edinburgh Research Explorer

Enlighten

On gradient regularizers for MMD GANs

Author: Arbel M
Bińkowski M
Gretton A
Sutherland DJ
Publication venue: Neural Information Processing Systems
Publication date: 10/12/2018

We propose a principled method for gradient-based regularization of the critic of GAN-like models trained by adversarially optimizing the kernel of a Maximum Mean Discrepancy (MMD). We show that controlling the gradient of the critic is vital to having a sensible loss function, and devise a method to enforce exact, analytical gradient constraints at no additional cost compared to existing approximate techniques based on additive regularizers. The new loss function is provably continuous, and experiments show that it stabilizes and accelerates training, giving image generation models that outperform state-of-the-art methods on 160 × 160 CelebA and 64 × 64 unconditional ImageNet.

UCL Discovery

Antithetic and Monte Carlo kernel estimators for partial rankings

Author: Ghahramani Z
Gretton A
Lomeli M
Rowland M
Publication venue: Statistics and Computing
Publication date: 25/07/2018

In the modern age, rankings data is ubiquitous and it is useful for a variety of applications such as recommender systems, multi-object tracking and preference learning. However, most rankings data encountered in the real world is incomplete, which prevents the direct application of existing modelling tools for complete rankings. Our contribution is a novel way to extend kernel methods for complete rankings to partial rankings, via consistent Monte Carlo estimators for Gram matrices: matrices of kernel values between pairs of observations. We also present a novel variance reduction scheme based on an antithetic variate construction between permutations to obtain an improved estimator for the Mallows kernel. The corresponding antithetic kernel estimator has lower variance and we demonstrate empirically that it performs better in a variety of machine learning tasks. Both kernel estimators are based on extending kernel mean embeddings to the embedding of a set of full rankings consistent with an observed partial ranking. They form a computationally tractable alternative to previous approaches for partial rankings data. An overview of the existing kernels and metrics for permutations is also provided.

arXiv.org e-Print Archive

UCL Discovery

Apollo (Cambridge)

CUED - Cambridge University Engineering Department

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages

Author: Eslami S. M. Ali
Gretton Arthur
Heess Nicolas
Jitkrittum Wittawat
Lakshminarayanan Balaji
Sejdinovic Dino
Szabó Zoltán
Publication date: 01/01/2015

We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output. This learned operator replaces the multivariate integral required in classical EP, which may not have an analytic expression. We use kernel-based regression, which is trained on a set of probability distributions representing the incoming messages, and the associated outgoing messages. The kernel approach has two main advantages: first, it is fast, as it is implemented using a novel two-layer random feature representation of the input message distributions; second, it has principled uncertainty estimates, and can be cheaply updated online, meaning it can request and incorporate new training data when it encounters inputs on which it is uncertain. In experiments, our approach is able to solve learning problems where a single message operator is required for multiple, substantially different data sets (logistic regression for a variety of classification problems), where it is essential to accurately assess uncertainty and to efficiently and robustly update the message operator. Comment: accepted to UAI 2015. Correct typos. Add more content to the appendix. Main results unchanged.

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

Oxford University Research Archive

Maximum Mean Discrepancy Gradient Flow

Author: Arbel M
Gretton A
Korba A
Salim A
Publication venue: NeurIPS
Publication date: 03/12/2019

We construct a Wasserstein gradient flow of the maximum mean discrepancy (MMD) and study its convergence properties. The MMD is an integral probability metric defined for a reproducing kernel Hilbert space (RKHS), and serves as a metric on probability measures for a sufficiently rich RKHS. We obtain conditions for convergence of the gradient flow towards a global optimum, which can be related to particle transport when optimizing neural networks. We also propose a way to regularize this MMD flow, based on an injection of noise in the gradient. This algorithmic fix comes with theoretical and empirical evidence. The practical implementation of the flow is straightforward, since both the MMD and its gradient have simple closed-form expressions, which can be easily estimated with samples.

arXiv.org e-Print Archive

UCL Discovery

Multivariate Regression with Stiefel Constraints

Author: Bakir G.
Franz M.
Gretton A.
Schölkopf B.
Publication venue: Max Planck Institute for Biological Cybernetics
Publication date: 01/07/2004

We introduce a new framework for regression between multi-dimensional spaces. Standard methods for solving this problem typically reduce the problem to one-dimensional regression by choosing features in the input and/or output spaces. These methods, which include PLS (partial least squares), KDE (kernel dependency estimation), and PCR (principal component regression), select features based on different a priori judgments as to their relevance. Moreover, the loss function and constraints are chosen not primarily on statistical grounds, but to simplify the resulting optimisation. By contrast, in our approach the feature construction and the regression estimation are performed jointly, directly minimizing a loss function that we specify, subject to a rank constraint. A major advantage of this approach is that the loss is no longer chosen according to the algorithmic requirements, but can be tailored to the characteristics of the task at hand; the features will then be optimal with respect to this objective. Our approach also allows for the possibility of using a regularizer in the optimization. Finally, by processing the observations sequentially, our algorithm is able to work on large-scale problems.

MPG.PuRe

Model-based kernel sum rule: kernel Bayesian inference with probabilistic model

Author: Fukumizu K
Gretton A
Kanagawa M
Nishiyama Y
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/06/2022

Kernel Bayesian inference is a principled approach to nonparametric inference in probabilistic graphical models, where probabilistic relationships between variables are learned from data in a nonparametric manner. Various algorithms for kernel Bayesian inference have been developed by combining kernelized basic probabilistic operations such as the kernel sum rule and kernel Bayes’ rule. However, the current framework is fully nonparametric, and it does not allow a user to flexibly combine nonparametric and model-based inferences. This is inefficient when there are good probabilistic models (or simulation models) available for some parts of a graphical model; this is particularly true in scientific fields where “models” are the central topic of study. Our contribution in this paper is to introduce a novel approach, termed the model-based kernel sum rule (Mb-KSR), to combine a probabilistic model and kernel Bayesian inference. By combining the Mb-KSR with the existing kernelized probabilistic rules, one can develop various algorithms for hybrid (i.e., nonparametric and model-based) inferences. As an illustrative example, we consider Bayesian filtering in a state space model, where typically there exists an accurate probabilistic model for the state transition process. We propose a novel filtering method that combines model-based inference for the state transition process and data-driven, nonparametric inference for the observation generating process. We empirically validate our approach with synthetic and real-data experiments, the latter being the problem of vision-based mobile robot localization in robotics, which illustrates the effectiveness of the proposed hybrid approach.

UCL Discovery

Recovery of non-linear cause-effect relationships from linearly mixed neuroimaging data

Author: Gretton A
Grosse-Wentrup M
Schölkopf B
Weichwald S
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

Causal inference concerns the identification of cause-effect relationships between variables. However, often only linear combinations of variables constitute meaningful causal variables. For example, recovering the signal of a cortical source from electroencephalography requires a well-tuned combination of signals recorded at multiple electrodes. We recently introduced the MERLiN (Mixture Effect Recovery in Linear Networks) algorithm that is able to recover, from an observed linear mixture, a causal variable that is a linear effect of another given variable. Here we relax the assumption of this cause-effect relationship being linear and present an extended algorithm that can pick up non-linear cause-effect relationships. Thus, the main contribution is an algorithm (and ready-to-use code) that has broader applicability and allows for a richer model class. Furthermore, a comparative analysis indicates that the assumption of linear cause-effect relationships is not restrictive in analysing electroencephalographic data.

arXiv.org e-Print Archive

UCL Discovery

MPG.PuRe