Bayesian Semi-supervised Learning with Graph Gaussian Processes
We propose a data-efficient Gaussian process-based Bayesian approach to the
semi-supervised learning problem on graphs. The proposed model shows extremely
competitive performance when compared to the state-of-the-art graph neural
networks on semi-supervised learning benchmark experiments, and outperforms the
neural networks in active learning experiments where labels are scarce.
Furthermore, the model does not require a validation data set for early
stopping to control over-fitting. Our model can be viewed as an instance of
empirical distribution regression weighted locally by network connectivity. We
further motivate the intuitive construction of the model with a Bayesian linear
model interpretation where the node features are filtered by an operator
related to the graph Laplacian. The method can be easily implemented by
adapting off-the-shelf scalable variational inference algorithms for Gaussian
processes.
Comment: To appear in NIPS 2018. Fixed an error in Figure 2. The previous arXiv
version contains two identical sub-figures.
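The "filtering" of node features by a graph operator mentioned in the abstract above can be illustrated with a small sketch. It assumes, as one plausible reading rather than the authors' exact construction, that each node's latent value is the average of a base function over the node and its neighbours, so the averaging operator is P = (I + D)^{-1}(I + A) and the resulting kernel is P K_base P^T. The RBF base kernel, the function names, and the toy graph are all illustrative assumptions.

```python
import numpy as np

def neighbourhood_averaging_operator(adjacency):
    """Assumed operator P = (I + D)^{-1} (I + A): average over a node and its neighbours."""
    n = adjacency.shape[0]
    degree = adjacency.sum(axis=1)
    return (np.eye(n) + adjacency) / (1.0 + degree)[:, None]

def rbf_kernel(features, lengthscale=1.0):
    """Placeholder base kernel on node features (an assumption, not taken from the abstract)."""
    sq = np.sum(features ** 2, axis=1)
    sq_dists = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale ** 2)

def graph_kernel(features, adjacency, lengthscale=1.0):
    """Covariance of the neighbourhood-averaged function: P K_base P^T."""
    P = neighbourhood_averaging_operator(adjacency)
    return P @ rbf_kernel(features, lengthscale) @ P.T

# Toy path graph 0 - 1 - 2 - 3 with one scalar feature per node (hypothetical data).
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
features = np.array([[0.0], [0.5], [1.0], [3.0]])
K = graph_kernel(features, adjacency)
```

Because P is a fixed linear map, the averaged function is still a Gaussian process, which is consistent with the abstract's remark that standard variational inference for GPs can be reused.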
Inference for determinantal point processes without spectral knowledge
Determinantal point processes (DPPs) are point process models that naturally
encode diversity between the points of a given realization, through a positive
definite kernel $K$. DPPs possess desirable properties, such as exact sampling
or analyticity of the moments, but learning the parameters of kernel $K$
through likelihood-based inference is not straightforward. First, the kernel
that appears in the likelihood is not $K$, but another kernel $L$ related to
$K$ through an often intractable spectral decomposition. This issue is
typically bypassed in machine learning by directly parametrizing the kernel
$L$, at the price of some interpretability of the model parameters. We follow
this approach here. Second, the likelihood has an intractable normalizing
constant, which takes the form of a large determinant in the case of a DPP over
a finite set of objects, and the form of a Fredholm determinant in the case of
a DPP over a continuous domain. Our main contribution is to derive bounds on
the likelihood of a DPP, both for finite and continuous domains. Unlike
previous work, our bounds are cheap to evaluate since they do not rely on
approximating the spectrum of a large matrix or an operator. Through usual
arguments, these bounds thus yield cheap variational inference and moderately
expensive exact Markov chain Monte Carlo inference methods for DPPs.
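For the finite case described above, the exact likelihood and its normalizing determinant can be written down directly. The sketch below writes K for the marginal kernel and L for the likelihood kernel, and uses the standard finite-set identities log P(A) = log det(L_A) - log det(L + I) and K = L(L + I)^{-1}. It computes the exact quantity that becomes expensive for large sets, not the bounds proposed in the paper; the RBF kernel and function names are illustrative assumptions.

```python
import itertools
import numpy as np

def rbf_kernel(x, lengthscale=1.0):
    """Illustrative positive definite likelihood kernel L on 1-D items (an assumption)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def dpp_log_likelihood(L, subset):
    """Exact log-probability of `subset` under a finite DPP with likelihood kernel L."""
    idx = list(subset)
    # Normalizing constant: the determinant that is costly when the ground set is large.
    _, logdet_norm = np.linalg.slogdet(L + np.eye(L.shape[0]))
    if not idx:
        return -logdet_norm  # the determinant of an empty matrix is 1
    _, logdet_sub = np.linalg.slogdet(L[np.ix_(idx, idx)])
    return logdet_sub - logdet_norm

def marginal_kernel(L):
    """Marginal kernel K = L (L + I)^{-1}; K[i, i] is the inclusion probability of item i."""
    return np.linalg.solve(L + np.eye(L.shape[0]), L)

def all_subsets(n):
    """Every subset of {0, ..., n-1}, as tuples of indices."""
    return itertools.chain.from_iterable(
        itertools.combinations(range(n), r) for r in range(n + 1))
```

Enumerating all subsets is only feasible for a handful of items, but it gives a simple check that the probabilities sum to one and that the diagonal of K matches the inclusion probabilities.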
Actually Sparse Variational Gaussian Processes
Gaussian processes (GPs) are typically criticised for their unfavourable
scaling in both computational and memory requirements. For large datasets,
sparse GPs reduce these demands by conditioning on a small set of inducing
variables designed to summarise the data. In practice, however, for large
datasets requiring many inducing variables, such as low-lengthscale spatial
data, even sparse GPs can become computationally expensive, limited by the
number of inducing variables one can use. In this work, we propose a new class
of inter-domain variational GP, constructed by projecting a GP onto a set of
compactly supported B-spline basis functions. The key benefit of our approach
is that the compact support of the B-spline basis functions admits the use of
sparse linear algebra to significantly speed up matrix operations and
drastically reduce the memory footprint. This allows us to very efficiently
model fast-varying spatial phenomena with tens of thousands of inducing
variables, where previous approaches failed.
Comment: 14 pages, 5 figures, published in AISTATS 202
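The role of compact support can be seen in a small numerical sketch. It assumes degree-1 B-splines (hat functions) on a uniform knot grid and a plain L2 inner product evaluated by trapezoidal quadrature; the paper's inter-domain construction uses its own inner products, so this only illustrates why basis functions with overlapping support on a few neighbours give banded matrices. All names and parameter values are hypothetical.

```python
import numpy as np

def hat_basis(x, centers, width=1.0):
    """Degree-1 B-splines: each is nonzero only on (center - width, center + width)."""
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers[None, :]) / width)

def gram_matrix(num_basis=8, points_per_interval=200):
    """L2 Gram matrix G[i, j] = integral of phi_i(x) * phi_j(x), by trapezoidal quadrature."""
    centers = np.arange(num_basis, dtype=float)
    x = np.linspace(-1.0, float(num_basis), (num_basis + 1) * points_per_interval + 1)
    dx = x[1] - x[0]
    weights = np.full(x.shape, dx)
    weights[0] = weights[-1] = dx / 2.0
    phi = hat_basis(x, centers)
    return (phi * weights[:, None]).T @ phi

def bandwidth(G, tol=1e-12):
    """Largest |i - j| for which G[i, j] is numerically nonzero."""
    rows, cols = np.nonzero(np.abs(G) > tol)
    return int(np.max(np.abs(rows - cols)))

G = gram_matrix()
```

Only basis functions on adjacent knots overlap, so the Gram matrix is tridiagonal here; storage and linear solves for a banded matrix of this kind grow linearly with the number of basis functions rather than quadratically or cubically.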