9 research outputs found
Efficient modeling of latent information in supervised learning using Gaussian processes
Often in machine learning, data are collected under a combination of multiple conditions, e.g., voice recordings of multiple persons, each labeled with an ID. How can we build a model that captures the latent information related to these conditions and generalizes to a new one from only a few data? We present a new model, the Latent Variable Multiple Output Gaussian Process (LVMOGP), that jointly models multiple conditions for regression and generalizes to a new condition from only a few data points at test time. LVMOGP infers the posteriors of Gaussian processes together with a latent space representing the information about the different conditions. We derive an efficient variational inference method for LVMOGP whose computational complexity is as low as that of sparse Gaussian processes. We show that LVMOGP significantly outperforms related Gaussian process methods on various tasks with both synthetic and real data.
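The core modelling idea, combining a kernel over inputs with a kernel over learned latent vectors that represent the conditions, can be illustrated with a minimal sketch. This is not the paper's implementation (which uses variational inference with inducing points); it is only a toy construction of the joint covariance, with all sizes, lengthscales, and the `rbf` helper chosen here for illustration.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 20)[:, None]   # inputs shared across conditions
H = rng.normal(size=(3, 2))              # one 2-D latent vector per condition (learned in LVMOGP)

# Joint covariance over all (condition, input) pairs: the latent-space kernel
# captures similarity between conditions, the input kernel similarity between inputs.
K = np.kron(rbf(H, H), rbf(X, X))        # shape (3 * 20, 3 * 20)
```

A GP with this product-kernel covariance shares statistical strength across conditions: two conditions with nearby latent vectors get strongly correlated functions, which is what lets the model generalize to a new condition from few observations.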
NIPS - Not Even Wrong? A Systematic Review of Empirically Complete Demonstrations of Algorithmic Effectiveness in the Machine Learning and Artificial Intelligence Literature
Objective: To determine the completeness of the argumentative steps necessary to
conclude the effectiveness of an algorithm in a sample of current ML/AI
supervised learning literature.
Data Sources: Papers published in the Neural Information Processing Systems
(NeurIPS, née NIPS) journal where the official record showed a 2017 year of
publication.
Eligibility Criteria: Studies reporting a (semi-)supervised model, or
pre-processing fused with (semi-)supervised models for tabular data.
Study Appraisal: Three reviewers applied the assessment criteria to determine
argumentative completeness. The criteria were split into three groups:
experiments (e.g., real and/or synthetic data), baselines (e.g., uninformed
and/or state-of-the-art) and quantitative comparison (e.g., performance
quantifiers with confidence intervals and formal comparison of the algorithm
against baselines).
Results: Of the 121 eligible manuscripts (from the sample of 679 abstracts),
99% used real-world data and 29% used synthetic data. 91% of manuscripts did
not report an uninformed baseline and 55% reported a state-of-the-art baseline.
32% reported confidence intervals for performance, but none provided references
or exposition for how these were calculated. 3% reported formal comparisons.
Limitations: The use of one venue as the primary information source may not
be representative of all ML/AI literature. However, NeurIPS is recognised as a
top-tier venue for ML/AI studies, so it is reasonable to consider its corpus
representative of high-quality research.
Conclusion: Using the 2017 sample of the NeurIPS supervised learning corpus
as an indicator for the quality and trustworthiness of current ML/AI research,
it appears that complete argumentative chains in demonstrations of algorithmic
effectiveness are rare.
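One of the review's criteria, reporting confidence intervals for performance with an explicit method, is straightforward to satisfy. As a hedged illustration (not taken from the reviewed papers), a percentile bootstrap over per-example correctness gives an interval without distributional assumptions; the function name, sample data, and parameters below are all invented for this sketch.

```python
import numpy as np

def bootstrap_ci(correct, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for accuracy,
    given per-example 0/1 correctness indicators."""
    rng = np.random.default_rng(seed)
    correct = np.asarray(correct, dtype=float)
    # Resample the outcomes with replacement and recompute accuracy each time.
    resampled = rng.choice(correct, size=(n_boot, correct.size), replace=True)
    acc = resampled.mean(axis=1)
    return np.quantile(acc, [alpha / 2, 1 - alpha / 2])

# Toy data: 7 of 10 test examples classified correctly.
lo, hi = bootstrap_ci([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])
```

Stating the method (percentile bootstrap, number of resamples, alpha) alongside the interval is exactly the exposition the review found missing in all surveyed papers.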
A Kronecker product accelerated efficient sparse Gaussian Process (E-SGP) for flow emulation
In this paper, we introduce an efficient sparse Gaussian process (E-SGP) for
the surrogate modelling of fluid mechanics. This novel Bayesian machine
learning algorithm allows efficient model training using databases of different
structures. It is a further development of the approximated sparse GP
algorithm, combining the concept of efficient GP (E-GP) and variational energy
free sparse Gaussian process (VEF-SGP). The developed E-SGP approach exploits
the arbitrariness of inducing points and the monotonically increasing nature of
the objective function with respect to the number of inducing points in
VEF-SGP. By specifying the inducing points on the orthogonal grid/input
subspace and using the Kronecker product, E-SGP significantly improves
computational efficiency without imposing any constraints on the covariance
matrix or increasing the number of parameters that need to be optimised during
training.
The E-SGP algorithm developed in this paper outperforms E-GP not only in
scalability but also in model quality in terms of mean standardized logarithmic
loss (MSLL). The computational complexity of E-GP grows cubically with the
size of the structured training database, whereas E-SGP maintains
computational efficiency as long as the resolution of the model (i.e., the
number of inducing points) remains fixed. The examples show that E-SGP
produces more accurate predictions than E-GP when the model resolutions are
similar. E-GP benefits from more training data but comes with higher
computational demands, while E-SGP achieves a comparable level of accuracy
more efficiently, making E-SGP a potentially preferable choice for
fluid-mechanics problems. Furthermore, E-SGP produces more reasonable
estimates of model uncertainty, whilst E-GP is more likely to produce
over-confident predictions.
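The computational saving from placing inducing points on an orthogonal grid comes from the standard Kronecker matrix-vector identity: a product with A ⊗ B never needs the full Kronecker matrix. The sketch below shows only this generic identity, not the E-SGP algorithm itself; matrix sizes and names are illustrative.

```python
import numpy as np

def kron_matvec(A, B, x):
    """Compute (A ⊗ B) @ x without ever forming the Kronecker product.

    x is the row-major vectorisation of an (A.shape[1], B.shape[1]) matrix,
    so the product reduces to two small matrix multiplications.
    """
    X = x.reshape(A.shape[1], B.shape[1])
    return (A @ X @ B.T).reshape(-1)

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))            # e.g. covariance over one grid axis
B = rng.normal(size=(5, 5))            # e.g. covariance over the other axis
x = rng.normal(size=20)

dense = np.kron(A, B) @ x              # forms the full 20x20 matrix
fast = kron_matvec(A, B, x)            # never materialises it
```

For an m×m and n×n factor, the dense product costs O(m²n²) time and memory, while the factored form costs O(mn(m + n)), which is why fixing the inducing-point grid keeps E-SGP-style training cheap as the database grows.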
Large Scale Multi-Label Learning using Gaussian Processes
We introduce a Gaussian process latent factor model for multi-label classification that can capture correlations among class labels by using a small set of latent Gaussian process functions. To address computational challenges when the number of training instances is very large, we introduce several techniques based on variational sparse Gaussian process approximations and stochastic optimization. Specifically, we apply doubly stochastic variational inference that sub-samples data instances and classes, which allows us to cope with Big Data. Furthermore, we show that it is possible and beneficial to optimize over inducing points, using gradient-based methods, even in very high-dimensional input spaces involving up to hundreds of thousands of dimensions. We demonstrate the usefulness of our approach on several real-world large-scale multi-label learning problems.
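The "doubly stochastic" device, sub-sampling data instances and classes and rescaling so the estimate stays unbiased, can be sketched in isolation. This toy replaces the paper's variational objective with a plain sum of per-(instance, class) loss terms; all sizes and names are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 1_000, 200                      # toy sizes; the paper targets far larger N and C
L = rng.random((N, C))                 # stand-in per-(instance, class) objective terms
full = L.sum()                         # the quantity a full pass would compute

def doubly_stochastic_estimate(b=64, s=32):
    """Unbiased estimate of the full sum from a minibatch of instances
    AND a subsample of classes, each rescaled by its sampling fraction."""
    rows = rng.choice(N, size=b, replace=False)
    cols = rng.choice(C, size=s, replace=False)
    return (N / b) * (C / s) * L[np.ix_(rows, cols)].sum()

# Averaging many estimates recovers the exact sum (unbiasedness).
est = np.mean([doubly_stochastic_estimate() for _ in range(500)])
```

Each step touches only b × s terms instead of N × C, which is what makes stochastic-gradient training feasible when both the number of instances and the number of classes are large.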
Large scale multi-output multi-class classification using Gaussian processes
Multi-output Gaussian processes (MOGPs) can help to improve predictive performance for some output variables by leveraging the correlation with other output variables. In this paper, our main motivation is to use multi-output Gaussian processes to exploit correlations between outputs where each output is a multi-class classification problem. MOGPs have mostly been used for multi-output regression. Some existing works use MOGPs for other types of outputs, e.g., multi-output binary classification; however, MOGPs for multi-class classification have been less studied. The reason is twofold: 1) when using a softmax function, it is not clear how to scale it beyond the case of a few outputs; 2) the most common type of data in multi-class classification problems is image data, and MOGPs are not specifically designed for image data. We thus propose a new MOGP model, Multi-output Gaussian Processes with Augment & Reduce (MOGPs-AR), that can deal with large-scale classification and downsized image input data. Large-scale classification is achieved by subsampling both training data sets and classes in each output, whereas downsized image input data are handled by incorporating a convolutional kernel into the new model. We show empirically that our proposed model outperforms single-output Gaussian processes in terms of different performance metrics and multi-output Gaussian processes in terms of scalability, both in synthetic and in real classification problems. We include an example with the Omniglot dataset where we showcase the properties of our model.
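The convolutional-kernel idea for image inputs, comparing two images through all pairs of their patches rather than as raw pixel vectors, can be sketched generically. This is a naive illustration of the patch-response construction, not the paper's MOGPs-AR kernel; the image sizes, patch size, and lengthscale are all arbitrary.

```python
import numpy as np

def patches(img, p=3):
    """All p-by-p patches of a 2-D image, each flattened to a row vector."""
    H, W = img.shape
    return np.array([img[i:i + p, j:j + p].ravel()
                     for i in range(H - p + 1)
                     for j in range(W - p + 1)])

def conv_kernel(img_a, img_b, p=3, lengthscale=1.0):
    """Convolutional kernel: average RBF similarity over all pairs of patches."""
    Pa, Pb = patches(img_a, p), patches(img_b, p)
    d2 = ((Pa[:, None, :] - Pb[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2).mean()

rng = np.random.default_rng(0)
a, b = rng.random((8, 8)), rng.random((8, 8))
k_ab = conv_kernel(a, b)
```

Because the kernel operates on small patches, it is far less sensitive to image resolution and translation than a kernel on whole pixel vectors, which is why downsized image inputs remain workable.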