
    Projection predictive model selection for Gaussian processes

    We propose a new method for simplifying Gaussian process (GP) models by projecting the information contained in the full encompassing model and selecting a reduced number of variables based on their predictive relevance. Our results on synthetic and real-world datasets show that the proposed method improves the assessment of variable relevance compared to automatic relevance determination (ARD) via the length-scale parameters. We expect the method to be useful for improving the explainability of models, reducing future measurement costs, and reducing the computation time for making new predictions. Comment: A few minor changes in text
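
    A minimal sketch of the general idea, not the authors' implementation: fit a full ARD GP as the reference model, then run a forward search in which each candidate submodel is fitted to the reference model's predictions (a crude stand-in for the projection step) and ranked by how closely it reproduces them. The data-generating function, kernel choices, noise levels, and the squared-error discrepancy are illustrative assumptions.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(0)
    n, d = 200, 6
    X = rng.normal(size=(n, d))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)  # only x0 and x1 matter

    # Reference (full) model with ARD: one length-scale per input, fixed noise for simplicity
    full_gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(d)),
                                       alpha=1e-2, normalize_y=True)
    full_gp.fit(X, y)
    mu_ref = full_gp.predict(X)  # reference fit that the submodels try to reproduce
    print("ARD relevance (1 / length-scale):", 1.0 / full_gp.kernel_.length_scale)

    # Forward search: at each step add the variable whose submodel best reproduces mu_ref
    selected, remaining = [], list(range(d))
    for _ in range(d):
        scores = {}
        for j in remaining:
            cols = selected + [j]
            sub_gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(len(cols))),
                                              alpha=1e-6, normalize_y=True)
            sub_gp.fit(X[:, cols], mu_ref)  # "project" the reference fit onto the submodel
            scores[j] = np.mean((sub_gp.predict(X[:, cols]) - mu_ref) ** 2)
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        print(f"selected {selected}, discrepancy to reference fit: {scores[best]:.4f}")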

    Approximate Inference for Nonstationary Heteroscedastic Gaussian process Regression

    This paper presents a novel approach for approximate integration over the uncertainty of noise and signal variances in Gaussian process (GP) regression. Our efficient and straightforward approach can also be applied to integration over input-dependent noise variance (heteroscedasticity) and input-dependent signal variance (nonstationarity) by setting independent GP priors on the noise and signal variances. We use expectation propagation (EP) for inference and compare the results to Markov chain Monte Carlo (MCMC) on two simulated datasets and three empirical examples. The results show that EP produces results comparable to MCMC with less computational burden.
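
    A small generative sketch of the model structure described above: independent GP priors on the latent function and on the log noise standard deviation give input-dependent (heteroscedastic) noise. The EP inference itself is not reproduced here; the kernels and hyperparameters are illustrative assumptions.

    import numpy as np

    def rbf_kernel(x, lengthscale, variance):
        d2 = (x[:, None] - x[None, :]) ** 2
        return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

    rng = np.random.default_rng(1)
    x = np.linspace(-3, 3, 200)
    jitter = 1e-8 * np.eye(x.size)

    # GP prior on the latent signal f(x)
    K_f = rbf_kernel(x, lengthscale=0.8, variance=1.0)
    f = rng.multivariate_normal(np.zeros(x.size), K_f + jitter)

    # Independent GP prior on the log noise standard deviation -> input-dependent noise
    K_g = rbf_kernel(x, lengthscale=1.5, variance=0.5)
    log_sigma = rng.multivariate_normal(np.full(x.size, -1.5), K_g + jitter)

    y = f + np.exp(log_sigma) * rng.normal(size=x.size)  # heteroscedastic observations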

    Efficient estimation and correction of selection-induced bias with order statistics

    Model selection aims to identify a sufficiently well performing model that is possibly simpler than the most complex model among a pool of candidates. However, the decision-making process itself can inadvertently introduce non-negligible bias when the cross-validation estimates of predictive performance are marred by excessive noise. In finite data regimes, cross-validated estimates can encourage the statistician to select one model over another when it is not actually better for future data. While this bias remains negligible when only a few models are compared, as the pool of candidates grows and model selection decisions are compounded (as in forward search), the expected magnitude of selection-induced bias is likely to grow as well. This paper introduces an efficient approach to estimate and correct selection-induced bias based on order statistics. Numerical experiments demonstrate the reliability of our approach in estimating both selection-induced bias and over-fitting along compounded model selection decisions, with specific application to forward search. This work represents a lightweight alternative to more computationally expensive approaches to correcting selection-induced bias, such as nested cross-validation and the bootstrap. Our approach rests on several theoretical assumptions, and we provide a diagnostic to help understand when these may not be valid and when to fall back on safer, albeit more computationally expensive, approaches. The accompanying code facilitates its practical implementation and fosters further exploration in this area. Comment: 20 (+6) pages; 8 (+4) figures
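
    A toy numerical sketch of the phenomenon and of an order-statistics-style correction, under strong simplifying assumptions (independent Gaussian noise on the estimates, equal true performance across all candidates); this is not the paper's estimator, only an illustration of why the maximum of noisy cross-validation estimates is biased upward and how order statistics predict that bias.

    import numpy as np
    from scipy import integrate, stats

    rng = np.random.default_rng(2)
    K = 50          # number of candidate models
    sigma = 1.0     # standard deviation of the noise in each cross-validation estimate
    n_rep = 20_000

    # Every candidate has the same true performance (0); CV only adds zero-mean noise.
    cv_estimates = sigma * rng.normal(size=(n_rep, K))
    picked = cv_estimates.max(axis=1)          # always select the best-looking model
    observed_bias = picked.mean()              # Monte Carlo estimate of selection-induced bias

    # Order statistics: the bias is sigma times the expected maximum of K standard normals,
    # E[Z_(K)] = integral of K * z * phi(z) * Phi(z)^(K-1) dz, so it can be predicted and subtracted.
    e_max = integrate.quad(lambda z: K * z * stats.norm.pdf(z) * stats.norm.cdf(z) ** (K - 1),
                           -np.inf, np.inf)[0]
    predicted_bias = sigma * e_max

    print(f"selection-induced bias (simulated):  {observed_bias:.3f}")
    print(f"order-statistics prediction of bias: {predicted_bias:.3f}")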

    Bayesian leave-one-out cross-validation for large data

    Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of approximate inference techniques and probability-proportional-to-size sampling (PPS) for fast LOO model evaluation on large datasets. We provide both theoretical and empirical results showing good properties for large data. Comment: Accepted to ICML 2019. This version is the submitted paper
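
    A toy sketch of the subsampling idea under simplifying assumptions: suppose cheap approximate pointwise log predictive densities are available for every observation, while exact LOO values are expensive. Draw a small probability-proportional-to-size sample, compute exact LOO only there, and use a Hansen-Hurwitz-style weighted estimator of the total elpd. All arrays below are simulated stand-ins, not the output of a real model.

    import numpy as np

    rng = np.random.default_rng(3)
    n, m = 100_000, 500                                  # dataset size, subsample size

    lpd_exact = -1.0 + 0.3 * rng.normal(size=n)          # "expensive" exact pointwise LOO values
    lpd_approx = lpd_exact + 0.05 * rng.normal(size=n)   # cheap approximation used as the size measure

    # Sampling probabilities proportional to the magnitude of the approximate values
    p = np.abs(lpd_approx)
    p = p / p.sum()

    idx = rng.choice(n, size=m, replace=True, p=p)       # PPS sample (with replacement)
    elpd_hat = np.mean(lpd_exact[idx] / p[idx])          # Hansen-Hurwitz estimator of the total

    print(f"estimated elpd_loo from {m} points: {elpd_hat:.1f}")
    print(f"true elpd_loo over all {n} points:  {lpd_exact.sum():.1f}")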