Debiased Gaussian Process-based Machine Learning with Partially Observed Information

Abstract

The wide adoption of machine learning and artificial intelligence technologies has increased the demand for reliable models. Because data scarcity is ubiquitous in the real world, model training is often challenging and limited. Various data augmentation techniques can alleviate this problem, but they introduce bias, which forces a trade-off in prediction accuracy. To address this dilemma, we present a Two-Stage Debiased Gaussian Process (TSDGP)-based machine learning model capable of providing robust and accurate predictions across various fields, even with partially observed information. In stage one, given input data with partially observed information, a latent variable model reconstructs the unavailable information, which improves the use of heterogeneous data. In stage two, the model and the uncertainties from the first stage are refined within a Bayesian framework using the augmented dataset. We establish the consistency of the proposed two-stage model and derive its first and second moments, which supports the accuracy and robustness of its results. Building on this theoretical foundation, we further evaluate TSDGP through numerical and empirical experiments, which show the strong performance of the proposed approach. In conclusion, TSDGP can resolve the dilemma caused by real-world data scarcity: it enables a reliable, high-fidelity predictive model to be trained on partially observed datasets without a significant trade-off in accuracy.
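The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' TSDGP implementation: the abstract does not specify the latent variable model, the kernel, or the debiasing step, so everything below is an assumption made for illustration. Stage one is stood in for by conditional-mean imputation under a joint Gaussian fitted to the fully observed rows, and stage two by ordinary Gaussian process regression with a squared-exponential kernel on the augmented inputs. The function names (`impute_gaussian`, `gp_posterior`, `rbf_kernel`), the synthetic data, and all hyperparameters are hypothetical, and the sketch omits the uncertainty refinement that the abstract attributes to stage two.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def impute_gaussian(X):
    # Stage-one stand-in (assumption): fill NaNs with the conditional mean
    # under a joint Gaussian estimated from the fully observed rows.
    X = X.copy()
    complete = ~np.isnan(X).any(axis=1)
    mu = X[complete].mean(axis=0)
    cov = np.cov(X[complete], rowvar=False) + 1e-8 * np.eye(X.shape[1])
    for i in np.where(~complete)[0]:
        m = np.isnan(X[i])   # missing coordinates of row i
        o = ~m               # observed coordinates of row i
        if o.any():
            w = np.linalg.solve(cov[np.ix_(o, o)], X[i, o] - mu[o])
            X[i, m] = mu[m] + cov[np.ix_(m, o)] @ w
        else:
            X[i, m] = mu[m]
    return X

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    # Stage-two stand-in (assumption): standard GP regression posterior
    # mean and variance, computed through a Cholesky factorization.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = rbf_kernel(X_test, X_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Synthetic example: two correlated inputs, the second one unobserved in 15 rows.
rng = np.random.default_rng(0)
x1 = rng.normal(size=60)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=60)
X = np.column_stack([x1, x2])
y = np.sin(x1) + 0.5 * x2

X_obs = X.copy()
X_obs[:15, 1] = np.nan

X_aug = impute_gaussian(X_obs)                 # stage one: reconstruct missing inputs
mean, var = gp_posterior(X_aug, y, X[:5])      # stage two: GP fit on the augmented data
```

Plain imputation followed by a GP fit treats the reconstructed values as if they were observed, which is the kind of bias the abstract says the second stage is meant to correct. A faithful implementation would need to propagate the stage-one uncertainty into stage two.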

ScholarSpace at University of Hawai'i at Manoa

Last time updated on 01/02/2025

Licence: https://creativecommons.org/licenses/by-nc-nd/4.0/