Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification
Gaussian processes are a natural way of defining prior distributions over
functions of one or more input variables. In a simple nonparametric regression
problem, where such a function gives the mean of a Gaussian distribution for an
observed response, a Gaussian process model can easily be implemented using
matrix computations that are feasible for datasets of up to about a thousand
cases. Hyperparameters that define the covariance function of the Gaussian
process can be sampled using Markov chain methods. Regression models where the
noise has a t distribution and logistic or probit models for classification
applications can be implemented by also sampling the latent values
underlying the observations. Software is now available that implements these
methods using covariance functions with hierarchical parameterizations. Models
defined in this way can discover high-level properties of the data, such as
which inputs are relevant to predicting the response.
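The matrix computations mentioned in this abstract can be illustrated with a minimal sketch of GP regression using a Cholesky factorization. The squared-exponential covariance, the function names, the toy data, and the fixed hyperparameter values are all assumptions made for illustration; the abstract describes sampling the hyperparameters with Markov chain methods, which this sketch does not do.

```python
import numpy as np

def sq_exp_kernel(xa, xb, scale=1.0, length=1.0):
    """Squared-exponential covariance between two 1-D input arrays (assumed kernel)."""
    diff = xa[:, None] - xb[None, :]
    return scale**2 * np.exp(-0.5 * (diff / length) ** 2)

def gp_predict(x_train, y_train, x_test, noise_sd=0.1, scale=1.0, length=1.0):
    """Posterior mean and variance of the latent function at x_test."""
    n = len(x_train)
    # Covariance of the noisy training responses.
    K = sq_exp_kernel(x_train, x_train, scale, length) + noise_sd**2 * np.eye(n)
    # The O(n^3) factorization is what limits plain GP regression to modest n.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = sq_exp_kernel(x_train, x_test, scale, length)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = scale**2 - np.sum(v**2, axis=0)  # prior variance minus explained part
    return mean, var

# Toy data (hypothetical): noisy samples of sin(x).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 30)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(30)
x_test = np.array([1.0, 2.5, 4.0])
mean, var = gp_predict(x_train, y_train, x_test)
```

In a fully Bayesian treatment, `scale`, `length`, and `noise_sd` would be sampled rather than held fixed, and `gp_predict` would be averaged over those samples.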
Bayesian Field Theory: Nonparametric Approaches to Density Estimation, Regression, Classification, and Inverse Quantum Problems
Bayesian field theory denotes a nonparametric Bayesian approach for learning
functions from observational data. Based on the principles of Bayesian
statistics, a particular Bayesian field theory is defined by combining two
models: a likelihood model, providing a probabilistic description of the
measurement process, and a prior model, providing the information necessary to
generalize from training to non-training data. The particular likelihood models
discussed in the paper are those of general density estimation, Gaussian
regression, clustering, classification, and models specific for inverse quantum
problems. Besides problem-typical hard constraints, like normalization and
positivity for probabilities, prior models have to implement all the specific,
and often vague, "a priori" knowledge available for a specific task.
Nonparametric prior models discussed in the paper are Gaussian processes,
mixtures of Gaussian processes, and non-quadratic potentials. Prior models are
made flexible by including hyperparameters. In particular, the adaption of mean
functions and covariance operators of Gaussian process components is discussed
in detail. Even if constructed using Gaussian process building blocks, Bayesian
field theories are typically non-Gaussian and have thus to be solved
numerically. With increasing computational resources, the class of
non-Gaussian Bayesian field theories of practical interest which are
numerically feasible is steadily growing. Models which turn out to be
computationally too demanding can serve as a starting point for constructing
easier-to-solve parametric approaches, using for example variational techniques.
Comment: 200 pages, 99 figures, LaTeX; revised version
Performance assessment of a wind turbine using SCADA based Gaussian Process model
Identifying losses in wind turbine power production through performance assessment is a useful tool for effective condition monitoring. Power curves describe the nonlinear relationship between power generation and hub-height wind speed and play a significant role in analyzing turbine performance. Performance assessment using nonparametric models is gaining popularity. A Gaussian Process is a nonlinear, nonparametric probabilistic approach widely used for model fitting and forecasting because of its flexibility and mathematical simplicity. Its applications extend to both classification and regression problems. Despite promising results, the application of Gaussian Processes to wind turbine condition monitoring has been limited. In this paper, a model based on a Gaussian Process is constructed for assessing the performance of a turbine. A reference power curve is developed with a Gaussian Process using SCADA datasets from a healthy turbine and is then compared with a power curve from an unhealthy turbine. Error due to yaw misalignment is a common issue that causes wind turbines to underperform; hence it is used as a case study to test and validate the effectiveness of the algorithm.
Identifiable and interpretable nonparametric factor analysis
Factor models have been widely used to summarize the variability of
high-dimensional data through a set of factors with much lower dimensionality.
Gaussian linear factor models have been particularly popular due to their
interpretability and ease of computation. However, in practice, data often
violate the multivariate Gaussian assumption. To characterize higher-order
dependence and nonlinearity, models that include factors as predictors in
flexible multivariate regression are popular, with GP-LVMs using Gaussian
process (GP) priors for the regression function and VAEs using deep neural
networks. Unfortunately, such approaches lack identifiability and
interpretability and tend to produce brittle and non-reproducible results. To
address these problems by simplifying the nonparametric factor model while
maintaining flexibility, we propose the NIFTY framework, which parsimoniously
transforms uniform latent variables using one-dimensional nonlinear mappings
and then applies a linear generative model. The induced multivariate
distribution falls into a flexible class while maintaining simple computation
and interpretation. We prove that this model is identifiable and empirically
study NIFTY using simulated data, observing good performance in density
estimation and data visualization. We then apply NIFTY to bird song data in an
environmental monitoring application.
Comment: 50 pages, 17 figures
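The generative structure described in this abstract (uniform latent variables, one-dimensional nonlinear mappings, then a linear model) can be sketched as follows. The specific mappings, the loading matrix, the dimensions, and the noise level are hypothetical choices for illustration only and are not taken from the NIFTY paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, p = 500, 2, 5  # samples, latent factors, observed dimensions (hypothetical)

# Step 1: uniform latent variables, one column per factor.
u = rng.uniform(size=(n, k))

# Step 2: one-dimensional nonlinear mapping applied to each factor separately.
# These two functions are placeholders, not the mappings learned by NIFTY.
maps = [lambda v: np.sin(2.0 * np.pi * v), lambda v: (v - 0.5) ** 3]
eta = np.column_stack([g(u[:, j]) for j, g in enumerate(maps)])

# Step 3: linear generative model with additive Gaussian noise.
loadings = rng.normal(size=(p, k))
noise_sd = 0.1
x = eta @ loadings.T + noise_sd * rng.normal(size=(n, p))
```

Because each mapping acts on a single latent coordinate, the nonlinearity stays one-dimensional while the observed vector `x` is still a linear combination of the transformed factors.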
Determining the Mass of Kepler-78b With Nonparametric Gaussian Process Estimation
Kepler-78b is a transiting planet that is 1.2 times the radius of Earth and
orbits a young, active K dwarf every 8 hours. The mass of Kepler-78b has been
independently reported by two teams based on radial velocity measurements using
the HIRES and HARPS-N spectrographs. Due to the active nature of the host star,
a stellar activity model is required to distinguish and isolate the planetary
signal in radial velocity data. Whereas previous studies tested parametric
stellar activity models, we modeled this system using nonparametric Gaussian
process (GP) regression. We produced a GP regression of relevant Kepler
photometry. We then used the posterior parameter distribution for our
photometric fit as a prior for our simultaneous GP + Keplerian orbit models of
the radial velocity datasets. We tested three simple kernel functions for our
GP regressions. Based on a Bayesian likelihood analysis, we selected a
quasi-periodic kernel model with GP hyperparameters coupled between the two RV
datasets, giving a Doppler amplitude of $1.86 \pm 0.25$ m s$^{-1}$ and
supporting our belief that the correlated noise we are modeling is
astrophysical. The corresponding mass of 1.87 $M_\oplus$
is consistent with that measured in previous studies, and more robust due to
our nonparametric signal estimation. Based on our mass and the radius
measurement from transit photometry, Kepler-78b has a bulk density of
6.0 g cm$^{-3}$. We estimate that Kepler-78b is $32 \pm 26$% iron
using a two-component rock-iron model. This is consistent with an Earth-like
composition, with uncertainty spanning Moon-like to Mercury-like compositions.
Comment: 10 pages, 5 figures, accepted to ApJ 6/16/201
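A quasi-periodic kernel of the kind selected in this abstract is commonly written as a squared-exponential envelope multiplied by a periodic term. The sketch below uses one common parameterization; the paper's exact form may differ, and the parameter names and numerical values here are hypothetical rather than the fitted hyperparameters.

```python
import numpy as np

def quasi_periodic_kernel(t1, t2, amp, decay, period, smooth):
    """Quasi-periodic covariance: decaying envelope times a periodic component.

    amp    : amplitude of the correlated signal
    decay  : timescale over which the periodic pattern loses coherence
    period : recurrence period (for stellar activity, roughly the rotation period)
    smooth : controls how sharply the signal varies within one period
    """
    dt = t1[:, None] - t2[None, :]
    envelope = -0.5 * (dt / decay) ** 2
    periodic = -0.5 * (np.sin(np.pi * dt / period) / smooth) ** 2
    return amp**2 * np.exp(envelope + periodic)

# Covariance matrix on an illustrative time grid (arbitrary units and values).
t = np.linspace(0.0, 40.0, 50)
K = quasi_periodic_kernel(t, t, amp=2.0, decay=20.0, period=12.0, smooth=0.5)

# Covariance at half a period and at one full period from t = 0.
lags = quasi_periodic_kernel(np.array([0.0]), np.array([6.0, 12.0]),
                             amp=2.0, decay=20.0, period=12.0, smooth=0.5)
```

The covariance at a lag of one full period exceeds the covariance at half a period, which is how the kernel encodes a signal that repeats but slowly changes shape.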
SCADA based nonparametric models for condition monitoring of a wind turbine
High operation and maintenance costs for offshore wind turbines push up the LCOE of offshore wind energy. Unscheduled maintenance due to unanticipated failures is the most prominent driver of the maintenance cost, which reinforces the drive towards condition-based maintenance. SCADA-based condition monitoring is a cost-effective approach in which the power curve is used to assess the performance of a wind turbine. Such power curves are useful in identifying abnormal wind turbine behaviour. IEC standard 61400-12-1 outlines guidelines for power curve modelling based on binning. However, establishing such a power curve takes considerable time and is far too slow at reflecting changes in performance to be used directly for condition monitoring. To address this, data-driven, nonparametric models are being used instead. Gaussian Process models and regression trees are commonly used nonlinear, nonparametric models useful in forecasting and prediction applications. In this paper, two nonparametric methods are proposed for power curve modelling. The Gaussian Process is treated as the benchmark model, and a comparative analysis is undertaken using a regression tree model; the advantages and limitations of each model are outlined. The performance of these regression models is validated using readily available SCADA datasets from a healthy wind turbine operating under normal conditions.
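The binning approach referred to in this abstract can be sketched as averaging power within wind-speed bins. The 0.5 m/s bin width centred on multiples of 0.5 m/s follows the usual method of bins; the function name, the toy turbine (2000 kW rated, cubic ramp between 3 and 12 m/s), and the synthetic data are assumptions for illustration, not SCADA data from the paper.

```python
import numpy as np

def binned_power_curve(wind_speed, power, bin_width=0.5):
    """Return rows of (mean wind speed, mean power, sample count) per bin."""
    # Bins are centred on integer multiples of bin_width.
    idx = np.floor(wind_speed / bin_width + 0.5).astype(int)
    rows = []
    for b in np.unique(idx):
        in_bin = idx == b
        rows.append((wind_speed[in_bin].mean(), power[in_bin].mean(), in_bin.sum()))
    return np.array(rows, dtype=float)

# Synthetic measurements from a hypothetical turbine.
rng = np.random.default_rng(2)
wind_speed = rng.uniform(3.0, 15.0, 2000)
ideal = 2000.0 * np.clip((wind_speed - 3.0) / 9.0, 0.0, 1.0) ** 3
power = ideal + 20.0 * rng.standard_normal(2000)

curve = binned_power_curve(wind_speed, power)
```

Each bin needs enough samples before its mean is trustworthy, which is one reason a binned curve responds slowly to changes in turbine performance compared with a fitted nonparametric model.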
- …