    Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification

    Gaussian processes are a natural way of defining prior distributions over functions of one or more input variables. In a simple nonparametric regression problem, where such a function gives the mean of a Gaussian distribution for an observed response, a Gaussian process model can easily be implemented using matrix computations that are feasible for datasets of up to about a thousand cases. Hyperparameters that define the covariance function of the Gaussian process can be sampled using Markov chain methods. Regression models where the noise has a t distribution and logistic or probit models for classification applications can be implemented by sampling as well for latent values underlying the observations. Software is now available that implements these methods using covariance functions with hierarchical parameterizations. Models defined in this way can discover high-level properties of the data, such as which inputs are relevant to predicting the response
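
The matrix computations mentioned in this abstract can be illustrated with a minimal sketch of Gaussian process regression. This is not the software the abstract refers to: the squared-exponential covariance, the fixed hyperparameter values, and the function names `sq_exp_kernel` and `gp_predict` are all assumptions made for illustration, and the hyperparameters are held fixed rather than sampled by Markov chain methods.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs (assumed kernel)."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise_var=0.01, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at x_test, with Gaussian noise."""
    # Covariance of the noisy training responses.
    K = sq_exp_kernel(x_train, x_train, **kernel_args)
    K += noise_var * np.eye(len(x_train))
    # Cross-covariance between test and training inputs.
    K_star = sq_exp_kernel(x_test, x_train, **kernel_args)
    # The Cholesky factorization costs O(n^3), which is what limits
    # this direct approach to moderately sized datasets.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    prior_var = np.diag(sq_exp_kernel(x_test, x_test, **kernel_args))
    var = prior_var - np.sum(v ** 2, axis=0)
    return mean, var

# Hypothetical toy data: noise-free samples of sin(x).
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_test = np.array([1.25, 2.5])
mean, var = gp_predict(x_train, y_train, x_test, noise_var=1e-4)
```

A fuller treatment in the spirit of the abstract would place priors on `length_scale`, `signal_var`, and `noise_var` and sample them, rather than fixing them as done here.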

    Bayesian Field Theory: Nonparametric Approaches to Density Estimation, Regression, Classification, and Inverse Quantum Problems

    Bayesian field theory denotes a nonparametric Bayesian approach for learning functions from observational data. Based on the principles of Bayesian statistics, a particular Bayesian field theory is defined by combining two models: a likelihood model, providing a probabilistic description of the measurement process, and a prior model, providing the information necessary to generalize from training to non-training data. The particular likelihood models discussed in the paper are those of general density estimation, Gaussian regression, clustering, classification, and models specific for inverse quantum problems. Besides problem-typical hard constraints, like normalization and positivity for probabilities, prior models have to implement all the specific, and often vague, "a priori" knowledge available for a specific task. Nonparametric prior models discussed in the paper are Gaussian processes, mixtures of Gaussian processes, and non-quadratic potentials. Prior models are made flexible by including hyperparameters. In particular, the adaptation of mean functions and covariance operators of Gaussian process components is discussed in detail. Even if constructed using Gaussian process building blocks, Bayesian field theories are typically non-Gaussian and thus have to be solved numerically. With increasing computational resources, the class of non-Gaussian Bayesian field theories of practical interest that are numerically feasible is steadily growing. Models that turn out to be computationally too demanding can serve as a starting point for constructing easier-to-solve parametric approaches, using for example variational techniques. Comment: 200 pages, 99 figures, LaTeX; revised versio

    Performance assessment of a wind turbine using SCADA based Gaussian Process model

    Loss of wind turbine power production identified through performance assessment is a useful tool for effective condition monitoring of a wind turbine. Power curves describe the nonlinear relationship between power generation and hub height wind speed and play a significant role in analyzing the performance of a turbine. Performance assessment using nonparametric models is gaining popularity. A Gaussian Process is a nonlinear, nonparametric probabilistic approach widely used for fitting models and forecasting applications due to its flexibility and mathematical simplicity. Its applications extend to both classification and regression problems. Despite promising results, Gaussian Process application in wind turbine condition monitoring is limited. In this paper, a model based on a Gaussian Process is constructed for assessing the performance of a turbine. Here, a reference power curve is developed with a Gaussian Process from the SCADA datasets of a healthy turbine and is then compared with a power curve from an unhealthy turbine. Error due to yaw misalignment is a common issue with wind turbines that causes underperformance; hence it is used as a case study to test and validate the effectiveness of the algorithm

    Identifiable and interpretable nonparametric factor analysis

    Factor models have been widely used to summarize the variability of high-dimensional data through a set of factors with much lower dimensionality. Gaussian linear factor models have been particularly popular due to their interpretability and ease of computation. However, in practice, data often violate the multivariate Gaussian assumption. To characterize higher-order dependence and nonlinearity, models that include factors as predictors in flexible multivariate regression are popular, with GP-LVMs using Gaussian process (GP) priors for the regression function and VAEs using deep neural networks. Unfortunately, such approaches lack identifiability and interpretability and tend to produce brittle and non-reproducible results. To address these problems by simplifying the nonparametric factor model while maintaining flexibility, we propose the NIFTY framework, which parsimoniously transforms uniform latent variables using one-dimensional nonlinear mappings and then applies a linear generative model. The induced multivariate distribution falls into a flexible class while maintaining simple computation and interpretation. We prove that this model is identifiable and empirically study NIFTY using simulated data, observing good performance in density estimation and data visualization. We then apply NIFTY to bird song data in an environmental monitoring application. Comment: 50 pages, 17 figure

    Determining the Mass of Kepler-78b With Nonparametric Gaussian Process Estimation

    Kepler-78b is a transiting planet that is 1.2 times the radius of Earth and orbits a young, active K dwarf every 8 hours. The mass of Kepler-78b has been independently reported by two teams based on radial velocity measurements using the HIRES and HARPS-N spectrographs. Due to the active nature of the host star, a stellar activity model is required to distinguish and isolate the planetary signal in radial velocity data. Whereas previous studies tested parametric stellar activity models, we modeled this system using nonparametric Gaussian process (GP) regression. We produced a GP regression of relevant Kepler photometry. We then use the posterior parameter distribution for our photometric fit as a prior for our simultaneous GP + Keplerian orbit models of the radial velocity datasets. We tested three simple kernel functions for our GP regressions. Based on a Bayesian likelihood analysis, we selected a quasi-periodic kernel model with GP hyperparameters coupled between the two RV datasets, giving a Doppler amplitude of $1.86 \pm 0.25$ m s$^{-1}$ and supporting our belief that the correlated noise we are modeling is astrophysical. The corresponding mass of $1.87^{+0.27}_{-0.26}$ M$_{\oplus}$ is consistent with that measured in previous studies, and more robust due to our nonparametric signal estimation. Based on our mass and the radius measurement from transit photometry, Kepler-78b has a bulk density of $6.0^{+1.9}_{-1.4}$ g cm$^{-3}$. We estimate that Kepler-78b is $32 \pm 26$% iron using a two-component rock-iron model. This is consistent with an Earth-like composition, with uncertainty spanning Moon-like to Mercury-like compositions. Comment: 10 pages, 5 figures, accepted to ApJ 6/16/201
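
As a rough consistency check on the figures quoted in this abstract, the bulk density follows from scaling Earth's mean density by mass over radius cubed. The sketch below assumes a reference Earth density of 5.51 g cm^-3 and uses the rounded radius of 1.2 Earth radii from the abstract; the function name `bulk_density` is hypothetical, and the calculation ignores the quoted uncertainties.

```python
# Mean density of Earth in g/cm^3 (assumed reference value, not from the abstract).
EARTH_DENSITY_G_CM3 = 5.51

def bulk_density(mass_earth, radius_earth):
    """Bulk density in g/cm^3 for a planet given in Earth masses and Earth radii."""
    # Density scales as mass / volume, and volume scales as radius cubed.
    return EARTH_DENSITY_G_CM3 * mass_earth / radius_earth ** 3

# Central values quoted in the abstract: 1.87 Earth masses, 1.2 Earth radii.
density = bulk_density(1.87, 1.2)
print(round(density, 1))  # prints 6.0
```

The result agrees with the quoted central value of 6.0 g cm$^{-3}$; reproducing the asymmetric error bars would require propagating the mass and radius uncertainties, which this sketch does not attempt.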

    SCADA based nonparametric models for condition monitoring of a wind turbine

    High operation and maintenance costs for offshore wind turbines push up the LCOE of offshore wind energy. Unscheduled maintenance due to unanticipated failures is the most prominent driver of the maintenance cost, which reinforces the drive towards condition-based maintenance. SCADA based condition monitoring is a cost-effective approach in which the power curve is used to assess the performance of a wind turbine. Such power curves are useful in identifying abnormal wind turbine behaviour. IEC standard 61400-12-1 outlines the guidelines for power curve modelling based on binning. However, establishing such a power curve takes considerable time and is far too slow to reflect changes in performance to be used directly for condition monitoring. To address this, data-driven, nonparametric models are being used instead. Gaussian Process models and regression trees are commonly used nonlinear, nonparametric models useful in forecasting and prediction applications. In this paper, two nonparametric methods are proposed for power curve modelling. The Gaussian Process is treated as the benchmark model, and a comparative analysis is undertaken using a regression tree model; the advantages and limitations of each model are outlined. The performance of these regression models is validated using readily available SCADA datasets from a healthy wind turbine operating under normal conditions
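
The binning approach to power curve modelling mentioned in this abstract can be sketched as averaging power within wind-speed intervals. This is a simplified illustration, not the procedure of the IEC standard: the 0.5 m/s default bin width, the placement of bin edges, the function name `binned_power_curve`, and the sample values are all assumptions, and the data normalization and filtering steps a real assessment would need are omitted.

```python
import numpy as np

def binned_power_curve(wind_speed, power, bin_width=0.5):
    """Average power within consecutive wind-speed bins of the given width."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    power = np.asarray(power, dtype=float)
    # Assign each observation to the bin [k * bin_width, (k + 1) * bin_width).
    bin_index = np.floor(wind_speed / bin_width).astype(int)
    centers, mean_power = [], []
    for idx in np.unique(bin_index):
        in_bin = bin_index == idx
        centers.append((idx + 0.5) * bin_width)
        mean_power.append(power[in_bin].mean())
    return np.array(centers), np.array(mean_power)

# Hypothetical SCADA-like samples: wind speed in m/s, power in kW.
speeds = np.array([4.1, 4.3, 4.6, 4.9, 5.2])
powers = np.array([100.0, 120.0, 150.0, 170.0, 200.0])
centers, mean_power = binned_power_curve(speeds, powers)
```

Because each bin needs enough observations for a stable average, a curve built this way accumulates slowly, which is consistent with the abstract's point that binning is too slow for direct use in condition monitoring.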