34,068 research outputs found
Regression with Sparse Approximations of Data
Publication in the conference proceedings of EUSIPCO, Bucharest, Romania, 201
Polynomial Response Surface Approximations for the Multidisciplinary Design Optimization of a High Speed Civil Transport
Surrogate functions have become an important tool in multidisciplinary design optimization to deal with noisy functions, high computational cost, and the practical difficulty of integrating legacy disciplinary computer codes. A combination of mathematical, statistical, and engineering techniques, well known in other contexts, have made polynomial surrogate functions viable for MDO. Despite the obvious limitations imposed by sparse high fidelity data in high dimensions and the locality of low order polynomial approximations, the success of the panoply of techniques based on polynomial response surface approximations for MDO shows that the implementation details are more important than the underlying approximation method (polynomial, spline, DACE, kernel regression, etc.). This paper surveys some of the ancillary techniques—statistics, global search, parallel computing, variable complexity modeling—that augment the construction and use of polynomial surrogates
Understanding and Comparing Scalable Gaussian Process Regression for Big Data
As a non-parametric Bayesian model which produces informative predictive
distribution, Gaussian process (GP) has been widely used in various fields,
like regression, classification and optimization. The cubic complexity of
standard GP however leads to poor scalability, which poses challenges in the
era of big data. Hence, various scalable GPs have been developed in the
literature in order to improve the scalability while retaining desirable
prediction accuracy. This paper devotes to investigating the methodological
characteristics and performance of representative global and local scalable GPs
including sparse approximations and local aggregations from four main
perspectives: scalability, capability, controllability and robustness. The
numerical experiments on two toy examples and five real-world datasets with up
to 250K points offer the following findings. In terms of scalability, most of
the scalable GPs own a time complexity that is linear to the training size. In
terms of capability, the sparse approximations capture the long-term spatial
correlations, the local aggregations capture the local patterns but suffer from
over-fitting in some scenarios. In terms of controllability, we could improve
the performance of sparse approximations by simply increasing the inducing
size. But this is not the case for local aggregations. In terms of robustness,
local aggregations are robust to various initializations of hyperparameters due
to the local attention mechanism. Finally, we highlight that the proper hybrid
of global and local scalable GPs may be a promising way to improve both the
model capability and scalability for big data.Comment: 25 pages, 15 figures, preprint submitted to KB
Variational Bayesian multinomial probit regression with Gaussian process priors
It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favour of Gaussian Process (GP) priors over functions, and employing variational approximations to the full posterior we obtain efficient computational methods for Gaussian Process classification in the multi-class setting. The model augmentation with additional latent variables ensures full a posteriori class coupling whilst retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multi-class Informative Vector Machines (IVM), emerge in a very natural and straightforward manner. This is the first time that a fully Variational Bayesian treatment for multi-class GP classification has been developed without having to resort to additional explicit approximations to the non-Gaussian likelihood term. Empirical comparisons with exact analysis via MCMC and Laplace approximations illustrate the utility of the variational approximation as a computationally economic alternative to full MCMC and it is shown to be more accurate than the Laplace approximation
Distributed Gaussian Processes
To scale Gaussian processes (GPs) to large data sets we introduce the robust
Bayesian Committee Machine (rBCM), a practical and scalable product-of-experts
model for large-scale distributed GP regression. Unlike state-of-the-art sparse
GP approximations, the rBCM is conceptually simple and does not rely on
inducing or variational parameters. The key idea is to recursively distribute
computations to independent computational units and, subsequently, recombine
them to form an overall result. Efficient closed-form inference allows for
straightforward parallelisation and distributed computations with a small
memory footprint. The rBCM is independent of the computational graph and can be
used on heterogeneous computing infrastructures, ranging from laptops to
clusters. With sufficient computing resources our distributed GP model can
handle arbitrarily large data sets.Comment: 10 pages, 5 figures. Appears in Proceedings of ICML 201
Sparse Regression with Multi-type Regularized Feature Modeling
Within the statistical and machine learning literature, regularization
techniques are often used to construct sparse (predictive) models. Most
regularization strategies only work for data where all predictors are treated
identically, such as Lasso regression for (continuous) predictors treated as
linear effects. However, many predictive problems involve different types of
predictors and require a tailored regularization term. We propose a multi-type
Lasso penalty that acts on the objective function as a sum of subpenalties, one
for each type of predictor. As such, we allow for predictor selection and level
fusion within a predictor in a data-driven way, simultaneous with the parameter
estimation process. We develop a new estimation strategy for convex predictive
models with this multi-type penalty. Using the theory of proximal operators,
our estimation procedure is computationally efficient, partitioning the overall
optimization problem into easier to solve subproblems, specific for each
predictor type and its associated penalty. Earlier research applies
approximations to non-differentiable penalties to solve the optimization
problem. The proposed SMuRF algorithm removes the need for approximations and
achieves a higher accuracy and computational efficiency. This is demonstrated
with an extensive simulation study and the analysis of a case-study on
insurance pricing analytics
- …