Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
The Gaussian process is a standard tool for building emulators for both
deterministic and stochastic computer experiments. However, application of
Gaussian process models is greatly limited in practice, particularly for
large-scale and many-input computer experiments that have become typical. We
propose a multi-resolution functional ANOVA model as a computationally feasible
emulation alternative. More generally, this model can be used for large-scale
and many-input non-linear regression problems. An overlapping group lasso
approach is used for estimation, ensuring computational feasibility in a
large-scale and many-input setting. New results on consistency and inference
for the (potentially overlapping) group lasso in a high-dimensional setting are
developed and applied to the proposed multi-resolution functional ANOVA model.
Importantly, these results allow us to quantify the uncertainty in our
predictions. Numerical examples demonstrate that the proposed model enjoys
marked computational advantages. Data capabilities, both in terms of sample
size and dimension, meet or exceed those of the best available emulation tools
while meeting or exceeding their emulation accuracy.
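As a rough illustration of the group lasso estimation mentioned in the abstract above, here is a minimal sketch in plain Python. It handles the simpler non-overlapping case by proximal gradient descent, which is a swapped-in textbook technique and not the paper's overlapping-group estimator. The function names, the synthetic data, the penalty `lam`, and the step size are all assumptions made for the example.

```python
import math
import random


def group_soft_threshold(v, t):
    """Shrink the vector v toward zero by t in Euclidean norm (block soft-thresholding)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        return [0.0] * len(v)
    scale = 1.0 - t / norm
    return [scale * x for x in v]


def group_lasso(X, y, groups, lam, step, n_iter=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + lam * sum_g ||b_g||_2.

    groups is a list of lists of column indices; the groups are assumed
    NOT to overlap. step must be small enough for the gradient step to be
    stable (roughly at most 1 / largest eigenvalue of X^T X / n).
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth least-squares part.
        resid = [sum(X[i][j] * b[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        z = [b[j] - step * grad[j] for j in range(p)]
        # Proximal step: shrink each group as a block.
        for g in groups:
            shrunk = group_soft_threshold([z[j] for j in g], step * lam)
            for j, val in zip(g, shrunk):
                b[j] = val
    return b


# Synthetic data (hypothetical): only the first group of inputs is active.
random.seed(0)
n = 200
X = [[random.gauss(0, 1) for _ in range(6)] for _ in range(n)]
y = [2.0 * row[0] - 1.0 * row[1] + random.gauss(0, 0.1) for row in X]
groups = [[0, 1], [2, 3], [4, 5]]
b_hat = group_lasso(X, y, groups, lam=0.5, step=0.5)
print([round(v, 3) for v in b_hat])
```

The block-wise shrinkage is what removes whole groups of coefficients at once; in a functional ANOVA setting a group would plausibly collect the basis functions belonging to one effect, though that mapping is not spelled out in the abstract.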
An update on statistical boosting in biomedicine
Statistical boosting algorithms have triggered a lot of research during the
last decade. They combine a powerful machine-learning approach with classical
statistical modelling, offering various practical advantages like automated
variable selection and implicit regularization of effect estimates. They are
extremely flexible, as the underlying base-learners (regression functions
defining the type of effect for the explanatory variables) can be combined with
any kind of loss function (target function to be optimized, defining the type
of regression setting). In this review article, we highlight the most recent
methodological developments on statistical boosting regarding variable
selection, functional regression and advanced time-to-event modelling.
Additionally, we provide a short overview on relevant applications of
statistical boosting in biomedicine.
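The combination of base-learners and a loss function described in the abstract above can be sketched with component-wise gradient boosting under squared-error loss. This is a minimal plain-Python illustration, not the implementation from any particular boosting package. The simple linear base-learners, the step length `nu`, the number of steps, and the synthetic data are assumptions made for the example.

```python
import random


def componentwise_l2_boost(X, y, n_steps=60, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per column.

    At each step every base-learner is fitted to the current residuals,
    and only the best-fitting one is updated, by a fraction nu of its
    least-squares coefficient. Columns never chosen keep a zero
    coefficient, which gives the automated variable selection.
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coef = [0.0] * p
    resid = [yi - intercept for yi in y]
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_ss = [sum(v * v for v in c) for c in cols]
    for _ in range(n_steps):
        best_j, best_beta, best_gain = 0, 0.0, -1.0
        for j in range(p):
            d = sum(c * r for c, r in zip(cols[j], resid))
            gain = d * d / col_ss[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, d / col_ss[j], gain
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * c for r, c in zip(resid, cols[best_j])]
    return intercept, coef


# Synthetic data (hypothetical): only columns 0 and 2 carry signal.
random.seed(1)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[2] + random.gauss(0, 0.1) for row in X]
intercept, coef = componentwise_l2_boost(X, y)
print([round(c, 2) for c in coef])
```

Stopping after a limited number of steps is what produces the implicit regularization: the informative coefficients are still slightly shrunk toward zero, and uninformative columns have had little or no chance to enter the model.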
A new class of wavelet networks for nonlinear system identification
A new class of wavelet networks (WNs) is proposed for nonlinear system identification. In the new networks, the model structure for a high-dimensional system is chosen to be a superimposition of a number of functions with fewer variables. By expanding each function using truncated wavelet decompositions, the multivariate nonlinear networks can be converted into linear-in-the-parameters regressions, which can be solved using least-squares type methods. An efficient model term selection approach based upon a forward orthogonal least squares (OLS) algorithm and the error reduction ratio (ERR) is applied to solve the linear-in-the-parameters problem in the present study. The main advantage of the new WN is that it exploits the attractive features of multiscale wavelet decompositions and the capability of traditional neural networks. By adopting the analysis of variance (ANOVA) expansion, WNs can now handle nonlinear identification problems in high dimensions.
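The term selection step described in the abstract above, forward orthogonal least squares ranked by the error reduction ratio, can be sketched as follows. This is a minimal plain-Python illustration of the generic forward OLS/ERR procedure, not the authors' code. The Mexican hat wavelet, the small one-dimensional dictionary, the stopping tolerance, and the noiseless target are assumptions chosen for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward_ols(candidates, y, max_terms, tol=1e-6):
    """Forward orthogonal least squares with the error reduction ratio.

    candidates is a list of regressor columns. At each step the remaining
    candidates have already been orthogonalized against the selected
    terms; the one explaining the largest share of the output energy
    (its ERR) is chosen. Returns a list of (column index, ERR) pairs.
    """
    yy = dot(y, y)
    work = {j: list(c) for j, c in enumerate(candidates)}
    selected, total_err = [], 0.0
    while work and len(selected) < max_terms and 1.0 - total_err > tol:
        best_j, best_err = None, -1.0
        for j, w in work.items():
            ww = dot(w, w)
            if ww < 1e-12:  # numerically dependent on selected terms
                continue
            err = dot(w, y) ** 2 / (ww * yy)
            if err > best_err:
                best_j, best_err = j, err
        if best_j is None:
            break
        w = work.pop(best_j)
        selected.append((best_j, best_err))
        total_err += best_err
        ww = dot(w, w)
        # Gram-Schmidt: remove the chosen direction from the rest.
        for k in list(work):
            c = dot(w, work[k]) / ww
            work[k] = [vi - c * wi for vi, wi in zip(work[k], w)]
    return selected


def mexican_hat(t):
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


# Hypothetical dictionary: wavelets psi(s * x - k) at two scales.
xs = [-5.0 + 0.05 * i for i in range(201)]
labels = [(s, k) for s in (1, 2) for k in (-4, -2, 0, 2, 4)]
candidates = [[mexican_hat(s * x - k) for x in xs] for s, k in labels]
# Noiseless target built from two dictionary terms.
y = [2.0 * mexican_hat(x + 2.0) - mexican_hat(2.0 * x - 4.0) for x in xs]
selected = forward_ols(candidates, y, max_terms=5)
print([(labels[j], round(err, 4)) for j, err in selected])
```

Because the ERR values of orthogonalized terms add up, the running sum gives a direct measure of how much output energy the selected model explains, which is why it serves as a natural stopping criterion.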
Minimizing Negative Transfer of Knowledge in Multivariate Gaussian Processes: A Scalable and Regularized Approach
Recently there has been an increasing interest in the multivariate Gaussian
process (MGP) which extends the Gaussian process (GP) to deal with multiple
outputs. One approach to construct the MGP and account for non-trivial
commonalities amongst outputs employs a convolution process (CP). The CP is
based on the idea of sharing latent functions across several convolutions.
Despite the elegance of the CP construction, it poses new challenges that
have yet to be tackled. First, even with a moderate number of outputs, model
building is prohibitive due to the huge increase in computational
demands and number of parameters to be estimated. Second, the negative transfer
of knowledge may occur when some outputs do not share commonalities. In this
paper we address these issues. We propose a regularized pairwise modeling
approach for the MGP established using CP. The key feature of our approach is
to distribute the estimation of the full multivariate model into a group of
bivariate GPs which are individually built. Interestingly, pairwise modeling
turns out to possess unique characteristics, which allow us to tackle the
challenge of negative transfer through penalizing the latent function that
facilitates information sharing in each bivariate model. Predictions are then
made through combining predictions from the bivariate models within a Bayesian
framework. The proposed method has excellent scalability when the number of
outputs is large and minimizes the negative transfer of knowledge between
uncorrelated outputs. Statistical guarantees for the proposed method are
studied and its advantageous features are demonstrated through numerical
studies.