
    Building nonparametric n-body force fields using Gaussian process regression

    Constructing a classical potential suited to simulating a given atomic system is a remarkably difficult task. This chapter presents a framework under which this problem can be tackled, based on the Bayesian construction of nonparametric force fields of a given order using Gaussian process (GP) priors. The formalism of GP regression is first reviewed, particularly in relation to its application in learning local atomic energies and forces. For accurate regression it is fundamental to incorporate prior knowledge into the GP kernel function. To this end, this chapter details how properties of smoothness, invariance and interaction order of a force field can be encoded into corresponding kernel properties. A range of kernels is then proposed, possessing all the required properties and an adjustable parameter n governing the interaction order modelled. The order n best suited to describe a given system can be found automatically within the Bayesian framework by maximisation of the marginal likelihood. The procedure is first tested on a toy model of known interaction and later applied to two real materials described at the DFT level of accuracy. The models automatically selected for the two materials were found to be in agreement with physical intuition. More generally, it was found that lower order (simpler) models should be chosen when the data are not sufficient to resolve more complex interactions. Low-n GPs can be further sped up by orders of magnitude by constructing the corresponding tabulated force field, here named "MFF".
    Comment: 31 pages, 11 figures, book chapter
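
    The model selection step the abstract describes amounts to fitting GPs of different interaction orders and comparing their log marginal likelihoods. Below is a minimal sketch of that comparison, assuming scikit-learn and a generic RBF kernel as a stand-in for the paper's n-body covariant kernels (which additionally encode the invariances discussed above); the one-dimensional "force" data are invented for illustration.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(0)
        X = rng.uniform(-3.0, 3.0, size=(40, 1))                     # toy configurations
        y = np.sin(2.0 * X[:, 0]) + 0.05 * rng.standard_normal(40)   # toy forces

        # Two fixed kernels stand in for candidate interaction orders n;
        # hyperparameters are held fixed so the evidence compares the models.
        candidates = {
            "low order (smooth)": RBF(length_scale=2.0, length_scale_bounds="fixed"),
            "high order (flexible)": RBF(length_scale=0.3, length_scale_bounds="fixed"),
        }
        for name, kernel in candidates.items():
            gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-4,
                                          normalize_y=True).fit(X, y)
            print(f"{name}: log marginal likelihood = "
                  f"{gp.log_marginal_likelihood_value_:.2f}")

    In the Bayesian framework of the chapter, the candidate with the highest evidence is selected; when data are scarce, the smoother (lower order) model typically wins, matching the observation in the abstract.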

    Bayesian Approximate Kernel Regression with Variable Selection

    Nonlinear kernel regression models are often used in statistics and machine learning because they are more accurate than linear models. Variable selection for kernel regression models is a challenge partly because, unlike the linear regression setting, there is no clear concept of an effect size for regression coefficients. In this paper, we propose a novel framework that provides an effect size analog of each explanatory variable for Bayesian kernel regression models when the kernel is shift-invariant (for example, the Gaussian kernel). We use function analytic properties of shift-invariant reproducing kernel Hilbert spaces (RKHS) to define a linear vector space that: (i) captures nonlinear structure, and (ii) can be projected onto the original explanatory variables. The projection onto the original explanatory variables serves as an analog of effect sizes. The specific function analytic property we use is that shift-invariant kernel functions can be approximated via random Fourier bases. Based on the random Fourier expansion, we propose a computationally efficient class of Bayesian approximate kernel regression (BAKR) models for both nonlinear regression and binary classification for which one can compute an analog of effect sizes. We illustrate the utility of BAKR by examining two important problems in statistical genetics: genomic selection (i.e. phenotypic prediction) and association mapping (i.e. inference of significant variants or loci). State-of-the-art methods for genomic selection and association mapping are based on kernel regression and linear models, respectively. BAKR is the first method that is competitive in both settings.
    Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations presented; references added
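
    The core computation behind BAKR can be sketched in a few lines, assuming NumPy and scikit-learn: approximate a Gaussian (shift-invariant) kernel with random Fourier features, fit a Bayesian linear model in the feature space, and project the fitted nonlinear function back onto the original covariates to obtain effect-size analogs. The simulated data and the use of BayesianRidge are illustrative stand-ins, not the authors' implementation.

        import numpy as np
        from sklearn.linear_model import BayesianRidge

        rng = np.random.default_rng(1)
        n, p, d = 200, 10, 300                   # samples, covariates, Fourier features
        X = rng.standard_normal((n, p))
        y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)

        # Random Fourier features approximating a Gaussian kernel of bandwidth sigma.
        sigma = 1.0
        W = rng.standard_normal((p, d)) / sigma
        b = rng.uniform(0.0, 2.0 * np.pi, d)
        Z = np.sqrt(2.0 / d) * np.cos(X @ W + b)

        model = BayesianRidge().fit(Z, y)        # Bayesian regression in feature space
        f_hat = Z @ model.coef_                  # fitted nonlinear function values

        # Project the fitted function onto the span of the original covariates;
        # the resulting coefficients serve as effect-size analogs per variable.
        beta = np.linalg.pinv(X) @ f_hat
        print("effect-size analogs:", np.round(beta, 3))

    In this toy example the projected coefficients for the first two covariates dominate, mirroring how BAKR recovers which explanatory variables drive the nonlinear signal.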

    Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels

    This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels or generator functions indexed by continuous parameters that control local and global features such as their translation, dilation, modulation and shape. Lévy random fields and their stochastic integrals are employed to induce prior distributions for the unknown functions or, equivalently, for the number of kernels and for the parameters governing their features. Scaling, shape, and other features of the generating functions are location-specific to allow quite different function properties in different parts of the space, as with wavelet bases and other methods employing overcomplete dictionaries. We provide conditions under which the stochastic expansions converge in specified Besov or Sobolev norms. Under a Gaussian error model, this may be viewed as a sparse regression problem, with regularization induced via the Lévy random field prior distribution. Posterior inference for the unknown functions is based on a reversible jump Markov chain Monte Carlo algorithm. We compare the Lévy Adaptive Regression Kernel (LARK) method to wavelet-based methods using some of the standard test functions, and illustrate its flexibility and adaptability in nonstationary applications.
    Comment: Published at http://dx.doi.org/10.1214/11-AOS889 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
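
    To make the prior concrete, the sketch below draws one random function from a LARK-style expansion, approximated here as a compound Poisson sum of Gaussian generator kernels with random translation, dilation and weight. The hyperparameters are invented for illustration, and the paper's Lévy random field construction and reversible jump MCMC posterior sampler are not reproduced.

        import numpy as np

        rng = np.random.default_rng(2)
        grid = np.linspace(0.0, 1.0, 500)

        J = rng.poisson(15)                # random number of kernels in the sum
        chi = rng.uniform(0.0, 1.0, J)     # translations (kernel locations)
        lam = rng.gamma(2.0, 0.02, J)      # dilations (kernel widths)
        beta = rng.normal(0.0, 1.0, J)     # signed weights

        # f(x) = sum_j beta_j * g((x - chi_j) / lambda_j), g a Gaussian bump.
        f = np.zeros_like(grid)
        for b_j, c_j, l_j in zip(beta, chi, lam):
            f += b_j * np.exp(-0.5 * ((grid - c_j) / l_j) ** 2)
        print("prior draw: J =", J, " max|f| =", float(np.abs(f).max()))

    Because the widths and weights are drawn independently at each location, a single draw can be smooth in one region and sharply peaked in another, which is the nonstationary adaptability the abstract highlights.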