8,838 research outputs found
Building nonparametric -body force fields using Gaussian process regression
Constructing a classical potential suited to simulate a given atomic system
is a remarkably difficult task. This chapter presents a framework under which
this problem can be tackled, based on the Bayesian construction of
nonparametric force fields of a given order using Gaussian process (GP) priors.
The formalism of GP regression is first reviewed, particularly in relation to
its application in learning local atomic energies and forces. For accurate
regression it is fundamental to incorporate prior knowledge into the GP kernel
function. To this end, this chapter details how properties of smoothness,
invariance and interaction order of a force field can be encoded into
corresponding kernel properties. A range of kernels is then proposed,
possessing all the required properties and an adjustable parameter
governing the interaction order modelled. The order best suited to describe
a given system can be found automatically within the Bayesian framework by
maximisation of the marginal likelihood. The procedure is first tested on a toy
model of known interaction and later applied to two real materials described at
the DFT level of accuracy. The models automatically selected for the two
materials were found to be in agreement with physical intuition. More in
general, it was found that lower order (simpler) models should be chosen when
the data are not sufficient to resolve more complex interactions. Low GPs
can be further sped up by orders of magnitude by constructing the corresponding
tabulated force field, here named "MFF".Comment: 31 pages, 11 figures, book chapte
Replica theory for learning curves for Gaussian processes on random graphs
Statistical physics approaches can be used to derive accurate predictions for
the performance of inference methods learning from potentially noisy data, as
quantified by the learning curve defined as the average error versus number of
training examples. We analyse a challenging problem in the area of
non-parametric inference where an effectively infinite number of parameters has
to be learned, specifically Gaussian process regression. When the inputs are
vertices on a random graph and the outputs noisy function values, we show that
replica techniques can be used to obtain exact performance predictions in the
limit of large graphs. The covariance of the Gaussian process prior is defined
by a random walk kernel, the discrete analogue of squared exponential kernels
on continuous spaces. Conventionally this kernel is normalised only globally,
so that the prior variance can differ between vertices; as a more principled
alternative we consider local normalisation, where the prior variance is
uniform
Using Program Synthesis for Program Analysis
In this paper, we identify a fragment of second-order logic with restricted
quantification that is expressive enough to capture numerous static analysis
problems (e.g. safety proving, bug finding, termination and non-termination
proving, superoptimisation). We call this fragment the {\it synthesis
fragment}. Satisfiability of a formula in the synthesis fragment is decidable
over finite domains; specifically the decision problem is NEXPTIME-complete. If
a formula in this fragment is satisfiable, a solution consists of a satisfying
assignment from the second order variables to \emph{functions over finite
domains}. To concretely find these solutions, we synthesise \emph{programs}
that compute the functions. Our program synthesis algorithm is complete for
finite state programs, i.e. every \emph{function} over finite domains is
computed by some \emph{program} that we can synthesise. We can therefore use
our synthesiser as a decision procedure for the synthesis fragment of
second-order logic, which in turn allows us to use it as a powerful backend for
many program analysis tasks. To show the tractability of our approach, we
evaluate the program synthesiser on several static analysis problems.Comment: 19 pages, to appear in LPAR 2015. arXiv admin note: text overlap with
arXiv:1409.492
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed-up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.Comment: v3 33 pages; typos corrected and references adde
Is SGD a Bayesian sampler? Well, almost
Overparameterised deep neural networks (DNNs) are highly expressive and so
can, in principle, generate almost any function that fits a training dataset
with zero error. The vast majority of these functions will perform poorly on
unseen data, and yet in practice DNNs often generalise remarkably well. This
success suggests that a trained DNN must have a strong inductive bias towards
functions with low generalisation error. Here we empirically investigate this
inductive bias by calculating, for a range of architectures and datasets, the
probability that an overparameterised DNN, trained with
stochastic gradient descent (SGD) or one of its variants, converges on a
function consistent with a training set . We also use Gaussian processes
to estimate the Bayesian posterior probability that the DNN
expresses upon random sampling of its parameters, conditioned on .
Our main findings are that correlates remarkably well with
and that is strongly biased towards low-error and
low complexity functions. These results imply that strong inductive bias in the
parameter-function map (which determines ), rather than a special
property of SGD, is the primary explanation for why DNNs generalise so well in
the overparameterised regime.
While our results suggest that the Bayesian posterior is the
first order determinant of , there remain second order
differences that are sensitive to hyperparameter tuning. A function probability
picture, based on and/or , can shed new light
on the way that variations in architecture or hyperparameter settings such as
batch size, learning rate, and optimiser choice, affect DNN performance
- …