
    Building nonparametric $n$-body force fields using Gaussian process regression

    Constructing a classical potential suited to simulate a given atomic system is a remarkably difficult task. This chapter presents a framework under which this problem can be tackled, based on the Bayesian construction of nonparametric force fields of a given order using Gaussian process (GP) priors. The formalism of GP regression is first reviewed, particularly in relation to its application in learning local atomic energies and forces. For accurate regression it is fundamental to incorporate prior knowledge into the GP kernel function. To this end, this chapter details how properties of smoothness, invariance and interaction order of a force field can be encoded into corresponding kernel properties. A range of kernels is then proposed, possessing all the required properties and an adjustable parameter $n$ governing the interaction order modelled. The order $n$ best suited to describe a given system can be found automatically within the Bayesian framework by maximisation of the marginal likelihood. The procedure is first tested on a toy model of known interaction and later applied to two real materials described at the DFT level of accuracy. The models automatically selected for the two materials were found to be in agreement with physical intuition. More generally, it was found that lower-order (simpler) models should be chosen when the data are not sufficient to resolve more complex interactions. Low-$n$ GPs can be further sped up by orders of magnitude by constructing the corresponding tabulated force field, here named "MFF". Comment: 31 pages, 11 figures, book chapter.

    Replica theory for learning curves for Gaussian processes on random graphs

    Statistical physics approaches can be used to derive accurate predictions for the performance of inference methods learning from potentially noisy data, as quantified by the learning curve defined as the average error versus number of training examples. We analyse a challenging problem in the area of non-parametric inference where an effectively infinite number of parameters has to be learned, specifically Gaussian process regression. When the inputs are vertices on a random graph and the outputs noisy function values, we show that replica techniques can be used to obtain exact performance predictions in the limit of large graphs. The covariance of the Gaussian process prior is defined by a random walk kernel, the discrete analogue of squared exponential kernels on continuous spaces. Conventionally this kernel is normalised only globally, so that the prior variance can differ between vertices; as a more principled alternative we consider local normalisation, where the prior variance is uniform across all vertices.
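    The global-versus-local normalisation contrast can be sketched concretely. The code below uses the standard random-walk kernel form $((1 - 1/a)I + (1/a)D^{-1/2}AD^{-1/2})^p$ on a small non-regular graph; the parameter values and the 8-vertex graph are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def random_walk_kernel(A, a=2.0, p=10):
    """Random-walk kernel ((1 - 1/a) I + (1/a) D^{-1/2} A D^{-1/2})^p."""
    d = A.sum(axis=1)
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = Dinv_sqrt @ A @ Dinv_sqrt
    n = A.shape[0]
    return np.linalg.matrix_power((1 - 1/a) * np.eye(n) + (1/a) * S, p)

def normalise(K, local=True):
    if local:
        # Local normalisation: unit prior variance at *every* vertex.
        s = np.sqrt(np.diag(K))
        return K / np.outer(s, s)
    # Global normalisation: unit prior variance only on average.
    return K / np.mean(np.diag(K))

# 8-vertex cycle plus one chord, so vertex degrees (and hence globally
# normalised prior variances) differ across vertices.
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
A[0, 4] = A[4, 0] = 1
K = random_walk_kernel(A)
K_loc = normalise(K, local=True)
```

    With global normalisation the diagonal of the kernel, i.e. the prior variance, still varies from vertex to vertex; local normalisation forces it to one everywhere.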

    Using Program Synthesis for Program Analysis

    In this paper, we identify a fragment of second-order logic with restricted quantification that is expressive enough to capture numerous static analysis problems (e.g. safety proving, bug finding, termination and non-termination proving, superoptimisation). We call this fragment the {\it synthesis fragment}. Satisfiability of a formula in the synthesis fragment is decidable over finite domains; specifically the decision problem is NEXPTIME-complete. If a formula in this fragment is satisfiable, a solution consists of a satisfying assignment from the second order variables to \emph{functions over finite domains}. To concretely find these solutions, we synthesise \emph{programs} that compute the functions. Our program synthesis algorithm is complete for finite state programs, i.e. every \emph{function} over finite domains is computed by some \emph{program} that we can synthesise. We can therefore use our synthesiser as a decision procedure for the synthesis fragment of second-order logic, which in turn allows us to use it as a powerful backend for many program analysis tasks. To show the tractability of our approach, we evaluate the program synthesiser on several static analysis problems. Comment: 19 pages, to appear in LPAR 2015. arXiv admin note: text overlap with arXiv:1409.492
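    The key decidability observation, that over a finite domain every second-order (function) variable ranges over finitely many tables, can be sketched with a brute-force stand-in for the paper's synthesiser. The example constraint (a fixed-point-free involution) and the enumeration strategy are illustrative assumptions; the paper's actual algorithm synthesises programs, not raw tables.

```python
from itertools import product

def synthesise(domain, spec):
    """Decide 'exists f . forall x in domain . spec(x, f)' by enumerating
    every function f: domain -> domain as a lookup table. Complete over
    finite domains, since every such function is some table."""
    domain = list(domain)
    for table in product(domain, repeat=len(domain)):
        f = lambda x, t=table: t[x]
        if all(spec(x, f) for x in domain):
            return table       # witness for the second-order variable
    return None                # the formula is unsatisfiable

# Second-order constraint: exists f . forall x . f(f(x)) = x and f(x) != x
table = synthesise(range(4), lambda x, f: f(f(x)) == x and f(x) != x)
```

    Real synthesisers replace the exponential enumeration with guided search, but the returned table plays the same role: a concrete satisfying assignment for the second-order variable.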

    Quantum machine learning: a classical perspective

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets are motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed-up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed. Comment: v3, 33 pages; typos corrected and references added.

    Is SGD a Bayesian sampler? Well, almost

    Overparameterised deep neural networks (DNNs) are highly expressive and so can, in principle, generate almost any function that fits a training dataset with zero error. The vast majority of these functions will perform poorly on unseen data, and yet in practice DNNs often generalise remarkably well. This success suggests that a trained DNN must have a strong inductive bias towards functions with low generalisation error. Here we empirically investigate this inductive bias by calculating, for a range of architectures and datasets, the probability $P_{SGD}(f\mid S)$ that an overparameterised DNN, trained with stochastic gradient descent (SGD) or one of its variants, converges on a function $f$ consistent with a training set $S$. We also use Gaussian processes to estimate the Bayesian posterior probability $P_B(f\mid S)$ that the DNN expresses $f$ upon random sampling of its parameters, conditioned on $S$. Our main findings are that $P_{SGD}(f\mid S)$ correlates remarkably well with $P_B(f\mid S)$ and that $P_B(f\mid S)$ is strongly biased towards low-error and low-complexity functions. These results imply that strong inductive bias in the parameter-function map (which determines $P_B(f\mid S)$), rather than a special property of SGD, is the primary explanation for why DNNs generalise so well in the overparameterised regime. While our results suggest that the Bayesian posterior $P_B(f\mid S)$ is the first-order determinant of $P_{SGD}(f\mid S)$, there remain second-order differences that are sensitive to hyperparameter tuning. A function-probability picture, based on $P_{SGD}(f\mid S)$ and/or $P_B(f\mid S)$, can shed new light on the way that variations in architecture or hyperparameter settings such as batch size, learning rate, and optimiser choice affect DNN performance.
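    The "random sampling of parameters, conditioned on $S$" estimate of $P_B(f\mid S)$ can be sketched on a toy problem: sample random networks, record each one's truth table as the function it expresses, and keep only samples that fit the training set. The architecture, sizes, and dataset below are illustrative assumptions; the paper estimates $P_B(f\mid S)$ with Gaussian processes on real architectures and datasets.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Four 2-bit inputs; two of them form the "training set" S.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
train_idx = [0, 3]
y_train = np.array([0, 1])

def net(params, x):
    """One-hidden-layer tanh network with thresholded scalar output."""
    W1, b1, w2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return int(h @ w2 + b2 > 0)

def sample_params(width=8):
    return (rng.normal(size=(2, width)), rng.normal(size=width),
            rng.normal(size=width), rng.normal())

counts = Counter()
for _ in range(20000):
    p = sample_params()
    f = tuple(net(p, x) for x in X)            # the function, as a truth table
    if all(f[i] == y for i, y in zip(train_idx, y_train)):
        counts[f] += 1                          # condition on zero error on S

total = sum(counts.values())
posterior = {f: c / total for f, c in counts.most_common()}
```

    The resulting `posterior` is the Monte Carlo estimate of $P_B(f\mid S)$ over the four functions consistent with $S$; on such toy maps the mass typically concentrates on a few simple truth tables, the bias the paper quantifies at scale.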