16 research outputs found

    Controversy in mechanistic modelling with Gaussian processes

    Parameter inference in mechanistic models based on non-affine differential equations is computationally onerous, and various faster alternatives based on gradient matching have been proposed. A particularly promising approach is based on nonparametric Bayesian modelling with Gaussian processes, which exploits the fact that a Gaussian process is closed under differentiation. However, two alternative paradigms have been proposed. The first paradigm, proposed at NIPS 2008 and AISTATS 2013, is based on a product of experts approach and a marginalization over the derivatives of the state variables. The second paradigm, proposed at ICML 2014, is based on a probabilistic generative model and a marginalization over the state variables. The claim has been made that this leads to better inference results. In the present article, we offer a new interpretation of the second paradigm, which highlights the underlying assumptions, approximations and limitations. In particular, we show that the second paradigm suffers from an intrinsic identifiability problem, which does not affect the first paradigm.

    Identifying Sources and Sinks in the Presence of Multiple Agents with Gaussian Process Vector Calculus

    In systems of multiple agents, identifying the cause of observed agent dynamics is challenging. Often, these agents operate in diverse, non-stationary environments, where models rely on hand-crafted environment-specific features to infer influential regions in the system's surroundings. To overcome the limitations of these inflexible models, we present GP-LAPLACE, a technique for locating sources and sinks from trajectories in time-varying fields. Using Gaussian processes, we jointly infer a spatio-temporal vector field, as well as canonical vector calculus operations on that field. Notably, we do this from only agent trajectories without requiring knowledge of the environment, and also obtain a metric for denoting the significance of inferred causal features in the environment by exploiting our probabilistic method. To evaluate our approach, we apply it to both synthetic and real-world GPS data, demonstrating the applicability of our technique in the presence of multiple agents, as well as its superiority over existing methods.
    Comment: KDD '18 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1254-1262, 9 pages, 5 figures, conference submission, University of Oxford. arXiv admin note: text overlap with arXiv:1709.0235

    Approximate parameter inference in systems biology using gradient matching: a comparative evaluation

    Background: A challenging problem in current systems biology is that of parameter inference in biological pathways expressed as coupled ordinary differential equations (ODEs). Conventional methods that repeatedly numerically solve the ODEs have large associated computational costs. Aimed at reducing this cost, new concepts using gradient matching have been proposed, which bypass the need for numerical integration. This paper presents a recently established adaptive gradient matching approach, using Gaussian processes, combined with a parallel tempering scheme, and conducts a comparative evaluation with current state-of-the-art methods used for parameter inference in ODEs. Among these contemporary methods is a technique based on reproducing kernel Hilbert spaces (RKHS). This has previously shown promising results for parameter estimation, but under lax experimental settings. We look at a range of scenarios to test the robustness of this method. We also change the approach of inferring the penalty parameter from AIC to cross validation to improve the stability of the method. Methodology: Methodology for the recently proposed adaptive gradient matching method using Gaussian processes, upon which we build our new method, is provided. Details of a competing method using reproducing kernel Hilbert spaces are also described here. Results: We conduct a comparative analysis for the methods described in this paper, using two benchmark ODE systems. The analyses are repeated under different experimental settings, to observe the sensitivity of the techniques. Conclusions: Our study reveals that for known noise variance, our proposed method based on Gaussian processes and parallel tempering achieves overall the best performance. When the noise variance is unknown, the RKHS method proves to be more robust.

    Model selection via marginal likelihood estimation by combining thermodynamic integration and gradient matching

    Conducting statistical inference on systems described by ordinary differential equations (ODEs) is a challenging problem. Repeatedly numerically solving the system of equations incurs a high computational cost, making many methods based on explicitly solving the ODEs unsuitable in practice. Gradient matching methods were introduced in order to deal with the computational burden. These methods involve minimising the discrepancy between predicted gradients from the ODEs and those from a smooth interpolant. Work until now on gradient matching methods has focused on parameter inference. This paper considers the problem of model selection. We combine the method of thermodynamic integration to compute the log marginal likelihood with adaptive gradient matching using Gaussian processes, demonstrating that the method is robust and able to outperform BIC and WAIC.
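
The gradient matching idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps the adaptive Gaussian process interpolant for a plain polynomial fit, and the toy ODE dx/dt = -theta * x, the function name `gradient_matching_decay`, the noise level and the polynomial degree are all assumptions chosen for illustration. The point is only that theta is estimated without ever numerically solving the ODE.

```python
import numpy as np


def gradient_matching_decay(t, y, degree=5):
    """Estimate theta in dx/dt = -theta * x by gradient matching.

    Assumption: a least-squares polynomial stands in for the smooth
    interpolant (the abstract's method uses Gaussian processes instead).
    """
    # Step 1: fit a smooth interpolant to the noisy observations.
    interpolant = np.polynomial.Polynomial.fit(t, y, degree)
    x_hat = interpolant(t)             # smoothed states
    dx_hat = interpolant.deriv()(t)    # gradients of the interpolant

    # Step 2: minimise sum_i (dx_hat_i - (-theta * x_hat_i))^2 over theta.
    # The objective is quadratic in theta, so the minimiser is closed form.
    return -np.sum(dx_hat * x_hat) / np.sum(x_hat ** 2)


# Synthetic data from the toy model with a known (hypothetical) parameter.
rng = np.random.default_rng(0)
true_theta = 0.5
t = np.linspace(0.0, 4.0, 40)
y = np.exp(-true_theta * t) + rng.normal(0.0, 0.002, size=t.shape)

theta_hat = gradient_matching_decay(t, y)
print(f"true theta = {true_theta}, estimated theta = {theta_hat:.3f}")
```

For ODEs that are nonlinear in their parameters, the closed-form step would be replaced by a numerical optimiser or, as in the abstract, by sampling.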

    Relative sea-level change in Connecticut (USA) during the last 2200 yrs

    We produced a relative sea-level (RSL) reconstruction from Connecticut (USA) spanning the last ∼2200 yrs that is free from the influence of sediment compaction. The reconstruction used a suite of vertically- and laterally-ordered sediment samples ≤2 cm above bedrock that were collected by excavating a trench along an evenly-sloped bedrock surface. Paleomarsh elevation was reconstructed using a regional-scale transfer function trained on the modern distribution of foraminifera on Long Island Sound salt marshes and supported by bulk-sediment δ13C measurements. The history of sediment accumulation was estimated using an age-elevation model constrained by radiocarbon dates and recognition of pollution horizons of known age. The RSL reconstruction was combined with regional tide-gauge measurements spanning the last ∼150 yrs before being quantitatively analyzed using an error-in-variables integrated Gaussian process model to identify sea-level trends with formal and appropriate treatment of uncertainty and the temporal distribution of data. RSL rise was stable (∼1 mm/yr) from ∼200 BCE to ∼1000 CE, slowed to a minimum rate of rise (0.41 mm/yr) at ∼1400 CE, and then accelerated continuously to reach a current rate of 3.2 mm/yr, which is the fastest, century-scale rate of the last 2200 yrs. Change point analysis identified that modern rates of rise in Connecticut began at 1850–1886 CE. This timing is synchronous with changes recorded at other sites on the U.S. Atlantic coast and is likely the local expression of a global sea-level change. Earlier sea-level trends show coherence north of Cape Hatteras that are contrasted with southern sites. This pattern may represent centennial-scale variability in the position and/or strength of the Gulf Stream. 
    Comparison of the new record to three existing and reanalyzed RSL reconstructions from the same site developed using sediment cores indicates that compaction is unlikely to significantly distort RSL reconstructions produced from shallow (∼2–3 m thick) sequences of salt-marsh peat.

    Optimal plug-in Gaussian processes for modelling derivatives

    Derivatives are a key nonparametric functional in wide-ranging applications where the rate of change of an unknown function is of interest. In the Bayesian paradigm, Gaussian processes (GPs) are routinely used as a flexible prior for unknown functions, and are arguably one of the most popular tools in many areas. However, little is known about the optimal modelling strategy and theoretical properties when using GPs for derivatives. In this article, we study a plug-in strategy by differentiating the posterior distribution with GP priors for derivatives of any order. This practically appealing plug-in GP method has been previously perceived as suboptimal and degraded, but this is not necessarily the case. We provide posterior contraction rates for plug-in GPs and establish that they remarkably adapt to derivative orders. We show that the posterior measure of the regression function and its derivatives, with the same choice of hyperparameter that does not depend on the order of derivatives, converges at the minimax optimal rate up to a logarithmic factor for functions in certain classes. We analyze a data-driven hyperparameter tuning method based on empirical Bayes, and show that it satisfies the optimal rate condition while maintaining computational efficiency. This article to the best of our knowledge provides the first positive result for plug-in GPs in the context of inferring derivative functionals, and leads to a practically simple nonparametric Bayesian method with optimal and adaptive hyperparameter tuning for simultaneously estimating the regression function and its derivatives. Simulations show competitive finite sample performance of the plug-in GP method. A climate change application for analyzing the global sea-level rise is discussed.
    Comment: This paper supersedes the second part of the technical report available at arXiv:2011.13967v1. That technical report has been split: The first part on equivalence theory will be extended and become 2011.13967v2. The results on Bayesian inference for function derivatives have evolved into this paper. arXiv admin note: text overlap with arXiv:2011.1396
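
The plug-in strategy named in this abstract, differentiating the GP posterior to estimate a derivative, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's procedure: the squared-exponential kernel, the hand-fixed `length_scale` and `noise_var` (the paper tunes hyperparameters by empirical Bayes), the test function sin(x) and the helper names `rbf` and `plug_in_gp` are all hypothetical choices.

```python
import numpy as np


def rbf(a, b, length_scale):
    """Squared-exponential kernel matrix k(a_i, b_j)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def plug_in_gp(x_train, y_train, x_test, length_scale=1.0, noise_var=1e-4):
    """Posterior mean of a zero-mean GP and its plug-in first derivative.

    Assumption: hyperparameters are fixed by hand, not estimated.
    """
    K = rbf(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train)          # (K + noise_var * I)^{-1} y

    K_star = rbf(x_test, x_train, length_scale)
    mean = K_star @ alpha                        # posterior mean m(x*)

    # d/dx* k(x*, x) = -(x* - x) / length_scale^2 * k(x*, x), so the
    # derivative of the posterior mean reuses the same alpha.
    diff = x_test[:, None] - x_train[None, :]
    dmean = (-(diff / length_scale ** 2) * K_star) @ alpha
    return mean, dmean


# Noisy observations of sin(x); the true derivative is cos(x).
rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 2.0 * np.pi, 30)
y_train = np.sin(x_train) + rng.normal(0.0, 0.01, size=x_train.shape)
x_test = np.linspace(1.0, 5.0, 50)               # interior points only

mean, dmean = plug_in_gp(x_train, y_train, x_test)
max_err = np.max(np.abs(dmean - np.cos(x_test)))
print(f"max abs error of plug-in derivative on [1, 5]: {max_err:.4f}")
```

The same weight vector `alpha` serves both the function estimate and its derivative, which mirrors the abstract's point that one hyperparameter choice can be used regardless of derivative order.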

    Bayesian Optimal Design for Ordinary Differential Equation Models With Application in Biological Science

    © 2019 The Author(s). Bayesian optimal design is considered for experiments where the response distribution depends on the solution to a system of nonlinear ordinary differential equations. The motivation is an experiment to estimate parameters in the equations governing the transport of amino acids through cell membranes in human placentas. Decision-theoretic Bayesian design of experiments for such nonlinear models is conceptually very attractive, allowing the formal incorporation of prior knowledge to overcome the parameter dependence of frequentist design and being less reliant on asymptotic approximations. However, the necessary approximation and maximization of the, typically analytically intractable, expected utility results in a computationally challenging problem. These issues are further exacerbated if the solution to the differential equations is not available in closed form. This article proposes a new combination of a probabilistic solution to the equations embedded within a Monte Carlo approximation to the expected utility with cyclic descent of a smooth approximation to find the optimal design. A novel precomputation algorithm reduces the computational burden, making the search for an optimal design feasible for bigger problems. The methods are demonstrated by finding new designs for a number of common models derived from differential equations, and by providing optimal designs for the placenta experiment. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement. The second author was supported by Fellowship EP/J018317/1 from the United Kingdom Engineering and Physical Sciences Research Council.