Data-driven modelling of biological multi-scale processes
Biological processes involve a variety of spatial and temporal scales. A
holistic understanding of many biological processes therefore requires
multi-scale models which capture the relevant properties on all these scales.
In this manuscript we review mathematical modelling approaches used to describe
the individual spatial scales and how they are integrated into holistic models.
We discuss the relation between spatial and temporal scales and its implications
for multi-scale modelling. Based on this overview of
state-of-the-art modelling approaches, we formulate key challenges in
mathematical and computational modelling of biological multi-scale and
multi-physics processes. In particular, we consider the availability of
analysis tools for multi-scale models and model-based multi-scale data
integration. We provide a compact review of methods for model-based data
integration and model-based hypothesis testing. Furthermore, novel approaches
and recent trends are discussed, including computation time reduction using
reduced order and surrogate models, which contribute to the solution of
inference problems. We conclude the manuscript by providing a few ideas for the
development of tailored multi-scale inference methods.
Comment: This manuscript will appear in the Journal of Coupled Systems and
Multiscale Dynamics (American Scientific Publishers).
Input estimation for extended-release formulations exemplified with exenatide
Estimating the in vivo absorption profile of a drug is essential when developing extended-release medications. Such estimates can be obtained by measuring plasma concentrations over time and inferring the absorption from a model of the drug’s pharmacokinetics. Of particular interest is predicting the bioavailability, the fraction of the drug that is absorbed and enters the systemic circulation. This paper presents a framework for addressing this class of estimation problems and gives advice on the choice of method. In parametric methods, a model is constructed for the absorption process, which can be difficult when the absorption has a complicated profile. Here, we place emphasis on non-parametric methods that avoid making strong assumptions about the absorption. A modern estimation method that can address very general input-estimation problems has previously been presented. In this method, the absorption profile is modeled as a stochastic process, which is estimated using Markov chain Monte Carlo techniques. The applicability of this method for extended-release formulation development is evaluated by analyzing a dataset of Bydureon, an injectable extended-release suspension formulation of exenatide, a GLP-1 receptor agonist for treating diabetes. This drug is known to have non-linear pharmacokinetics. Its plasma concentration profile exhibits multiple peaks, something that can make parametric modeling challenging but poses no major difficulties for non-parametric methods. The method is also validated on synthetic data, exploring the effects of sampling and noise on the accuracy of the estimates.
Bayesian inference for stochastic differential mixed-effects models
PhD Thesis
Stochastic differential equations (SDEs) provide a natural framework for modelling the intrinsic
stochasticity of many continuous-time physical processes. When such
processes are observed in multiple individuals or experimental units, SDE-driven mixed-effects models allow the quantification of both between- and within-individual variation.
Performing Bayesian inference for such models, using discrete-time data that may be incomplete
and subject to measurement error, is a challenging problem and is the focus of
this thesis.
Since, in general, no closed form expression exists for the transition densities of the SDE
of interest, a widely adopted solution works with the Euler-Maruyama approximation,
by replacing the intractable transition densities with Gaussian approximations. These
approximations can be made arbitrarily accurate by introducing intermediate time-points
between observations. Integrating over the uncertainty associated with the process at these
time-points necessitates the use of computationally intensive algorithms such as Markov
chain Monte Carlo (MCMC).
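The Euler-Maruyama idea described above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the drift and diffusion functions, and all numerical values are assumptions chosen for illustration. The interval between two observation times is split into `m` sub-steps, each sub-step uses a Gaussian transition, and the approximate log-density of a latent path is the sum of the Gaussian log-densities.

```python
import numpy as np

def euler_maruyama_path(x0, t0, t1, m, drift, diffusion, rng):
    """Simulate one path on [t0, t1] using m Euler-Maruyama sub-steps."""
    dt = (t1 - t0) / m
    path = np.empty(m + 1)
    path[0] = x0
    for k in range(m):
        x = path[k]
        # Gaussian step: X_{k+1} | X_k = x ~ N(x + drift(x) dt, diffusion(x)^2 dt)
        path[k + 1] = x + drift(x) * dt + diffusion(x) * np.sqrt(dt) * rng.standard_normal()
    return path

def em_log_density(path, dt, drift, diffusion):
    """Sum of Gaussian Euler-Maruyama log transition densities along a path."""
    x, x_next = path[:-1], path[1:]
    mean = x + drift(x) * dt
    var = diffusion(x) ** 2 * dt
    return np.sum(-0.5 * np.log(2.0 * np.pi * var) - 0.5 * (x_next - mean) ** 2 / var)

# Hypothetical mean-reverting SDE, used only to exercise the functions.
drift = lambda x: 0.5 * (2.0 - x)
diffusion = lambda x: 0.3 + 0.0 * x  # constant diffusion, works for scalars and arrays

rng = np.random.default_rng(0)
m = 10  # number of sub-intervals between two observation times
path = euler_maruyama_path(1.0, 0.0, 1.0, m, drift, diffusion, rng)
logp = em_log_density(path, 1.0 / m, drift, diffusion)
```

Increasing `m` refines the approximation at the cost of more latent points to integrate over, which is the trade-off that motivates MCMC in this setting.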
We extend a recently proposed MCMC scheme to include the SDE-driven mixed-effects
framework. Key to the development of an efficient inference scheme is the ability to
generate discrete-time realisations of the latent process between observation times. Such
realisations are typically termed diffusion bridges. By partitioning the SDE into two parts,
one that accounts for nonlinear dynamics in a deterministic way, and another as a residual
stochastic process, we develop a class of novel constructs that bridge the residual process
via a linear approximation. In addition, we adapt a recently proposed construct to a partial
and noisy observation regime. We compare the performance of each new construct with a
number of existing approaches, using three applications: a simple birth-death process, a
Lotka-Volterra model and a model for aphid growth.
We incorporate the best performing bridge construct within an MCMC scheme to determine
the posterior distribution of the model parameters. This methodology is then
applied to synthetic data generated from a simple SDE model of orange tree growth, and
real data consisting of observations on aphid numbers recorded under a variety of different
treatment regimes. Finally, we provide a systematic comparison of our approach with an
inference scheme based on a tractable approximation of the SDE, that is, the linear noise
approximation.
STUDY DESIGN AND METHODS FOR EVALUATING SUSTAINED UNRESPONSIVENESS TO PEANUT SUBLINGUAL IMMUNOTHERAPY
The length of time off-therapy that would represent clinically meaningful sustained unresponsiveness (SU) to peanut allergen remains undefined. Our work has three objectives: first, to delineate aspects of the altered clinical trial design that would allow us to assess the effectiveness of sublingual immunotherapy (SLIT) in achieving SU; second, to discuss methodology for evaluating the time to loss of SU and associated risk factors in the context of the proposed study design; finally, to develop a flexible methodology for assessing the mean-reverting threshold and prognosis of SU failure in the presence of study risk factors. Failure refers to the loss of SU upon therapy cessation in peanut-allergic children who are administered SLIT.
The salient feature of the new design is the allocation scheme of study subjects to staggered sampling timepoints following therapy suspension, at which a subsequent food challenge is administered. Because a fixed sequence of increasing allergen doses is administered in a challenge test, the subject’s true threshold on either occasion is interval-censored. Additionally, due to the timing of the subsequent double-blind, placebo-controlled food challenge (DBPCFC), the time to loss of SU for subjects who pass the DBPCFC at study entry is either left- or right-censored. In this thesis, we elaborate on the features of the study design, develop and extensively validate methods to evaluate study end points, and discuss their potential to inform individualized treatments.
The thesis is organized as follows: (i) an innovative clinical trial design that aims at studying SU to SLIT; (ii) a newly developed mixture proportional hazards model for evaluating the time to loss of SU in the context of the study-generated interval-censored data subject to instantaneous failures; (iii) a time-dependent Ornstein-Uhlenbeck (OU) diffusion process for modeling immunologic SU degradation trajectories using the stochastic differential mixed-effects model (SDMEM) framework; (iv) the estimation of the mean-reverting threshold and prognosis of the loss of SU; (v) lastly, the clinical implementation and future scope of work. Through this work, we are presented with an opportunity to dedicate these inter-connected parts to three core issues of failure: model description, prediction and prevention.
Doctor of Public Health
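As a rough illustration of mean reversion in an OU process, the sketch below uses the standard constant-coefficient OU model, dX = theta (mu - X) dt + sigma dW, whose transition density is Gaussian in closed form. This is a simplification: the thesis uses a time-dependent OU process within a mixed-effects framework, and the parameter names and values here are assumptions made for illustration only.

```python
import numpy as np

def ou_transition(x, dt, theta, mu, sigma):
    """Exact mean and variance of X(t + dt) given X(t) = x for a constant-coefficient OU process."""
    decay = np.exp(-theta * dt)
    mean = mu + (x - mu) * decay                        # pulled toward the long-run level mu
    var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * theta)
    return mean, var

def simulate_ou(x0, times, theta, mu, sigma, rng):
    """Simulate an OU path at arbitrary increasing times using the exact transition."""
    path = np.empty(len(times))
    path[0] = x0
    for k in range(1, len(times)):
        mean, var = ou_transition(path[k - 1], times[k] - times[k - 1], theta, mu, sigma)
        path[k] = mean + np.sqrt(var) * rng.standard_normal()
    return path

# Hypothetical values: start well above the mean-reverting level mu.
rng = np.random.default_rng(0)
times = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])  # irregular sampling times
path = simulate_ou(5.0, times, theta=0.7, mu=2.0, sigma=0.4, rng=rng)

# Over a long horizon the transition forgets x and approaches N(mu, sigma^2 / (2 theta)).
long_mean, long_var = ou_transition(5.0, 1e3, theta=0.7, mu=2.0, sigma=0.4)
```

Because the transition is exact, irregular or staggered sampling times need no time-discretization error, which is convenient for designs with unequally spaced follow-up visits.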
Modernizing Markov Chains Monte Carlo for Scientific and Bayesian Modeling
The advent of probabilistic programming languages has galvanized scientists to write increasingly diverse models to analyze data. Probabilistic models use a joint distribution over observed and latent variables to describe at once elaborate scientific theories, non-trivial measurement procedures, information from previous studies, and more. To effectively deploy these models in a data analysis, we need inference procedures which are reliable, flexible, and fast. In a Bayesian analysis, inference boils down to estimating the expectation values and quantiles of the unnormalized posterior distribution. This estimation problem also arises in the study of non-Bayesian probabilistic models, a prominent example being the Ising model of Statistical Physics.
Markov chains Monte Carlo (MCMC) algorithms provide a general-purpose sampling method which can be used to construct sample estimators of moments and quantiles. Despite MCMC’s compelling theory and empirical success, many models continue to frustrate MCMC, as well as other inference strategies, effectively limiting our ability to use these models in a data analysis. These challenges motivate new developments in MCMC. The term “modernize” in the title refers to the deployment of methods which have revolutionized Computational Statistics and Machine Learning in the past decade, including: (i) hardware accelerators to support massive parallelization, (ii) approximate inference based on tractable densities, (iii) high-performance automatic differentiation and (iv) continuous relaxations of discrete systems.
The growing availability of hardware accelerators such as GPUs has in the past years motivated a general MCMC strategy, whereby we run many chains in parallel with a short sampling phase, rather than a few chains with a long sampling phase. Unfortunately, existing convergence diagnostics are not designed for the “many short chains” regime. This is notably the case for the popular R̂ statistic, which claims convergence only if the effective sample size per chain is large. We present the nested R̂, denoted nR̂, a generalization of R̂ which does not conflate short chains and poor mixing, and offers a useful diagnostic provided we run enough chains and meet certain initialization conditions. Combined with nR̂, the short chain regime presents us with the opportunity to identify optimal lengths for the warmup and sampling phases, as well as the optimal number of chains; tuning parameters of MCMC which are otherwise chosen using heuristics or trial-and-error.
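For orientation, the following sketch computes the classic (non-nested) Gelman-Rubin R̂ for a scalar quantity, comparing between-chain and within-chain variance. It is not the nested R̂ proposed in the dissertation, and the function name and synthetic draws are assumptions for illustration.

```python
import numpy as np

def rhat(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()        # W: average within-chain variance
    between_over_n = chains.mean(axis=1).var(ddof=1)  # B / n: variance of the chain means
    var_plus = (n - 1) / n * within + between_over_n  # pooled estimate of the posterior variance
    return np.sqrt(var_plus / within)

# Synthetic draws standing in for MCMC output (independent normals, for illustration only).
rng = np.random.default_rng(1)
mixed = rng.standard_normal((4, 1000))                         # all chains target the same distribution
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])         # two chains sit around a different mean

rhat_mixed = rhat(mixed)  # close to 1
rhat_stuck = rhat(stuck)  # clearly above 1
```

With very short chains the variance of the chain means stays large even when every chain is sampling correctly, which is the behaviour that the passage above identifies as a limitation in the many-short-chains regime.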
We next focus on semi-specialized algorithms for latent Gaussian models, arguably the most widely used class of hierarchical models. It is well understood that MCMC often struggles with the geometry of the posterior distribution generated by these models. Using a Laplace approximation, we marginalize out the latent Gaussian variables and then integrate the remaining parameters with Hamiltonian Monte Carlo (HMC), a gradient-based MCMC. This approach combines MCMC and a distributional approximation, and offers a useful alternative to pure MCMC or pure approximation methods such as Variational Inference. We compare the three paradigms across a range of general linear models, which admit a sophisticated prior, i.e. a Gaussian process and a Horseshoe prior. To implement our scheme efficiently, we derive a novel automatic differentiation method called the adjoint-differentiated Laplace approximation. This differentiation algorithm propagates the minimal information needed to construct the gradient of the approximate marginal likelihood, and yields a scalable differentiation method that is orders of magnitude faster than state-of-the-art differentiation for high-dimensional hyperparameters. We next discuss the application of our algorithm to models with an unconventional likelihood, going beyond the classical setting of general linear models. This necessitates a non-trivial generalization of the adjoint-differentiated Laplace approximation, which we implement using higher-order adjoint methods. The generalization works out to be both more general and more efficient. We apply the resulting method to an unconventional latent Gaussian model, identifying promising features and highlighting persistent challenges.
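The marginalization step can be sketched for a small latent Gaussian model. The example below assumes a Poisson likelihood with log link and a squared-exponential Gaussian process prior; these choices, the function name, and the data are illustrative assumptions. It uses plain Newton iterations and does not implement the adjoint-differentiated gradient described in the dissertation.

```python
import math
import numpy as np

def laplace_poisson(y, K, max_iter=100, tol=1e-10):
    """Laplace approximation for theta ~ N(0, K), y_i ~ Poisson(exp(theta_i)).

    Returns the posterior mode of theta and the approximate log marginal likelihood.
    """
    n = len(y)
    theta = np.zeros(n)
    for _ in range(max_iter):
        w = np.exp(theta)       # diagonal of the negative Hessian of the log likelihood
        grad = y - w            # gradient of the log likelihood
        b = w * theta + grad
        # Newton update written without inverting K: theta = K (I + W K)^{-1} b
        theta_new = K @ np.linalg.solve(np.eye(n) + w[:, None] * K, b)
        converged = np.max(np.abs(theta_new - theta)) < tol
        theta = theta_new
        if converged:
            break
    w = np.exp(theta)
    log_lik = np.sum(y * theta - w) - sum(math.lgamma(v + 1.0) for v in y)
    quad = theta @ np.linalg.solve(K, theta)
    _, logdet = np.linalg.slogdet(np.eye(n) + w[:, None] * K)
    return theta, log_lik - 0.5 * quad - 0.5 * logdet

# Hypothetical data and kernel hyperparameters (unit variance, unit length scale).
x = np.arange(5.0)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2) + 1e-6 * np.eye(5)
y = np.array([1.0, 3.0, 2.0, 0.0, 4.0])
mode, log_marginal = laplace_poisson(y, K)
```

In the scheme described above, a sampler such as HMC would treat the kernel hyperparameters as the remaining parameters and evaluate `log_marginal` and its gradient at each step.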
The final chapter of this dissertation focuses on a specific but rich problem: the Ising model of Statistical Physics, and its generalization as the Potts and Spin Glass models. These models are challenging because they are discrete, precluding the immediate use of gradient-based algorithms, and exhibit multiple modes, notably at cold temperatures. We propose a new class of MCMC algorithms to draw samples from Potts models by augmenting the target space with a carefully constructed auxiliary Gaussian variable. In contrast to existing methods of a similar flavor, our algorithm can take advantage of the low-rank structure of the coupling matrix and scales linearly with the number of states in a Potts model. The method is applied to a broad range of coupling and temperature regimes and compared to several sampling methods, allowing us to paint a nuanced algorithmic landscape.
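To give a flavour of Gaussian augmentation, the sketch below applies the classical Gaussian-integral trick to a two-state Ising model with target p(s) proportional to exp(0.5 * beta * s' J s). It is not the dissertation's Potts algorithm and does not exploit low-rank structure; the function name, coupling matrix, and temperature are assumptions for illustration. Given the auxiliary variable x, the spins are conditionally independent, so the whole configuration is updated in one block.

```python
import numpy as np

def ising_aux_gibbs(J, beta, n_samples, rng):
    """Blocked Gibbs sampler for an Ising model using an auxiliary Gaussian variable."""
    n = J.shape[0]
    A = beta * J
    # Shift the diagonal so A is positive definite; since s_i^2 = 1 this only
    # changes the target by a constant factor.
    shift = max(0.0, -np.linalg.eigvalsh(A).min()) + 0.1
    A = A + shift * np.eye(n)
    L = np.linalg.cholesky(A)
    s = rng.choice([-1.0, 1.0], size=n)
    samples = np.empty((n_samples, n))
    for t in range(n_samples):
        x = A @ s + L @ rng.standard_normal(n)      # x | s ~ N(A s, A)
        p_up = 1.0 / (1.0 + np.exp(-2.0 * x))       # P(s_i = +1 | x), independent across i
        s = np.where(rng.random(n) < p_up, 1.0, -1.0)
        samples[t] = s
    return samples

# Hypothetical two-spin ferromagnet, for which E[s_1 s_2] = tanh(beta) exactly.
J = np.array([[0.0, 1.0], [1.0, 0.0]])
rng = np.random.default_rng(0)
samples = ising_aux_gibbs(J, beta=0.5, n_samples=20000, rng=rng)
corr = np.mean(samples[:, 0] * samples[:, 1])
```

Comparing `corr` with `np.tanh(0.5)` gives a quick sanity check of the sampler on a case with a known answer.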