
    Optimizing differential equations to fit data and predict outcomes

    Many scientific problems focus on observed patterns of change or on how to design a system to achieve particular dynamics. Those problems often require fitting differential equation models to target trajectories. Fitting such models can be difficult because each evaluation of the fit must calculate the distance between the model and target patterns at numerous points along a trajectory. The gradient of the fit with respect to the model parameters can also be challenging to compute. Recent technical advances in automatic differentiation through numerical differential equation solvers potentially turn the fitting process into a relatively easy problem, opening up new possibilities to study dynamics. However, applying the new tools to real data may fail to achieve a good fit. This article illustrates how to overcome a variety of common challenges, using the classic ecological data for oscillations in hare and lynx populations. Models include simple ordinary differential equations (ODEs) and neural ordinary differential equations (NODEs), which use artificial neural networks to estimate the derivatives of differential equation systems. Comparing the fits obtained with ODEs versus NODEs, which represent small and large parameter spaces, and changing the number of variable dimensions provide insight into the geometry of the observed and model trajectories. To analyze how well the models predict future observations, a Bayesian-inspired preconditioned stochastic gradient Langevin dynamics (pSGLD) calculation of the posterior distribution of predicted model trajectories clarifies the tendency of various models to underfit or overfit the data. Coupling fitted differential equation systems with pSGLD sampling provides a powerful way to study the properties of optimization surfaces, raising an analogy with mutation-selection dynamics on fitness landscapes.
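    As a minimal illustration of differentiating through a numerical solver to fit a differential equation, the sketch below fits the rates of a Lotka-Volterra (hare-lynx style) system to a synthetic target trajectory by gradient descent. The library choice (PyTorch with torchdiffeq), the parameter values, and the synthetic data are illustrative assumptions, not the article's own code.

```python
# Sketch: fit ODE parameters by backpropagating through the solver (torchdiffeq).
# True parameters and synthetic "observations" are placeholders for real data.
import torch
from torchdiffeq import odeint

def true_rhs(t, y):
    a, b, c, d = 1.1, 0.4, 0.1, 0.4        # assumed "true" rates for the synthetic target
    hare, lynx = y[..., 0], y[..., 1]
    return torch.stack([a * hare - b * hare * lynx,
                        c * hare * lynx - d * lynx], dim=-1)

class LotkaVolterra(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.log_rates = torch.nn.Parameter(torch.zeros(4))  # log keeps rates positive

    def forward(self, t, y):
        a, b, c, d = self.log_rates.exp()
        hare, lynx = y[..., 0], y[..., 1]
        return torch.stack([a * hare - b * hare * lynx,
                            c * hare * lynx - d * lynx], dim=-1)

ts = torch.linspace(0.0, 20.0, 91)
y0 = torch.tensor([10.0, 5.0])
target = odeint(true_rhs, y0, ts)            # stand-in for the observed trajectory

model = LotkaVolterra()
opt = torch.optim.Adam(model.parameters(), lr=0.05)
for step in range(300):
    opt.zero_grad()
    pred = odeint(model, y0, ts)             # gradients flow through the solver
    loss = ((pred - target) ** 2).mean()     # distance at many points along the trajectory
    loss.backward()
    opt.step()
```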

    Hybrid Process Models in Electrochemical Syntheses under Deep Uncertainty

    Chemical process engineering and machine learning are merging rapidly, and hybrid process models have shown promising results in process analysis and process design. However, uncertainties in first-principles process models have an adverse effect on extrapolations and inferences based on hybrid process models. Parameter sensitivities are an essential tool for better understanding the underlying uncertainty propagation and hybrid system identification challenges. Still, standard parameter sensitivity concepts may fail to address comprehensive parameter uncertainty problems, i.e., deep uncertainty with aleatoric and epistemic contributions. This work presents a highly effective and reproducible sampling strategy to calculate simulation uncertainties and global parameter sensitivities for hybrid process models under deep uncertainty. We demonstrate the workflow with two electrochemical synthesis simulation studies, including the synthesis of furfuryl alcohol and 4-aminophenol. Compared with Monte Carlo reference simulations, the sampling strategy significantly reduces CPU time. The general findings of the hybrid model sensitivity studies under deep uncertainty are twofold. First, epistemic uncertainty has a significant effect on uncertainty analysis. Second, the predicted parameter sensitivities of the hybrid process models add value to the interpretation and analysis of the hybrid models themselves but are not suitable for predicting the sensitivities of the real process or of the full first-principles process model.
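    The sketch below illustrates the general idea of two-level uncertainty propagation (epistemic parameters sampled in an outer quasi-Monte-Carlo loop, aleatoric noise in an inner loop) for a toy hybrid model. The model, parameter bounds, and sample sizes are placeholders; this does not reproduce the paper's workflow or its electrochemical case studies.

```python
# Sketch: deep-uncertainty propagation for a toy "hybrid" model with QMC sampling.
import numpy as np
from scipy.stats import qmc

def hybrid_model(theta, x):
    """Placeholder hybrid model: a first-principles term plus a data-driven correction."""
    k, E = theta                                        # epistemically uncertain parameters
    return k * np.exp(-E * x) + 0.1 * np.sin(3.0 * x)   # stand-in for the ML correction

# Outer loop: epistemic parameters drawn from a scrambled Sobol sequence
sampler = qmc.Sobol(d=2, scramble=True, seed=0)
epistemic = qmc.scale(sampler.random(64), l_bounds=[0.5, 0.1], u_bounds=[1.5, 0.5])

rng = np.random.default_rng(1)
means = []
for theta in epistemic:
    # Inner loop: aleatoric measurement noise around the model prediction
    noise = rng.normal(0.0, 0.02, size=32)
    samples = hybrid_model(theta, x=1.0) + noise
    means.append(samples.mean())

print("spread of the mean prediction due to epistemic uncertainty:", np.std(means))
```

    Quasi-random (Sobol) sampling in the outer loop is one common way to make such studies reproducible and cheaper than plain Monte Carlo; the paper's specific strategy is not detailed in the abstract.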

    Locally Regularized Neural Differential Equations: Some Black Boxes Were Meant to Remain Closed!

    Implicit-layer deep learning techniques, like neural differential equations, have become an important modeling framework due to their ability to adapt to new problems automatically. Training a neural differential equation is effectively a search over a space of plausible dynamical systems. However, controlling the computational cost of these models is difficult because it depends on the number of steps the adaptive solver takes. Most prior works either use higher-order methods, which reduce prediction time while greatly increasing training time, or reduce both training and prediction time by relying on specific training algorithms that are harder to use as a drop-in replacement because of strict requirements on automatic differentiation. In this manuscript, we use internal cost heuristics of adaptive differential equation solvers at stochastic time points to guide the training toward learning a dynamical system that is easier to integrate. We "close the black box" and allow the use of our method with any adjoint technique for gradient calculations of the differential equation solution. Experimental studies comparing our method to global regularization show that we attain similar performance without compromising flexibility of implementation on ordinary differential equations (ODEs) and stochastic differential equations (SDEs). We develop two sampling strategies to trade off between performance and training time. Our method reduces the number of function evaluations to 0.556-0.733x of the baseline and accelerates predictions by 1.3-2x.
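    A conceptual sketch of the local-regularization idea follows: penalize an embedded local-error estimate, used here as a rough stand-in for the solver's internal cost heuristic, at a randomly chosen time point so that training favors dynamics that are cheap to integrate. The explicit Euler/Heun step pair and the penalty weight are illustrative assumptions; the paper itself taps the adaptive solver's own internals rather than recomputing an error estimate by hand.

```python
# Sketch: a local "integration cost" penalty evaluated at a stochastic time point.
import torch

def local_cost_penalty(f, t, y, dt=0.05):
    """Embedded-error estimate from an Euler/Heun step pair: a rough proxy for how
    hard the learned dynamics are for an adaptive solver to integrate near (t, y)."""
    k1 = f(t, y)
    euler = y + dt * k1                    # first-order prediction
    k2 = f(t + dt, euler)
    heun = y + 0.5 * dt * (k1 + k2)        # second-order prediction
    return (heun - euler).norm()           # large estimate => smaller solver steps => higher cost

# Inside a training loop, with `model` the learned right-hand side, `ts` the save
# times, and `trajectory` the solver output at those times, one could write:
#   idx = torch.randint(len(ts), (1,)).item()                        # stochastic time point
#   loss = data_loss + 0.1 * local_cost_penalty(model, ts[idx], trajectory[idx])
```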

    ChronoMID: Cross-modal neural networks for 3-D temporal medical imaging data.

    ChronoMID (neural networks for temporally varying, hence Chrono, Medical Imaging Data) presents a novel application of cross-modal convolutional neural networks (X-CNNs) to the medical domain. In this paper, we present multiple approaches for incorporating temporal information into X-CNNs and compare their performance in a case study on the classification of abnormal bone remodelling in mice. Previous work developing medical models has predominantly focused on either spatial or temporal aspects, but rarely both. Our models seek to unify these complementary sources of information and derive insights in a bottom-up, data-driven approach. As with many medical datasets, the case study herein exhibits deep rather than wide data; we apply various techniques, including extensive regularisation, to account for this. After training on a balanced set of approximately 70,000 images, two of the models (those using difference maps from known reference points) outperformed a state-of-the-art convolutional neural network baseline by over 30 percentage points (>99% vs. 68.26%) on an unseen, balanced validation set comprising around 20,000 images. These models are expected to perform well with sparse datasets, based on both previous findings with X-CNNs and the representations of time used, which permit arbitrarily large and irregular gaps between data points. Our results highlight the importance of identifying a suitable description of time for a problem domain, as unsuitable descriptors may not only fail to improve a model but may in fact confound it.
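    To make the difference-map idea concrete, the sketch below builds a pixel-wise difference against a reference time point and feeds it, together with the raw image, into a toy two-stream CNN. The tensor shapes and the network are illustrative assumptions; the actual ChronoMID models use cross-modal connections (X-CNNs) and are not reproduced here.

```python
# Sketch: difference maps from a reference time point as a second input stream.
import torch
import torch.nn as nn

def difference_map(scan, reference):
    """Pixel-wise change of a scan relative to a known reference time point."""
    return scan - reference

class TwoStreamCNN(nn.Module):
    """Toy two-stream network: one stream for the raw image, one for the difference map."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.image_stream = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
        self.diff_stream = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
        self.classifier = nn.Linear(2 * 8 * 4 * 4, n_classes)

    def forward(self, image, diff):
        a = self.image_stream(image).flatten(1)
        b = self.diff_stream(diff).flatten(1)
        return self.classifier(torch.cat([a, b], dim=1))

scan = torch.rand(1, 1, 64, 64)        # current image slice (dummy data)
reference = torch.rand(1, 1, 64, 64)   # image at the reference time point (dummy data)
logits = TwoStreamCNN()(scan, difference_map(scan, reference))
```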