Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model
A novel method based on the combination of data assimilation and machine learning is introduced. The new hybrid approach has a two-fold aim: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting their future states. The method consists of iteratively applying a data assimilation step, here an ensemble Kalman filter, and a neural network. Data assimilation is used to optimally combine a surrogate model with sparse, noisy data. The resulting analysis is spatially complete and is used as a training set by the neural network to update the surrogate model. The two steps are then repeated. Numerical experiments have been carried out using the chaotic 40-variable Lorenz 96 model, demonstrating both convergence and statistical skill of the proposed hybrid approach. The surrogate model shows short-term forecast skill up to two Lyapunov times, and it retrieves the positive Lyapunov exponents as well as the more energetic frequencies of the power density spectrum. The sensitivity of the method to critical setup parameters is also presented: the forecast skill decreases smoothly with increased observational noise but drops abruptly if less than half of the model domain is observed. The successful synergy between data assimilation and machine learning, demonstrated here with a low-dimensional system, encourages further investigation of such hybrids with more sophisticated dynamics.
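For readers unfamiliar with the test bed named in this abstract, the following is a minimal sketch of the 40-variable Lorenz 96 model integrated with a fourth-order Runge-Kutta scheme. The forcing value `FORCING = 8.0`, the time step `DT = 0.05`, and all function names are assumptions chosen for illustration; the abstract states only the number of variables. This is the reference dynamics only, not the hybrid data assimilation and neural network method itself.

```python
import numpy as np

N_VARS = 40     # state dimension stated in the abstract
FORCING = 8.0   # assumed: a commonly used chaotic setting, not given in the abstract
DT = 0.05       # assumed integration time step


def lorenz96_tendency(x, forcing=FORCING):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, dt=DT, forcing=FORCING):
    """Advance the state by one time step with classical fourth-order Runge-Kutta."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(x0, n_steps, dt=DT, forcing=FORCING):
    """Return a trajectory of shape (n_steps + 1, len(x0)), including x0."""
    traj = np.empty((n_steps + 1, x0.size))
    traj[0] = x0
    for k in range(n_steps):
        traj[k + 1] = rk4_step(traj[k], dt, forcing)
    return traj


# Start near the unstable equilibrium x_i = F and perturb one variable
# so that the trajectory leaves the fixed point.
x0 = np.full(N_VARS, FORCING)
x0[0] += 0.01
trajectory = integrate(x0, n_steps=100)
```

A trajectory like this could serve as the hidden "truth" from which sparse, noisy observations are drawn; the observation operator and noise level would be further assumptions.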
The Parameter Houlihan: a solution to high-throughput identifiability indeterminacy for brutally ill-posed problems
One way to interject knowledge into clinically impactful forecasting is to
use data assimilation, a nonlinear regression that projects data onto a
mechanistic physiologic model, instead of a set of functions, such as neural
networks. Such regressions have an advantage of being useful with particularly
sparse, non-stationary clinical data. However, physiological models are often
nonlinear and can have many parameters, leading to potential problems with
parameter identifiability, or the ability to find a unique set of parameters
that minimize forecasting error. The identifiability problems can be minimized
or eliminated by reducing the number of parameters estimated, but reducing the
number of estimated parameters also reduces the flexibility of the model and
hence increases forecasting error. We propose a method, the parameter Houlihan,
that combines traditional machine learning techniques with data assimilation,
to select the right set of model parameters to minimize forecasting error while
reducing identifiability problems. The method worked well: for our cohort, the
data assimilation-based glucose forecasts and estimates using the
Houlihan-selected parameter sets generally had lower forecasting errors than
those from other parameter selection methods, such as by-hand parameter
selection. However, the forecast with the lowest forecast error does not
always accurately represent physiology; further advancements of the
algorithm provide a path for improving physiologic fidelity as well. Our hope
is that this methodology represents a first step toward combining machine
learning with data assimilation and provides a lower-threshold entry point for
using data assimilation with clinical data by helping select the right
parameters to estimate.