Lagrangian data assimilation exploits the trajectories of moving tracers as
observations to recover the underlying flow field. One major challenge in
Lagrangian data assimilation is the intrinsic nonlinearity that impedes using
exact Bayesian formulae for the state estimation of high-dimensional systems.
In this paper, an analytically tractable mathematical framework for
continuous-in-time Lagrangian data assimilation is developed. It preserves the
nonlinearity in the observational processes while approximating the forecast
model of the underlying flow field using linear stochastic models (LSMs). A
critical feature of the framework is that closed analytic formulae are
available for the posterior distribution, which facilitates
mathematical analysis and numerical simulations. First, an efficient iterative
algorithm is developed based on the analytically tractable statistics. It
accurately estimates the parameters of the LSMs using only a small number of
observed tracer trajectories. Next, the framework facilitates the
development of several computationally efficient approximate filters and the
quantification of the associated uncertainties. A cheap approximate filter with
a diagonal posterior covariance, derived from the asymptotic analysis of the
posterior estimate, is shown to be skillful in recovering incompressible flows.
It is also demonstrated that randomly selecting a small number of tracers at
each time step as observations can reduce the computational cost while
retaining the data assimilation accuracy. Finally, based on a prototype model
in geophysics, the framework with LSMs is shown to be skillful in filtering
nonlinear turbulent flow fields with strong non-Gaussian features.
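
The random tracer selection mentioned above can be illustrated with a minimal sketch. It only shows the subsampling step, not the filter itself, and it assumes uniform sampling without replacement, which the abstract does not specify. The function name `subsample_tracers`, the seed, and the toy positions are hypothetical and chosen purely for illustration.

```python
import random


def subsample_tracers(positions, num_selected, seed=0):
    """Pick a fresh random subset of tracers at every time step.

    positions[t][i] is the (x, y) position of tracer i at step t.
    Returns a list with one (indices, observations) pair per step.
    Assumption: uniform sampling without replacement.
    """
    rng = random.Random(seed)
    selected = []
    for step_positions in positions:
        # Sample without replacement so no tracer is used twice in one step.
        indices = sorted(rng.sample(range(len(step_positions)), num_selected))
        observations = [step_positions[i] for i in indices]
        selected.append((indices, observations))
    return selected


# Toy data (made up): 3 time steps, 10 tracers.
toy_positions = [[(float(i), float(t)) for i in range(10)] for t in range(3)]
subsets = subsample_tracers(toy_positions, num_selected=4)
```

Each entry of `subsets` holds the tracer indices used at that step together with their positions, so a filter update at that step would only process 4 of the 10 trajectories.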