40 research outputs found
Efficient Localization of Discontinuities in Complex Computational Simulations
Surrogate models for computational simulations are input-output
approximations that allow computationally intensive analyses, such as
uncertainty propagation and inference, to be performed efficiently. When a
simulation output does not depend smoothly on its inputs, the error and
convergence rate of many approximation methods deteriorate substantially. This
paper details a method for efficiently localizing discontinuities in the input
parameter domain, so that the model output can be approximated as a piecewise
smooth function. The approach comprises an initialization phase, which uses
polynomial annihilation to assign function values to different regions and thus
seed an automated labeling procedure, followed by a refinement phase that
adaptively updates a kernel support vector machine representation of the
separating surface via active learning. The overall approach avoids structured
grids and exploits any available simplicity in the geometry of the separating
surface, thus reducing the number of model evaluations required to localize the
discontinuity. The method is illustrated on examples of up to eleven
dimensions, including algebraic models and ODE/PDE systems, and demonstrates
improved scaling and efficiency over other discontinuity localization
approaches.
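The initialization idea can be illustrated with a hypothetical one-dimensional sketch: a divided-difference jump indicator, in the spirit of polynomial annihilation, is O(h) where the function is smooth but O(1) across a jump, so the cell with the largest indicator flags the discontinuity. This is an illustrative assumption only; the paper's method is multivariate, adaptive, and refines a kernel SVM via active learning.

```python
def jump_indicator(f, xs):
    """Absolute first differences |f(x_{i+1}) - f(x_i)|: O(h) where f is
    smooth, O(1) across a jump -- a crude edge indicator."""
    vals = [f(x) for x in xs]
    return [abs(b - a) for a, b in zip(vals, vals[1:])]

def locate_jump(f, a, b, n=200):
    """Midpoint of the uniform-grid cell with the largest jump indicator."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    ind = jump_indicator(f, xs)
    i = max(range(len(ind)), key=ind.__getitem__)
    return 0.5 * (xs[i] + xs[i + 1])

# Piecewise-smooth test function with a unit jump at x = 0.3
f = lambda x: x * x if x < 0.3 else x * x + 1.0
x_jump = locate_jump(f, 0.0, 1.0)  # lands in the cell containing 0.3
```

Points labeled by which side of such a flagged cell they fall on are the kind of seed data an automated labeling procedure and a subsequent classifier refinement can start from.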
A continuous analogue of the tensor-train decomposition
We develop new approximation algorithms and data structures for representing
and computing with multivariate functions using the functional tensor-train
(FT), a continuous extension of the tensor-train (TT) decomposition. The FT
represents functions using a tensor-train ansatz by replacing the
three-dimensional TT cores with univariate matrix-valued functions. The main
contribution of this paper is a framework to compute the FT that employs
adaptive approximations of univariate fibers, and that is not tied to any
tensorized discretization. The algorithm can be coupled with any univariate
linear or nonlinear approximation procedure. We demonstrate that this approach
can generate multivariate function approximations that are several orders of
magnitude more accurate, for the same cost, than those based on the
conventional approach of compressing the coefficient tensor of a tensor-product
basis. Our approach is in the spirit of other continuous computation packages
such as Chebfun, and yields an algorithm which requires the computation of
"continuous" matrix factorizations such as the LU and QR decompositions of
vector-valued functions. To support these developments, we describe continuous
versions of an approximate maximum-volume cross approximation algorithm and of
a rounding algorithm that re-approximates an FT by one of lower ranks. We
demonstrate that our technique improves accuracy and robustness, compared to TT
and quantics-TT approaches with fixed parameterizations, of high-dimensional
integration, differentiation, and approximation of functions with local
features such as discontinuities and other nonlinearities.
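To make the ansatz concrete, here is a hypothetical two-dimensional sketch of a separated representation with matrix-valued univariate cores, f(x, y) = G1(x) G2(y); the rank-2 factorization of sin(x + y) is exact, and all names below are illustrative (the FT of the paper additionally involves adaptive fiber approximation, continuous cross approximation, and rounding):

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def ft_eval(cores, point):
    """Evaluate f(x1, ..., xd) = G1(x1) G2(x2) ... Gd(xd), where each
    core Gk maps a scalar to a matrix -- the TT ansatz with the
    three-dimensional cores replaced by matrix-valued functions."""
    M = cores[0](point[0])
    for core, x in zip(cores[1:], point[1:]):
        M = matmul(M, core(x))
    return M[0][0]

# Exact rank-2 separated form: sin(x + y) = sin(x)cos(y) + cos(x)sin(y)
cores = [
    lambda x: [[math.sin(x), math.cos(x)]],    # 1x2 matrix-valued core
    lambda y: [[math.cos(y)], [math.sin(y)]],  # 2x1 matrix-valued core
]
val = ft_eval(cores, (0.4, 0.9))  # equals sin(1.3) up to rounding
```

The point of the separated form is that storage and evaluation cost grow linearly in the number of variables for fixed ranks, rather than exponentially as for a full tensor-product basis.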
Bayesian System ID: Optimal management of parameter, model, and measurement uncertainty
We evaluate the robustness of a probabilistic formulation of system
identification (ID) to sparse, noisy, and indirect data. Specifically, we
compare estimators of future system behavior derived from the Bayesian
posterior of a learning problem to several commonly used least squares-based
optimization objectives used in system ID. Our comparisons indicate that the
log posterior has improved geometric properties compared with the objective
function surfaces of traditional methods that include differentially
constrained least squares and least squares reconstructions of discrete time
steppers like dynamic mode decomposition (DMD). These properties allow it to be
both more sensitive to new data and less affected by multiple minima ---
overall yielding a more robust approach. Our theoretical results indicate that
least squares and regularized least squares methods like dynamic mode
decomposition and sparse identification of nonlinear dynamics (SINDy) can be
derived from the probabilistic formulation by assuming noiseless measurements.
We also analyze the computational complexity of a Gaussian filter-based
approximate marginal Markov chain Monte Carlo (MCMC) scheme that we use to obtain the
Bayesian posterior for both linear and nonlinear problems. We then empirically
demonstrate that obtaining the marginal posterior of the parameter dynamics and
making predictions by extracting optimal estimators (e.g., mean, median, mode)
yields orders of magnitude improvement over the aforementioned approaches. We
attribute this performance to the fact that the Bayesian approach captures
parameter, model, and measurement uncertainties, whereas the other methods
typically neglect at least one type of uncertainty.
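The uncertainty-accounting point can be sketched with a hypothetical scalar example, x_{k+1} = a x_k: plain least squares ignores prior and noise structure, while a Gaussian-noise, Gaussian-prior treatment yields a posterior mean that shrinks the estimate (a ridge-type effect). This is a toy illustration, not the paper's Gaussian-filter marginal MCMC scheme:

```python
def ls_estimate(xs):
    """Least squares fit of a in x_{k+1} ~ a x_k."""
    num = sum(x1 * x0 for x0, x1 in zip(xs, xs[1:]))
    den = sum(x0 * x0 for x0 in xs[:-1])
    return num / den

def posterior_mean(xs, noise_var=0.1, prior_var=1.0):
    """Posterior mean of a under Gaussian measurement noise and a
    Gaussian prior N(0, prior_var) -- a shrunk, ridge-type estimator."""
    num = sum(x1 * x0 for x0, x1 in zip(xs, xs[1:])) / noise_var
    den = sum(x0 * x0 for x0 in xs[:-1]) / noise_var + 1.0 / prior_var
    return num / den

# Noiseless trajectory of x_{k+1} = 0.9 x_k
xs = [1.0]
for _ in range(20):
    xs.append(0.9 * xs[-1])
a_ls = ls_estimate(xs)        # recovers a = 0.9 on clean data
a_bayes = posterior_mean(xs)  # pulled slightly toward the prior mean 0
```

On clean data the two estimators nearly agree; the Bayesian one trades a small bias for robustness once noise variances and priors are calibrated to the data-generating process.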
Robust identification of non-autonomous dynamical systems using stochastic dynamics models
This paper considers the problem of system identification (ID) of linear and
nonlinear non-autonomous systems from noisy and sparse data. We propose and
analyze an objective function derived from a Bayesian formulation for learning
a hidden Markov model with stochastic dynamics. We then analyze this objective
function in the context of several state-of-the-art approaches for both linear
and nonlinear system ID. In the former, we analyze least squares approaches for
Markov parameter estimation, and in the latter, we analyze the multiple
shooting approach. We demonstrate the limitations of the optimization problems
posed by these existing methods by showing that they can be seen as special
cases of the proposed optimization objective under certain simplifying
assumptions: conditional independence of data and zero model error.
Furthermore, we observe that our proposed approach has improved smoothness and
inherent regularization that make it well-suited for system ID and provide
mathematical explanations for these characteristics' origins. Finally,
numerical simulations demonstrate a mean squared error more than 8.7 times lower
than that of multiple shooting when data are noisy and/or sparse. Moreover, the
proposed approach can identify accurate and generalizable models even when
there are more parameters than data or when the underlying system exhibits
chaotic behavior.
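The special-case claim (conditional independence of data and zero model error) can be sketched with a hypothetical objective of the hidden-Markov form: a data-misfit term plus a model-error penalty on the latent states, where taking the model-error variance to zero enforces x_{k+1} = f(x_k) exactly and recovers a multiple-shooting-style least squares. The notation below is assumed for illustration, not taken from the paper:

```python
def hmm_objective(states, data, step, meas_var=0.01, model_var=0.1):
    """Negative-log-posterior-style objective (up to constants):
    measurement misfit plus a stochastic-dynamics (model-error) penalty.
    As model_var -> 0 the penalty enforces states[k+1] == step(states[k]),
    i.e. a deterministic multiple-shooting-style constraint."""
    misfit = sum((y - x) ** 2 for x, y in zip(states, data)) / meas_var
    model = sum((x1 - step(x0)) ** 2
                for x0, x1 in zip(states, states[1:])) / model_var
    return misfit + model

# Zero on a trajectory that matches both the data and the model exactly
step = lambda x: 0.9 * x
traj = [1.0]
for _ in range(5):
    traj.append(step(traj[-1]))
J0 = hmm_objective(traj, traj, step)                     # exactly 0.0
J1 = hmm_objective(traj, [t + 0.1 for t in traj], step)  # > 0
```

Retaining a finite model-error variance is what smooths the objective surface: trajectories that deviate slightly from the deterministic dynamics are penalized rather than forbidden.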