Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck operators
This paper presents a diffusion based probabilistic interpretation of
spectral clustering and dimensionality reduction algorithms that use the
eigenvectors of the normalized graph Laplacian. Given the pairwise adjacency
matrix of all points, we define a diffusion distance between any two data
points and show that the low dimensional representation of the data by the
first few eigenvectors of the corresponding Markov matrix is optimal under a
certain mean squared error criterion. Furthermore, assuming that data points
are random samples from a density p(x) = e^{-U(x)}, we identify these
eigenvectors as discrete approximations of eigenfunctions of a Fokker-Planck
operator in a potential 2U(x) with reflecting boundary conditions. Finally,
applying known results regarding the eigenvalues and eigenfunctions of the
continuous Fokker-Planck operator, we provide a mathematical justification for
the success of spectral clustering and dimensional reduction algorithms based
on these first few eigenvectors. This analysis elucidates, in terms of the
characteristics of diffusion processes, many empirical findings regarding
spectral clustering algorithms.
Comment: submitted to NIPS 200
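The construction described in this abstract can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' code: it assumes a Gaussian kernel with a hypothetical bandwidth `epsilon` as the pairwise adjacency matrix, row-normalizes it into a Markov matrix, and returns the first few nontrivial eigenvalues and eigenvectors. The function name `diffusion_map` and its parameters are made up for the example.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords=2):
    """Leading nontrivial eigenpairs of the Markov matrix M = D^{-1} K.

    X        : (n_points, n_features) data array
    epsilon  : kernel bandwidth (assumed Gaussian kernel, hypothetical parameter)
    n_coords : number of nontrivial eigenvectors to return
    """
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian adjacency matrix and node degrees.
    K = np.exp(-d2 / epsilon)
    deg = K.sum(axis=1)

    # M = D^{-1} K is not symmetric, but S = D^{-1/2} K D^{-1/2} is,
    # and it has the same eigenvalues as M.
    S = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)

    # Sort eigenvalues in decreasing order; the first one equals 1.
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # If S v = lambda v, then psi = D^{-1/2} v is a right eigenvector of M.
    psi = vecs / np.sqrt(deg)[:, None]

    # Drop the trivial constant eigenvector.
    return vals[1:n_coords + 1], psi[:, 1:n_coords + 1]
```

On data made of two well-separated groups, the sign of the first returned eigenvector typically separates the groups, which is the behaviour spectral clustering relies on.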
Generalised Ornstein-Uhlenbeck processes
We solve a physically significant extension of a classic problem in the
theory of diffusion, namely the Ornstein-Uhlenbeck process [G. E. Ornstein and
L. S. Uhlenbeck, Phys. Rev. 36, 823, (1930)]. Our generalised
Ornstein-Uhlenbeck systems include a force which depends upon the position of
the particle, as well as upon time. They exhibit anomalous diffusion at short
times, and non-Maxwellian velocity distributions in equilibrium. Two approaches
are used. Some statistics are obtained from a closed-form expression for the
propagator of the Fokker-Planck equation for the case where the particle is
initially at rest. In the general case we use spectral decomposition of a
Fokker-Planck equation, employing nonlinear creation and annihilation operators
to generate the spectrum, which consists of two staggered ladders.
Comment: 24 pages, 2 figures
Nonparametric Uncertainty Quantification for Stochastic Gradient Flows
This paper presents a nonparametric statistical modeling method for
quantifying uncertainty in stochastic gradient systems with isotropic
diffusion. The central idea is to apply the diffusion maps algorithm to a
training data set to produce a stochastic matrix whose generator is a discrete
approximation to the backward Kolmogorov operator of the underlying dynamics.
The eigenvectors of this stochastic matrix, which we will refer to as the
diffusion coordinates, are discrete approximations to the eigenfunctions of the
Kolmogorov operator and form an orthonormal basis for functions defined on the
data set. Using this basis, we consider the projection of three uncertainty
quantification (UQ) problems (prediction, filtering, and response) into the
diffusion coordinates. In these coordinates, the nonlinear prediction and
response problems reduce to solving systems of infinite-dimensional linear
ordinary differential equations. Similarly, the continuous-time nonlinear
filtering problem reduces to solving a system of infinite-dimensional linear
stochastic differential equations. Solving the UQ problems then reduces to
solving the corresponding truncated linear systems in finitely many diffusion
coordinates. By solving these systems we give a model-free algorithm for UQ on
gradient flow systems with isotropic diffusion. We numerically verify these
algorithms on a 1-dimensional linear gradient flow system where the analytic
solutions of the UQ problems are known. We also apply the algorithm to a
chaotically forced nonlinear gradient flow system which is known to be well
approximated as a stochastically forced gradient flow.
Comment: Find the associated videos at: http://personal.psu.edu/thb11
Data-driven model reduction and transfer operator approximation
In this review paper, we will present different data-driven dimension
reduction techniques for dynamical systems that are based on transfer operator
theory as well as methods to approximate transfer operators and their
eigenvalues, eigenfunctions, and eigenmodes. The goal is to point out
similarities and differences between methods developed independently by the
dynamical systems, fluid dynamics, and molecular dynamics communities such as
time-lagged independent component analysis (TICA), dynamic mode decomposition
(DMD), and their respective generalizations. As a result, extensions and best
practices developed for one particular method can be carried over to other
related methods.
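As a concrete instance of one method named in this abstract, the following is a minimal sketch of the standard "exact DMD" algorithm in NumPy. It is not taken from the review itself; the function name `exact_dmd` and the snapshot-pair layout (columns of `X` mapped to columns of `Y` by one time step) are assumptions made for illustration.

```python
import numpy as np

def exact_dmd(X, Y, rank):
    """Approximate eigenvalues and modes of a linear map A with Y ~ A X.

    X, Y : (n_features, n_snapshots) arrays; column k of Y is the state
           one time step after column k of X (assumed layout).
    rank : truncation rank of the SVD of X.
    """
    # Truncated singular value decomposition X ~ U diag(s) Vh.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    V = Vh.conj().T

    # Projection of A onto the leading left singular vectors:
    # A_tilde = U^* Y V S^{-1}. Dividing by s scales the columns.
    A_tilde = U.conj().T @ Y @ V / s

    # Eigenvalues of the reduced operator approximate those of A.
    eigvals, W = np.linalg.eig(A_tilde)

    # Exact DMD modes lift the reduced eigenvectors back to state space.
    modes = Y @ V / s @ W
    return eigvals, modes
```

For snapshots generated by an exactly linear system and a rank equal to the state dimension, the returned eigenvalues coincide with those of the generating matrix up to rounding.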
Variable-free exploration of stochastic models: a gene regulatory network example
Finding coarse-grained, low-dimensional descriptions is an important task in
the analysis of complex, stochastic models of gene regulatory networks. This
task involves (a) identifying observables that best describe the state of these
complex systems and (b) characterizing the dynamics of the observables. In a
previous paper [13], we assumed that good observables were known a priori, and
presented an equation-free approach to approximate coarse-grained quantities
(i.e., effective drift and diffusion coefficients) that characterize the
long-time behavior of the observables. Here we use diffusion maps [9] to
extract appropriate observables ("reduction coordinates") in an automated
fashion; these involve the leading eigenvectors of a weighted Laplacian on a
graph constructed from network simulation data. We present lifting and
restriction procedures for translating between physical variables and these
data-based observables. These procedures allow us to perform equation-free,
coarse-grained computations characterizing the long-term dynamics through the
design and processing of short bursts of stochastic simulation initialized at
appropriate values of the data-based observables.
Comment: 26 pages, 9 figures
Transition manifolds of complex metastable systems: Theory and data-driven computation of effective dynamics
We consider complex dynamical systems showing metastable behavior but no
local separation of fast and slow time scales. The article raises the question
of whether such systems exhibit a low-dimensional manifold supporting their
effective dynamics. For answering this question, we aim at finding nonlinear
coordinates, called reaction coordinates, such that the projection of the
dynamics onto these coordinates preserves the dominant time scales of the
dynamics. We show that, based on a specific reducibility property, the
existence of good low-dimensional reaction coordinates preserving the dominant
time scales is guaranteed. Based on this theoretical framework, we develop and
test a novel numerical approach for computing good reaction coordinates. The
proposed algorithmic approach is fully local and thus not prone to the curse of
dimension with respect to the state space of the dynamics. Hence, it is a
promising method for data-based model reduction of complex dynamical systems
such as molecular dynamics.
Diffusion maps tailored to arbitrary non-degenerate Ito processes
We present two generalizations of the popular diffusion maps algorithm. The first generalization replaces the drift term in diffusion maps, which is the gradient of the sampling density, with the gradient of an arbitrary density of interest which is known up to a normalization constant. The second generalization allows for a diffusion-map-type approximation of the forward and backward generators of general Ito diffusions with given drift and diffusion coefficients. We use the local kernels introduced by Berry and Sauer, but allow for arbitrary sampling densities. We provide numerical illustrations to demonstrate that this opens up many new applications for diffusion maps as a tool to organize point cloud data, including biased or corrupted samples, dimension reduction for dynamical systems, detection of almost invariant regions in flow fields, and importance sampling.
Coarse-grained dynamics of an activity bump in a neural field model
We study a stochastic nonlocal PDE, arising in the context of modelling
spatially distributed neural activity, which is capable of sustaining
stationary and moving spatially-localized ``activity bumps''. This system is
known to undergo a pitchfork bifurcation in bump speed as a parameter (the
strength of adaptation) is changed; yet increasing the noise intensity
effectively slowed the motion of the bump. Here we revisit the system from the
point of view of describing the high-dimensional stochastic dynamics in terms
of the effective dynamics of a single scalar "coarse" variable. We show that
such a reduced description in the form of an effective Langevin equation
characterized by a double-well potential is quantitatively successful. The
effective potential can be extracted using short, appropriately-initialized
bursts of direct simulation. We demonstrate this approach in terms of (a) an
experience-based "intelligent" choice of the coarse observable and (b) an
observable obtained through data-mining direct simulation results, using a
diffusion map approach.
Comment: Corrected acknowledgement
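To make the idea of an effective Langevin equation with a double-well potential concrete, here is a minimal Euler-Maruyama sketch. It does not reproduce the paper's extracted potential: the quartic potential V(x) = x^4/4 - x^2/2, the function name `simulate_langevin`, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np

def simulate_langevin(x0, n_steps, dt, noise, seed=0):
    """Euler-Maruyama path of dx = -V'(x) dt + sqrt(2 * noise) dW.

    Assumes the hypothetical double-well potential V(x) = x**4/4 - x**2/2,
    whose minima sit at x = -1 and x = +1.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        drift = x[k] - x[k] ** 3  # equals -V'(x) for the assumed potential
        kick = np.sqrt(2.0 * noise * dt) * rng.standard_normal()
        x[k + 1] = x[k] + drift * dt + kick
    return x
```

With `noise=0` the path relaxes to the nearest well; with a positive noise level it can occasionally hop between the two wells, which is the qualitative behaviour a scalar coarse variable of this kind is meant to capture.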
Geometrical Methods for the Analysis of Simulation Bundles
Efficiently analyzing large amounts of high-dimensional data derived from the simulation of industrial products is the challenge addressed in this thesis. For this purpose, simulations are considered as abstract objects and assumed to live in a lower-dimensional space. The aim of this thesis is to characterize and analyze these simulations, which is done by examining two different approaches. The first takes the perspective of manifold learning, using diffusion maps and demonstrating their application and merits; the inherent assumption of manifold learning is that high-dimensional data can be considered to lie on a low-dimensional abstract manifold. Unfortunately, this cannot be verified in practical applications, as it would require several thousand datasets, whereas in reality only a few hundred are available due to computational costs. To overcome these restrictions, a new way of characterizing the set of simulations is proposed, in which it is assumed that transformations send simulations to other simulations. Under this assumption, the theoretical framework of shape spaces can be applied, wherein a quotient space of a pre-shape space (the space of simulation shapes) modulo a transformation group is used. It is proposed to add to this setting the construction of positive definite operators that are assumed invariant to specific transformations. They are built using only one simulation and, as a consequence, all other simulations can be projected onto the eigenbasis of these operators. A new representation of all simulations is thus obtained based on the projection coefficients, in much the same way as the Fourier transform is used. The new representation is shown to be significantly reduced, depending on the smoothness of the data.
Several industrial applications for time-dependent datasets from engineering simulations are provided to demonstrate the usefulness of the method and put forward several research directions and possible new applications.