Relatively Coherent Sets as a Hierarchical Partition Method
Finite time coherent sets [8] have recently been defined by a measure based
objective function describing the degree that sets hold together, along with a
Frobenius-Perron transfer operator method to produce optimally coherent sets.
Here we present an extension that generalizes the concept to hierarchically defined
relatively coherent sets, based on adjusting the finite time coherent sets to
use relative measure restricted to sets which are developed iteratively and
hierarchically in a tree of partitions. Several examples help clarify the
meaning and expectation of the techniques: the nonautonomous double
gyre, the standard map, an idealized stratospheric flow, and empirical data
from the Gulf of Mexico during the 2010 oil spill. For the sake of analyzing
computational complexity, we also include an appendix on the
complexity of developing the Ulam-Galerkin matrix estimates of the
Frobenius-Perron operator centrally used here.
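The Ulam-Galerkin estimate mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the one-dimensional doubling map, the unit interval, and the names `ulam_matrix` and `doubling_map` are assumptions chosen only to keep the example small. Each box is seeded with evenly spaced sample points, and entry (i, j) is the fraction of points from box i that land in box j.

```python
def doubling_map(x):
    """Toy dynamics (an assumption for illustration): x -> 2x on the unit interval."""
    return 2.0 * x


def ulam_matrix(flow_map, n_boxes, samples_per_box):
    """Row-stochastic Ulam-Galerkin estimate on [0, 1) split into equal boxes."""
    counts = [[0] * n_boxes for _ in range(n_boxes)]
    for i in range(n_boxes):
        for j in range(samples_per_box):
            # Evenly spaced sample point strictly inside box i.
            x = (i + (j + 0.5) / samples_per_box) / n_boxes
            y = flow_map(x) % 1.0
            # min() guards against a rounding artifact at the right edge.
            target = min(int(y * n_boxes), n_boxes - 1)
            counts[i][target] += 1
    # Normalize each row so it holds transition fractions.
    return [[c / samples_per_box for c in row] for row in counts]


if __name__ == "__main__":
    for row in ulam_matrix(doubling_map, 4, 10):
        print(row)
```

The cost of this construction grows with the number of boxes times the number of sample points per box, which is the kind of count a complexity analysis of the matrix estimate would track.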
Identifying Finite-Time Coherent Sets from Limited Quantities of Lagrangian Data
A data-driven procedure for identifying the dominant transport barriers in a
time-varying flow from limited quantities of Lagrangian data is presented. Our
approach partitions state space into pairs of coherent sets, which are sets of
initial conditions chosen to minimize the number of trajectories that "leak"
from one set to the other under the influence of a stochastic flow field during
a pre-specified interval in time. In practice, this partition is computed by
posing an optimization problem, which once solved, yields a pair of functions
whose signs determine set membership. From prior experience with synthetic,
"data rich" test problems and conceptually related methods based on
approximations of the Perron-Frobenius operator, we observe that the functions
of interest typically appear to be smooth. As a result, given a fixed amount of
data, our approach, which can use sets of globally supported basis functions,
has the potential to more accurately approximate the desired functions than
other methods tailored to use compactly supported indicator functions. This
difference enables our approach to produce effective approximations of pairs of
coherent sets in problems with relatively limited quantities of Lagrangian
data, which is usually the case with real geophysical data. We apply this
method to three examples of increasing complexity: the first is the double
gyre, the second is the Bickley Jet, and the third is data from numerically
simulated drifters in the Sulu Sea.
Comment: 14 pages, 7 figures
Information based clustering
In an age of increasingly large data sets, investigators in many different
disciplines have turned to clustering as a tool for data analysis and
exploration. Existing clustering methods, however, typically depend on several
nontrivial assumptions about the structure of data. Here we reformulate the
clustering problem from an information theoretic perspective which avoids many
of these assumptions. In particular, our formulation obviates the need for
defining a cluster "prototype", does not require an a priori similarity metric,
is invariant to changes in the representation of the data, and naturally
captures non-linear relations. We apply this approach to different domains and
find that it consistently produces clusters that are more coherent than those
extracted by existing algorithms. Finally, our approach provides a way of
clustering based on collective notions of similarity rather than the
traditional pairwise measures.
Comment: To appear in Proceedings of the National Academy of Sciences USA, 11
pages, 9 figures
Generalized Species Sampling Priors with Latent Beta reinforcements
Many popular Bayesian nonparametric priors can be characterized in terms of
exchangeable species sampling sequences. However, in some applications,
exchangeability may not be appropriate. We introduce a novel and
probabilistically coherent family of non-exchangeable species sampling
sequences characterized by a tractable predictive probability function with
weights driven by a sequence of independent Beta random variables. We compare
their theoretical clustering properties with those of the Dirichlet Process and
the two-parameter Poisson-Dirichlet process. Unlike existing work, the proposed
construction provides a complete characterization of the joint process.
We then propose the use of such a process as a prior distribution in
a hierarchical Bayes modeling framework, and we describe a Markov Chain Monte
Carlo sampler for posterior inference. We evaluate the performance of the prior
and the robustness of the resulting inference in a simulation study, providing
a comparison with popular Dirichlet process mixtures and Hidden Markov
Models. Finally, we develop an application to the detection of chromosomal
aberrations in breast cancer by leveraging array CGH data.
Comment: Corresponding authors: Edoardo M. Airoldi, Federico Bassetti,
Michele Guindani, and Fabrizio Leisen. To appear in the Journal of the American
Statistical Association
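For context, the two exchangeable baselines named in this abstract have well-known predictive rules, sketched below. This is not the paper's Beta-driven construction; the function name `predictive_probabilities` and the parameter names `theta` (concentration) and `sigma` (discount) are assumptions for illustration. With `sigma = 0` the rule reduces to the Dirichlet process case.

```python
def predictive_probabilities(cluster_sizes, theta, sigma=0.0):
    """Predictive rule of the two-parameter Poisson-Dirichlet process.

    Returns one probability per existing cluster, followed by the
    probability that the next observation starts a new cluster.
    Assumes 0 <= sigma < 1 and theta > -sigma.
    """
    n = sum(cluster_sizes)
    k = len(cluster_sizes)
    # An existing cluster is chosen in proportion to (size - sigma).
    existing = [(size - sigma) / (theta + n) for size in cluster_sizes]
    # A new cluster is opened with the remaining mass.
    new = (theta + k * sigma) / (theta + n)
    return existing + [new]


if __name__ == "__main__":
    print(predictive_probabilities([2, 1], theta=1.0))             # Dirichlet process
    print(predictive_probabilities([2, 1], theta=1.0, sigma=0.5))  # with discount
```

In both baselines the weights depend only on the cluster sizes, which is what exchangeability implies; the abstract's proposal replaces them with weights driven by independent Beta random variables.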
Geometry of the ergodic quotient reveals coherent structures in flows
Dynamical systems that exhibit diverse behaviors can rarely be completely
understood using a single approach. However, by identifying coherent structures
in their state spaces, i.e., regions of uniform and simpler behavior, we could
hope to study each of the structures separately and then form an understanding
of the system as a whole. The method we present in this paper uses trajectory
averages of scalar functions on the state space to: (a) identify invariant sets
in the state space, (b) form coherent structures by aggregating invariant sets
that are similar across multiple spatial scales. First, we construct the
ergodic quotient, the object obtained by mapping trajectories to the space of
trajectory averages of a function basis on the state space. Second, we endow
the ergodic quotient with a metric structure that successfully captures how
similar the invariant sets are in the state space. Finally, we parametrize the
ergodic quotient using intrinsic diffusion modes on it. By segmenting the
ergodic quotient based on the diffusion modes, we extract coherent features in
the state space of the dynamical system. The algorithm is validated by
analyzing the Arnold-Beltrami-Childress flow, which was the test-bed for
alternative approaches: Ulam's approximation of the transfer operator and
the computation of Lagrangian Coherent Structures. Furthermore, we explain how
the method extends the Poincaré map analysis for periodic flows. As a
demonstration, we apply the method to a periodically-driven three-dimensional
Hill's vortex flow, discovering unknown coherent structures in its state space.
In the end, we discuss differences between the ergodic quotient and
alternatives, propose a generalization to analysis of (quasi-)periodic
structures, and lay out future research directions.
Comment: Submitted to Elsevier Physica D: Nonlinear Phenomena
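The first step described in this abstract, mapping trajectories to averages of a function basis, can be sketched on a toy system. Everything below is an assumption made for illustration and is not taken from the paper: the map `toy_map` (two irrational rotations acting on the invariant intervals [0, 0.5) and [0.5, 1)), the two Fourier observables, and the helper name `trajectory_averages`.

```python
import math

# Rotation step: the golden-mean fraction of each half-interval (an assumption).
STEP = 0.5 * (math.sqrt(5) - 1) / 2


def toy_map(x):
    """Toy dynamics with two invariant sets: [0, 0.5) and [0.5, 1)."""
    if x < 0.5:
        return (x + STEP) % 0.5
    return 0.5 + (x - 0.5 + STEP) % 0.5


OBSERVABLES = [
    lambda x: math.sin(2 * math.pi * x),
    lambda x: math.cos(2 * math.pi * x),
]


def trajectory_averages(x0, observables, n_steps):
    """Time averages of each observable along the trajectory started at x0."""
    sums = [0.0] * len(observables)
    x = x0
    for _ in range(n_steps):
        for k, f in enumerate(observables):
            sums[k] += f(x)
        x = toy_map(x)
    return [s / n_steps for s in sums]


if __name__ == "__main__":
    for x0 in (0.1, 0.4, 0.7):
        print(x0, trajectory_averages(x0, OBSERVABLES, 10000))
```

Initial conditions in the same invariant set produce nearly identical vectors of averages, while those in different sets are separated, so grouping points by these vectors recovers the invariant sets in this toy case. The metric and diffusion-mode steps described in the abstract are not attempted here.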
Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.