
    Relatively Coherent Sets as a Hierarchical Partition Method

    Finite-time coherent sets [8] have recently been defined by a measure-based objective function describing the degree to which sets hold together, along with a Frobenius-Perron transfer operator method to produce optimally coherent sets. Here we present an extension that generalizes the concept to hierarchically defined relatively coherent sets, based on adjusting the finite-time coherent sets to use relative measure restricted to sets that are developed iteratively and hierarchically in a tree of partitions. Several examples help clarify the meaning and expectation of the techniques: the nonautonomous double gyre, the standard map, an idealized stratospheric flow, and empirical data from the Gulf of Mexico during the 2010 oil spill. We also include an appendix on the computational complexity of developing the Ulam-Galerkin matrix estimates of the Frobenius-Perron operator centrally used here.
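
As a rough illustration of what an Ulam-Galerkin matrix estimate looks like, the sketch below builds one for a one-dimensional map on [0, 1) by sampling points in each box and counting where they land. It is a toy under stated assumptions, not the paper's procedure: the function names `ulam_matrix` and `doubling_map`, the box count, and the sample size are all hypothetical choices made for this example.

```python
import numpy as np

def ulam_matrix(flow_map, n_boxes, samples_per_box=200, seed=0):
    """Estimate a row-stochastic Ulam-Galerkin matrix on [0, 1).

    Hypothetical helper for illustration: entry (i, j) approximates the
    fraction of box i that flow_map sends into box j.
    """
    rng = np.random.default_rng(seed)
    P = np.zeros((n_boxes, n_boxes))
    for i in range(n_boxes):
        # Sample points uniformly inside box i.
        x = (i + rng.random(samples_per_box)) / n_boxes
        # Push the samples forward and wrap them back into [0, 1).
        y = flow_map(x) % 1.0
        # Index of the box each image point lands in.
        j = np.minimum((y * n_boxes).astype(int), n_boxes - 1)
        # Row i holds the fraction of box i mapped into each box.
        P[i] = np.bincount(j, minlength=n_boxes) / samples_per_box
    return P

def doubling_map(x):
    # Simple test map, assumed here only as an example.
    return 2.0 * x

P = ulam_matrix(doubling_map, n_boxes=8)
```

The cost of this estimate grows with the number of boxes times the number of samples per box, which is the kind of trade-off a complexity analysis of the matrix construction has to account for.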

    Identifying Finite-Time Coherent Sets from Limited Quantities of Lagrangian Data

    A data-driven procedure for identifying the dominant transport barriers in a time-varying flow from limited quantities of Lagrangian data is presented. Our approach partitions state space into pairs of coherent sets, which are sets of initial conditions chosen to minimize the number of trajectories that "leak" from one set to the other under the influence of a stochastic flow field during a pre-specified interval in time. In practice, this partition is computed by posing an optimization problem which, once solved, yields a pair of functions whose signs determine set membership. From prior experience with synthetic, "data rich" test problems and conceptually related methods based on approximations of the Perron-Frobenius operator, we observe that the functions of interest typically appear to be smooth. As a result, given a fixed amount of data, our approach, which can use sets of globally supported basis functions, has the potential to approximate the desired functions more accurately than other methods tailored to use compactly supported indicator functions. This difference enables our approach to produce effective approximations of pairs of coherent sets in problems with relatively limited quantities of Lagrangian data, which is usually the case with real geophysical data. We apply this method to three examples of increasing complexity: the first is the double gyre, the second is the Bickley Jet, and the third is data from numerically simulated drifters in the Sulu Sea. Comment: 14 pages, 7 figures

    Information based clustering

    In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here we reformulate the clustering problem from an information theoretic perspective which avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "prototype", does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures non-linear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures. Comment: To appear in Proceedings of the National Academy of Sciences USA, 11 pages, 9 figures

    Generalized Species Sampling Priors with Latent Beta reinforcements

    Many popular Bayesian nonparametric priors can be characterized in terms of exchangeable species sampling sequences. However, in some applications, exchangeability may not be appropriate. We introduce a novel and probabilistically coherent family of non-exchangeable species sampling sequences characterized by a tractable predictive probability function with weights driven by a sequence of independent Beta random variables. We compare their theoretical clustering properties with those of the Dirichlet process and the two-parameter Poisson-Dirichlet process. Unlike existing work, the proposed construction provides a complete characterization of the joint process. We then propose the use of such a process as a prior distribution in a hierarchical Bayes modeling framework, and we describe a Markov chain Monte Carlo sampler for posterior inference. We evaluate the performance of the prior and the robustness of the resulting inference in a simulation study, providing a comparison with popular Dirichlet process mixtures and hidden Markov models. Finally, we develop an application to the detection of chromosomal aberrations in breast cancer by leveraging array CGH data. Comment: For correspondence purposes, Edoardo M. Airoldi's email is [email protected]; Federico Bassetti's email is [email protected]; Michele Guindani's email is [email protected]; Fabrizio Leisen's email is [email protected]. To appear in the Journal of the American Statistical Association

    Geometry of the ergodic quotient reveals coherent structures in flows

    Dynamical systems that exhibit diverse behaviors can rarely be completely understood using a single approach. However, by identifying coherent structures in their state spaces, i.e., regions of uniform and simpler behavior, we could hope to study each of the structures separately and then form an understanding of the system as a whole. The method we present in this paper uses trajectory averages of scalar functions on the state space to: (a) identify invariant sets in the state space, (b) form coherent structures by aggregating invariant sets that are similar across multiple spatial scales. First, we construct the ergodic quotient, the object obtained by mapping trajectories to the space of trajectory averages of a function basis on the state space. Second, we endow the ergodic quotient with a metric structure that successfully captures how similar the invariant sets are in the state space. Finally, we parametrize the ergodic quotient using intrinsic diffusion modes on it. By segmenting the ergodic quotient based on the diffusion modes, we extract coherent features in the state space of the dynamical system. The algorithm is validated by analyzing the Arnold-Beltrami-Childress flow, which was the test-bed for alternative approaches: Ulam's approximation of the transfer operator and the computation of Lagrangian Coherent Structures. Furthermore, we explain how the method extends the Poincaré map analysis for periodic flows. As a demonstration, we apply the method to a periodically-driven three-dimensional Hill's vortex flow, discovering unknown coherent structures in its state space. In the end, we discuss differences between the ergodic quotient and alternatives, propose a generalization to analysis of (quasi-)periodic structures, and lay out future research directions. Comment: Submitted to Elsevier Physica D: Nonlinear Phenomena
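
The first step described above, mapping trajectories to averages of a function basis, can be sketched in a few lines. This is only a minimal illustration under assumptions not taken from the paper: it uses the standard map on the unit torus as the dynamics, a handful of Fourier modes as the function basis, and hypothetical names (`standard_map`, `trajectory_averages`, `wavevectors`). The metric and diffusion-mode steps are not shown.

```python
import numpy as np

def standard_map(x, y, K):
    """One step of the Chirikov standard map on the unit torus."""
    y_new = (y + K / (2 * np.pi) * np.sin(2 * np.pi * x)) % 1.0
    x_new = (x + y_new) % 1.0
    return x_new, y_new

def trajectory_averages(x0, y0, wavevectors, n_steps=2000, K=0.0):
    """Time-average Fourier modes exp(2*pi*i*(kx*x + ky*y)) along trajectories.

    Returns one row per initial condition and one column per wavevector.
    Rows that nearly coincide suggest initial conditions on the same
    invariant set.
    """
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    sums = np.zeros((x.size, len(wavevectors)), dtype=complex)
    for _ in range(n_steps):
        # Accumulate each basis function at the current trajectory points.
        for m, (kx, ky) in enumerate(wavevectors):
            sums[:, m] += np.exp(2j * np.pi * (kx * x + ky * y))
        x, y = standard_map(x, y, K)
    return sums / n_steps

# With K = 0 every circle y = const is invariant, so the first two
# initial conditions share an invariant set and the third does not.
y_irr = np.sqrt(2.0) - 1.0
A = trajectory_averages([0.1, 0.6, 0.1], [y_irr, y_irr, 0.2],
                        wavevectors=[(0, 1), (1, 0), (1, 1)])
```

Comparing rows of `A` is a crude stand-in for the metric on the ergodic quotient; a real analysis would use many more basis functions and a scale-aware distance.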

    Techniques for clustering gene expression data

    Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of suitable method(s) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.