    Dynamic Tensor Clustering

    Dynamic tensor data are becoming prevalent in numerous applications. Existing tensor clustering methods either fail to account for the dynamic nature of the data, or are inapplicable to a general-order tensor. There is also often a gap between statistical guarantee and computational efficiency for existing tensor clustering solutions. In this article, we aim to bridge this gap by proposing a new dynamic tensor clustering method, which takes into account both sparsity and fusion structures, and enjoys strong statistical guarantees as well as high computational efficiency. Our proposal is based upon a new structured tensor factorization that encourages both sparsity and smoothness in parameters along the specified tensor modes. Computationally, we develop a highly efficient optimization algorithm that benefits from substantial dimension reduction. In theory, we first establish a non-asymptotic error bound for the estimator from the structured tensor factorization. Built upon this error bound, we then derive the rate of convergence of the estimated cluster centers, and show that the estimated clusters recover the true cluster structures with high probability. Moreover, our proposed method can be naturally extended to co-clustering of multiple modes of the tensor data. The efficacy of our approach is illustrated via simulations and a brain dynamic functional connectivity analysis from an autism spectrum disorder study. Comment: Accepted at Journal of the American Statistical Association

    Isotropic Dynamic Hierarchical Clustering

    We face the need to discover a pattern in the locations of a great number of points in a high-dimensional space. The goal is to group close points together. We are interested in a hierarchical structure, like a B-tree. B-trees are hierarchical and balanced, and they can be constructed dynamically. The B-tree approach makes it possible to determine the structure without any supervised learning or a priori knowledge. The space is Euclidean and isotropic. Unfortunately, there are no B-tree implementations that process indices in a symmetrical and isotropic way. Some implementations construct compound asymmetrical indices from point coordinates, and others split the nodes along the coordinate hyperplanes. We need to process tens of millions of points in a thousand-dimensional space, so the application has to be scalable. Ideally, a cluster should be an ellipsoid, but that would require storing O(n^2) ellipse axes. We therefore use multi-dimensional balls defined by their centers and radii. Statistical values such as the mean and the average deviation can be calculated incrementally: when a point is added to the tree, the statistical values for the nodes are recalculated in O(1) time. We support both brute-force O(2^n) and greedy O(n^2) split algorithms. Statistical and aggregated node information also makes it possible to manipulate (search, delete) aggregated sets of closely located points. The structure also supports hierarchical information retrieval: when searching, the user is provided with the highest appropriate nodes in the tree hierarchy, with the most important clusters emerging in the hierarchy automatically. Then, if interested, the user may navigate down the tree to more specific points. The system is implemented as a library of Java classes representing Points, Sets of points with aggregated statistical information, B-tree, and Nodes, with support for serialization and storage in a MySQL database. Comment: 6 pages with 3 examples
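The incremental update of node statistics described in this abstract can be illustrated with a minimal sketch. It is not the authors' Java library: the class name `BallStats` and its methods are hypothetical, and the root-mean-square distance from the center is used here as an assumed stand-in for the "average deviation" the abstract mentions. Each insertion touches only the running sums, so its cost does not depend on how many points the node already holds (it is still proportional to the dimension).

```python
import math


class BallStats:
    """Running aggregate for one tree node (hypothetical name, illustrative only)."""

    def __init__(self, dim):
        self.count = 0
        self.total = [0.0] * dim   # per-coordinate sum of all points
        self.total_sq = 0.0        # sum of squared norms of all points

    def add(self, point):
        # Update the running sums; no previously stored point is revisited.
        self.count += 1
        for i, x in enumerate(point):
            self.total[i] += x
        self.total_sq += sum(x * x for x in point)

    def center(self):
        # Mean of the points added so far.
        return [s / self.count for s in self.total]

    def rms_radius(self):
        # sqrt(E|x|^2 - |mean|^2): RMS distance from the center
        # (an assumed substitute for the abstract's "average deviation").
        c = self.center()
        variance = self.total_sq / self.count - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))


stats = BallStats(dim=2)
stats.add([0.0, 0.0])
stats.add([2.0, 0.0])
print(stats.center(), stats.rms_radius())
```

Keeping only sums makes a node's ball cheap to maintain, at the price of some numerical cancellation when the radius is small relative to the distance from the origin.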

    An Ensemble Framework for Detecting Community Changes in Dynamic Networks

    Dynamic networks, especially those representing social networks, undergo constant evolution of their community structure over time. Nodes can migrate between different communities, communities can split into multiple new communities, communities can merge together, etc. In order to represent dynamic networks with evolving communities it is essential to use a dynamic model rather than a static one. Here we use a dynamic stochastic block model where the underlying block model is different at different times. In order to represent the structural changes expressed by this dynamic model, the network is split into discrete time segments and a clustering algorithm assigns block memberships for each segment. In this paper we show that using an ensemble of clustering assignments accommodates the variance in scalable clustering algorithms and produces superior results in terms of pairwise precision and pairwise recall. We also demonstrate that the dynamic clustering produced by the ensemble can be visualized as a flowchart which encapsulates the community evolution succinctly. Comment: 6 pages, under submission to HPEC Graph Challenge
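The pairwise precision and pairwise recall mentioned in this abstract can be sketched as follows, assuming the common definition in which every pair of nodes is counted as a positive when both nodes share a block. The function name `pairwise_precision_recall` is hypothetical, and the abstract does not state which exact variant of the metric the paper uses.

```python
from itertools import combinations


def pairwise_precision_recall(truth, predicted):
    """Compare two block assignments pair by pair (illustrative sketch)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_true = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_pred and same_true:
            tp += 1          # pair grouped together in both assignments
        elif same_pred:
            fp += 1          # grouped together only in the prediction
        elif same_true:
            fn += 1          # grouped together only in the ground truth
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


precision, recall = pairwise_precision_recall([0, 0, 1, 1], [0, 0, 0, 1])
print(precision, recall)
```

Because the metric only asks whether two nodes share a block, it is insensitive to how the blocks are labeled, which is convenient when comparing assignments from different runs of an ensemble.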

    A clustering particle swarm optimizer for locating and tracking multiple optima in dynamic environments

    This article is posted here with permission from the IEEE. Copyright © 2010 IEEE. In the real world, many optimization problems are dynamic. This requires an optimization algorithm to not only find the global optimal solution under a specific environment but also to track the trajectory of the changing optima over dynamic environments. To address this requirement, this paper investigates a clustering particle swarm optimizer (PSO) for dynamic optimization problems. This algorithm employs a hierarchical clustering method to locate and track multiple peaks. A fast local search method is also introduced to search for optimal solutions in a promising subregion found by the clustering method. An experimental study is conducted based on the moving peaks benchmark to test the performance of the clustering PSO in comparison with several state-of-the-art algorithms from the literature. The experimental results show the efficiency of the clustering PSO for locating and tracking multiple optima in dynamic environments in comparison with other particle swarm optimization models based on the multiswarm method. This work was supported by the Engineering and Physical Sciences Research Council of U.K., under Grant EP/E060722/1

    Macrostate Data Clustering

    We develop an effective nonhierarchical data clustering method using an analogy to the dynamic coarse graining of a stochastic system. Analyzing the eigensystem of an interitem transition matrix identifies fuzzy clusters corresponding to the metastable macroscopic states (macrostates) of a diffusive system. A "minimum uncertainty criterion" determines the linear transformation from eigenvectors to cluster-defining window functions. Eigenspectrum gap and cluster certainty conditions identify the proper number of clusters. The physically motivated fuzzy representation and associated uncertainty analysis distinguish macrostate clustering from spectral partitioning methods. Macrostate data clustering solves a variety of test cases that challenge other methods. Comment: keywords: cluster analysis, clustering, pattern recognition, spectral graph theory, dynamic eigenvectors, machine learning, macrostates, classification