Dynamic Tensor Clustering
Dynamic tensor data are becoming prevalent in numerous applications. Existing
tensor clustering methods either fail to account for the dynamic nature of the
data, or are inapplicable to a general-order tensor. Also, there is often a gap
between statistical guarantee and computational efficiency for existing tensor
clustering solutions. In this article, we aim to bridge this gap by proposing a
new dynamic tensor clustering method, which takes into account both sparsity
and fusion structures, and enjoys strong statistical guarantees as well as high
computational efficiency. Our proposal is based upon a new structured tensor
factorization that encourages both sparsity and smoothness in parameters along
the specified tensor modes. Computationally, we develop a highly efficient
optimization algorithm that benefits from substantial dimension reduction. In
theory, we first establish a non-asymptotic error bound for the estimator from
the structured tensor factorization. Built upon this error bound, we then
derive the rate of convergence of the estimated cluster centers, and show that
the estimated clusters recover the true cluster structures with a high
probability. Moreover, our proposed method can be naturally extended to
co-clustering of multiple modes of the tensor data. The efficacy of our
approach is illustrated via simulations and a brain dynamic functional
connectivity analysis from an Autism spectrum disorder study.
Comment: Accepted at the Journal of the American Statistical Association
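The abstract describes a structured tensor factorization that imposes both sparsity and fusion (smoothness) on the factors along specified modes. A minimal NumPy sketch of one rank-1 power-iteration update with both structures applied — the function names, the simple neighbor-averaging fusion step, and the thresholds `lam`/`tau` are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

def soft_threshold(v, lam):
    # Elementwise sparsity operator: shrinks small entries to zero.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def fuse_adjacent(v, tau):
    # Crude fusion step: average neighboring entries whose difference is
    # below tau, encouraging a piecewise-constant (smooth) factor along a
    # tensor mode such as time. A stand-in for a fused-lasso proximal step.
    out = v.copy()
    for i in range(len(out) - 1):
        if abs(out[i + 1] - out[i]) < tau:
            m = 0.5 * (out[i] + out[i + 1])
            out[i] = out[i + 1] = m
    return out

def structured_factor_step(T, u, lam=0.1, tau=0.05):
    # One power-iteration update of a rank-1 factor of the matrix T (an
    # unfolded tensor), followed by the sparsity and fusion steps and a
    # renormalization to unit length.
    v = T @ u
    v = soft_threshold(v, lam)
    v = fuse_adjacent(v, tau)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

Iterating such updates over each mode, alternating between factors, is the usual computational pattern for penalized tensor power methods; the dimension reduction comes from only ever touching one factor vector at a time.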
Isotropic Dynamic Hierarchical Clustering
We face the need to discover a pattern in the locations of a great number of
points in a high-dimensional space. The goal is to group close points together.
We are interested in a hierarchical structure, like a B-tree. B-Trees are
hierarchical, balanced, and can be constructed dynamically. The B-tree
approach allows the structure to be determined without any supervised learning
or a priori knowledge. The space is Euclidean and isotropic. Unfortunately,
there are no B-tree implementations that process indices in a symmetric and
isotropic way. Some implementations are based on constructing compound
asymmetrical indices from point coordinates; and the others split the nodes
along the coordinate hyper-planes. We need to process tens of millions of
points in a thousand-dimensional space. The application has to be scalable.
Ideally, a cluster would be an ellipsoid, but that would require storing O(n^2)
ellipsoid axes. So, we use multi-dimensional balls defined by their centers
and radii. Statistical values such as the mean and the average deviation can be
calculated incrementally: while adding a point to the tree, the statistical
values for its nodes are recalculated in O(1) time. We support both brute-force
O(2^n) and greedy O(n^2) node-split algorithms. The statistical and aggregated
node information also allows manipulating (searching, deleting) aggregated
sets of closely located points. The system also provides hierarchical
information retrieval. When
searching, the user is provided with the highest appropriate nodes in the tree
hierarchy, with the most important clusters emerging in the hierarchy
automatically. Then, if interested, the user may navigate down the tree to more
specific points. The system is implemented as a library of Java classes
representing Points, Sets of points with aggregated statistical information,
the B-tree, and Nodes, with support for serialization and storage in a MySQL
database.
Comment: 6 pages with 3 examples
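The O(1) per-insertion update of node statistics that the abstract mentions can be sketched with Welford's running mean/variance recurrence. The class and method names below are hypothetical (the paper's library is in Java); this is only an illustration of the incremental bookkeeping:

```python
class NodeStats:
    """Aggregated statistics for a cluster node, updated in O(1)
    per inserted point via Welford's running-mean recurrence."""

    def __init__(self, dim):
        self.n = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # per-axis sum of squared deviations

    def add(self, point):
        # Incorporate one point without revisiting earlier points.
        self.n += 1
        for i, x in enumerate(point):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        # Population variance per axis; zero until two points are seen.
        if self.n < 2:
            return [0.0] * len(self.mean)
        return [m / self.n for m in self.m2]
```

Because each node only stores a count, a mean, and a sum of squared deviations, a point insertion touches O(depth) nodes with O(1) work per node, which is what makes the approach scalable to tens of millions of points.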
An Ensemble Framework for Detecting Community Changes in Dynamic Networks
Dynamic networks, especially those representing social networks, undergo
constant evolution of their community structure over time. Nodes can migrate
between different communities, communities can split into multiple new
communities, communities can merge together, etc. In order to represent dynamic
networks with evolving communities it is essential to use a dynamic model
rather than a static one. Here we use a dynamic stochastic block model where
the underlying block model is different at different times. In order to
represent the structural changes expressed by this dynamic model the network
will be split into discrete time segments and a clustering algorithm will
assign block memberships for each segment. In this paper we show that using an
ensemble of clustering assignments compensates for the variance of scalable
clustering algorithms and produces superior results in terms of
pairwise-precision and pairwise-recall. We also demonstrate that the dynamic
clustering produced by the ensemble can be visualized as a flowchart which
encapsulates the community evolution succinctly.
Comment: 6 pages, under submission to HPEC Graph Challenge
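A common way to combine an ensemble of clustering assignments is a co-association matrix: the fraction of ensemble members that place two nodes in the same block. The sketch below is a generic consensus step under that assumption (the connected-components consensus and the 0.5 threshold are illustrative choices, not the paper's method):

```python
import numpy as np

def coassociation(labelings):
    # n x n matrix: fraction of ensemble members placing nodes i, j
    # in the same block. labelings has shape (runs, nodes).
    labelings = np.asarray(labelings)
    r, n = labelings.shape
    C = np.zeros((n, n))
    for lab in labelings:
        C += (lab[:, None] == lab[None, :])
    return C / r

def consensus_labels(labelings, threshold=0.5):
    # Consensus clustering as connected components of the thresholded
    # co-association graph (a simple stand-in for an ensemble step).
    C = coassociation(labelings)
    n = C.shape[0]
    labels = [-1] * n
    cur = 0
    for i in range(n):
        if labels[i] == -1:
            stack = [i]
            labels[i] = cur
            while stack:
                u = stack.pop()
                for v in range(n):
                    if labels[v] == -1 and C[u, v] > threshold:
                        labels[v] = cur
                        stack.append(v)
            cur += 1
    return labels
```

Averaging over many noisy runs is what damps the run-to-run variance of scalable (randomized) clustering algorithms, which is the effect the paper measures via pairwise precision and recall.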
A clustering particle swarm optimizer for locating and tracking multiple optima in dynamic environments
This article is posted here with permission from the IEEE - Copyright @ 2010 IEEE.
In the real world, many optimization problems are dynamic. This requires an optimization algorithm not only to find the global optimal solution under a specific environment but also to track the trajectory of the changing optima over dynamic environments. To address this requirement, this paper investigates a clustering particle swarm optimizer (PSO) for dynamic optimization problems. This algorithm employs a hierarchical clustering method to locate and track multiple peaks. A fast local search method is also introduced to find optimal solutions in a promising subregion identified by the clustering method. An experimental study is conducted on the moving peaks benchmark to test the performance of the clustering PSO in comparison with several state-of-the-art algorithms from the literature. The experimental results show the efficiency of the clustering PSO for locating and tracking multiple optima in dynamic environments in comparison with other particle swarm optimization models based on the multiswarm method.
This work was supported by the Engineering and Physical Sciences Research Council of the U.K., under Grant EP/E060722/1.
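The velocity/position update at the heart of any PSO variant can be sketched as follows. In a clustering PSO, the global-best term would come from the particle's own subswarm (found by the hierarchical clustering) rather than the whole population; the coefficients `w`, `c1`, `c2` below are standard textbook defaults, not values taken from this paper:

```python
import random

def pso_step(pos, vel, pbest, gbest, w=0.72, c1=1.49, c2=1.49):
    # Canonical per-particle PSO update: inertia plus stochastic pulls
    # toward the particle's personal best and its (sub)swarm best.
    new_vel = [
        w * v
        + c1 * random.random() * (pb - x)
        + c2 * random.random() * (gb - x)
        for x, v, pb, gb in zip(pos, vel, pbest, gbest)
    ]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

Restricting `gbest` to a subswarm lets each cluster of particles converge on, and then track, its own peak as the landscape changes.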
Macrostate Data Clustering
We develop an effective nonhierarchical data clustering method using an
analogy to the dynamic coarse graining of a stochastic system. Analyzing the
eigensystem of an interitem transition matrix identifies fuzzy clusters
corresponding to the metastable macroscopic states (macrostates) of a diffusive
system. A "minimum uncertainty criterion" determines the linear transformation
from eigenvectors to cluster-defining window functions. Eigenspectrum gap and
cluster certainty conditions identify the proper number of clusters. The
physically motivated fuzzy representation and associated uncertainty analysis
distinguishes macrostate clustering from spectral partitioning methods.
Macrostate data clustering solves a variety of test cases that challenge other
methods.
Comment: keywords: cluster analysis, clustering, pattern recognition, spectral graph theory, dynamic eigenvectors, machine learning, macrostates, classification
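The spectral-gap idea described above — eigenvalues of the inter-item transition matrix near 1 correspond to metastable macrostates, and a gap in the spectrum suggests the number of clusters — can be sketched minimally as follows. The largest-gap heuristic here is an illustrative assumption, far simpler than the paper's minimum uncertainty criterion:

```python
import numpy as np

def macrostate_count(T):
    # T is a row-stochastic inter-item transition matrix. Eigenvalues
    # near 1 correspond to slowly mixing (metastable) macrostates;
    # the largest gap in the sorted spectrum suggests how many there
    # are. This is a heuristic stand-in for a full gap analysis.
    vals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    gaps = vals[:-1] - vals[1:]
    return int(np.argmax(gaps)) + 1
```

For a nearly block-diagonal transition matrix with k weakly coupled blocks, k eigenvalues sit near 1 and the rest fall well below, so the largest gap appears after the k-th eigenvalue.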