Adaptive Evolutionary Clustering
In many practical applications of clustering, the objects to be clustered
evolve over time, and a clustering result is desired at each time step. In such
applications, evolutionary clustering typically outperforms traditional static
clustering by producing clustering results that reflect long-term trends while
being robust to short-term variations. Several evolutionary clustering
algorithms have recently been proposed, often by adding a temporal smoothness
penalty to the cost function of a static clustering method. In this paper, we
introduce a different approach to evolutionary clustering by accurately
tracking the time-varying proximities between objects followed by static
clustering. We present an evolutionary clustering framework that adaptively
estimates the optimal smoothing parameter using shrinkage estimation, a
statistical approach that improves a naive estimate using additional
information. The proposed framework can be used to extend a variety of static
clustering algorithms, including hierarchical, k-means, and spectral
clustering, into evolutionary clustering algorithms. Experiments on synthetic
and real data sets indicate that the proposed framework outperforms static
clustering and existing evolutionary clustering algorithms in many scenarios.
Comment: To appear in Data Mining and Knowledge Discovery, MATLAB toolbox
available at http://tbayes.eecs.umich.edu/xukevin/affec
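The idea of tracking time-varying proximities before static clustering can be illustrated with a minimal sketch. The abstract does not give the update rule, so the code below assumes a simple convex combination of the previous smoothed proximity matrix and the current one, with a fixed smoothing parameter `alpha`. The framework described above instead estimates this parameter adaptively by shrinkage estimation, which is not reproduced here. The function names `smooth_proximities` and `track_proximities` are hypothetical.

```python
import numpy as np


def smooth_proximities(prev_smoothed, current, alpha):
    """Blend the previous smoothed proximity matrix with the current one.

    ASSUMPTION: a convex combination with a fixed alpha in [0, 1]. The
    abstract only says the smoothing parameter is estimated adaptively;
    that estimation step is not implemented in this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * prev_smoothed + (1.0 - alpha) * current


def track_proximities(matrices, alpha):
    """Return one smoothed proximity matrix per time step.

    The first time step has no history, so it is used as-is.
    """
    smoothed = [np.asarray(matrices[0], dtype=float)]
    for current in matrices[1:]:
        current = np.asarray(current, dtype=float)
        smoothed.append(smooth_proximities(smoothed[-1], current, alpha))
    return smoothed
```

Each smoothed matrix could then be passed to any static method that accepts a proximity matrix, such as hierarchical or spectral clustering. A larger `alpha` gives more weight to history and so more resistance to short-term variation.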
The Evolution of Language Groups among Cooperating Digital Predators
Many species of animals have evolved complex means for communicating with one another. Oftentimes, communication is essential for the execution of tasks that require cooperation between individuals, such as group hunting and mate selection. As a result, communication itself becomes essential for survival. While these facts are readily observed, the evolutionary processes underlying them are less understood, in large part because observational - much less controlled - studies of these processes are impossible. Both the timescales and population sizes required for such studies are simply too great.
To address these problems, this thesis uses simulated predators to study the evolution of language in animals. These digital predators evolve to perform two cooperative tasks: hunting and mate selection. After the populations of predators have evolved to perform both tasks successfully, the population is decomposed into both language groups and cooperative groups. Spectral clustering identifies predators that speak similar languages, while merge clustering is used to find those groups of predators that are the most successful when working together.
Analysis of the groups generated by these two different methods shows that the most successful pairings are not necessarily those in which the two individuals are speaking the same language. Rather, organisms can evolve to speak a different language than the one to which they respond. Moreover, even though one task -- mate selection -- evolves earlier in evolutionary history, the language diversity it produces counteracts any head-start provided for the evolution of the second task. Thus, not only is language important for the evolution of cooperative task success, but the appearance of language groups can also play a determinant role in the evolution of cooperation.
Computer Science
Hierarchical information clustering by means of topologically embedded graphs
We introduce a graph-theoretic approach to extract clusters and hierarchies
in complex data-sets in an unsupervised and deterministic manner, without the
use of any prior information. This is achieved by building topologically
embedded networks containing the subset of most significant links and analyzing
the network structure. For a planar embedding, this method provides both the
intra-cluster hierarchy, which describes the way clusters are composed, and the
inter-cluster hierarchy, which describes how clusters gather together. We
discuss performance, robustness and reliability of this method by first
investigating several artificial data-sets, finding that it can significantly
outperform other established approaches. Then we show that our method can
successfully differentiate meaningful clusters and hierarchies in a variety of
real data-sets. In particular, we find that the application to gene expression
patterns of lymphoma samples uncovers biologically significant groups of genes
which play key roles in diagnosis, prognosis and treatment of some of the most
relevant human lymphoid malignancies.