The Implementation of Cluster Identification of Data
In this paper, we combine a few transformations to create a single unique transformation
based on Threshold, Edge Detector, Simple Skeleton and Hough Line Transformation
with information-theoretic-based criteria for unsupervised hierarchical image-set
clustering. The continuous image modeling is based on a mixture of Gaussian densities.
The unsupervised image-set clustering is based on a generalized version of a recently
introduced information-theoretic principle, the information bottleneck principle. Images
are clustered such that the mutual information between the clusters and the image
content is maximally preserved. Experimental results demonstrate the pattern of the
image skeleton. Information theoretic tools are used to evaluate cluster quality.
Particular emphasis is placed on the application of the clustering for efficient image
search and verification. The application is well suited to verifying authentic-looking
counterfeit checks.
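The clustering criterion described in this abstract (keep as much mutual information as possible between clusters and image content) can be illustrated with a small sketch. This is not the authors' implementation: the joint table `joint`, the helper names `mutual_information` and `merge_clusters`, and the toy numbers are all assumptions made for illustration. It only shows how one might compare candidate merges by the mutual information they preserve, in the spirit of an agglomerative information bottleneck step.

```python
import math

def mutual_information(joint):
    """Return I(C;Y) in bits for a joint probability table joint[c][y]."""
    p_cluster = [sum(row) for row in joint]
    p_feature = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for c, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (p_cluster[c] * p_feature[y]))
    return mi

def merge_clusters(joint, a, b):
    """Return a new table in which clusters a and b are merged into one row."""
    merged = [p + q for p, q in zip(joint[a], joint[b])]
    kept = [row for i, row in enumerate(joint) if i not in (a, b)]
    return kept + [merged]

# Hypothetical joint distribution over 3 clusters (rows) and 2 content features (columns).
joint = [
    [0.30, 0.03],
    [0.28, 0.05],
    [0.04, 0.30],
]

before = mutual_information(joint)
loss_similar = before - mutual_information(merge_clusters(joint, 0, 1))
loss_dissimilar = before - mutual_information(merge_clusters(joint, 0, 2))
print(f"information lost merging clusters 0 and 1: {loss_similar:.4f} bits")
print(f"information lost merging clusters 0 and 2: {loss_dissimilar:.4f} bits")
```

Under this criterion a greedy hierarchical procedure would prefer the merge with the smaller loss, here the two clusters whose feature distributions are alike.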
Multilevel compression of random walks on networks reveals hierarchical organization in large integrated systems
To comprehend the hierarchical organization of large integrated systems, we
introduce the hierarchical map equation, which reveals multilevel structures in
networks. In this information-theoretic approach, we exploit the duality
between compression and pattern detection; by compressing a description of a
random walker as a proxy for real flow on a network, we find regularities in
the network that induce this system-wide flow. Finding the shortest multilevel
description of the random walker therefore gives us the best hierarchical
clustering of the network, the optimal number of levels and modular partition
at each level, with respect to the dynamics on the network. With a novel search
algorithm, we extract and illustrate the rich multilevel organization of
several large social and biological networks. For example, from the global air
traffic network we uncover countries and continents, and from the pattern of
scientific communication we reveal more than 100 scientific fields organized in
four major disciplines: life sciences, physical sciences, ecology and earth
sciences, and social sciences. In general, we find shallow hierarchical
structures in globally interconnected systems, such as neural networks, and
rich multilevel organizations in systems with highly separated regions, such as
road networks. Comment: 11 pages, 5 figures. For associated code, see
http://www.tp.umu.se/~rosvall/code.htm
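The idea of scoring a partition by the length of a random-walk description can be sketched with the simpler two-level map equation, which has an index codebook for module exits and one codebook per module. The sketch below is a simplification, not the hierarchical method or the search algorithm from the abstract, and not the associated code: it assumes an undirected, unweighted network (so node visit rates are proportional to degree), and the function names and the toy graph are hypothetical.

```python
import math
from collections import defaultdict

def entropy(weights):
    """Shannon entropy in bits of the weights after normalizing them to sum to 1."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum(w / total * math.log2(w / total) for w in weights if w > 0)

def two_level_description_length(edges, module_of):
    """Two-level map equation for an undirected, unweighted graph (assumed setting)."""
    degree = defaultdict(int)
    exit_links = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if module_of[u] != module_of[v]:
            exit_links[module_of[u]] += 1
            exit_links[module_of[v]] += 1
    two_m = 2 * len(edges)
    modules = set(module_of.values())
    # Per-step probability that the walker exits each module.
    exit_rate = {m: exit_links[m] / two_m for m in modules}
    # Index codebook term: rate of module switches times entropy of module exits.
    length = sum(exit_rate.values()) * entropy(list(exit_rate.values()))
    # Module codebook terms: exit code plus node visit rates within each module.
    for m in modules:
        visits = [degree[n] / two_m for n in degree if module_of[n] == m]
        usage = exit_rate[m] + sum(visits)
        length += usage * entropy([exit_rate[m]] + visits)
    return length

# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one_module = {n: "all" for n in range(6)}
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

length_one = two_level_description_length(edges, one_module)
length_two = two_level_description_length(edges, two_modules)
print(f"one module:  {length_one:.3f} bits per step")
print(f"two modules: {length_two:.3f} bits per step")
```

On this toy graph the partition into the two triangles gives the shorter description, which is the sense in which compression and module detection coincide.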
Unsupervised Learning via Total Correlation Explanation
Learning by children and animals occurs effortlessly and largely without
obvious supervision. Successes in automating supervised learning have not
translated to the more ambiguous realm of unsupervised learning where goals and
labels are not provided. Barlow (1961) suggested that the signal that brains
leverage for unsupervised learning is dependence, or redundancy, in the sensory
environment. Dependence can be characterized using the information-theoretic
multivariate mutual information measure called total correlation. The principle
of Total Cor-relation Ex-planation (CorEx) is to learn representations of data
that "explain" as much dependence in the data as possible. We review some
manifestations of this principle along with successes in unsupervised learning
problems across diverse domains including human behavior, biology, and
language. Comment: Invited contribution for IJCAI 2017 Early Career Spotlight. 5 pages,
1 figure
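Total correlation, the dependence measure named in this abstract, is the sum of the marginal entropies minus the joint entropy: TC(X) = sum_i H(X_i) - H(X_1, ..., X_n). The sketch below only estimates that quantity from discrete samples; it does not implement CorEx itself, and the function names and toy samples are assumptions for illustration.

```python
import math
from collections import Counter

def entropy_bits(observations):
    """Empirical Shannon entropy in bits of a sequence of hashable observations."""
    counts = Counter(observations)
    n = len(observations)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def total_correlation(samples):
    """Empirical TC: sum of per-variable entropies minus the joint entropy."""
    columns = list(zip(*samples))
    marginal = sum(entropy_bits(col) for col in columns)
    joint = entropy_bits([tuple(row) for row in samples])
    return marginal - joint

# Toy data: the second variable copies the first, the third is an independent fair bit.
dependent = [(a, a, b) for a in (0, 1) for b in (0, 1)]
# Toy data: three independent fair bits (all eight combinations, equally often).
independent = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

print(total_correlation(dependent))    # 1.0 bit: one redundant copy
print(total_correlation(independent))  # 0.0 bits: no dependence to explain
```

A representation that "explains" the dependence in the first data set would need to capture only the shared bit between the first two variables.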