10,245 research outputs found

    Batch kernel SOM and related Laplacian methods for social network analysis

    Get PDF
    Large graphs are natural mathematical models for describing the structure of the data in a wide variety of fields, such as web mining, social networks, information retrieval, biological networks, etc. For all these applications, automatic tools are required to get a synthetic view of the graph and to reach a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch Self Organizing Map can be used to achieve these goals via kernels derived from the Laplacian matrix of the graph, especially when it is used in conjunction with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modeled through a weighted graph that has been directly built from a large corpus of agrarian contracts

    SOM-VAE: Interpretable Discrete Representation Learning on Time Series

    Full text link
    High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the representation and non-smoothness over time. To address this problem, we propose a new representation learning framework building on ideas from interpretable discrete dimensionality reduction and deep generative modeling. This framework allows us to learn discrete representations of time series, which give rise to smooth and interpretable embeddings with superior clustering performance. We introduce a new way to overcome the non-differentiability in discrete representation learning and present a gradient-based version of the traditional self-organizing map algorithm that is more performant than the original. Furthermore, to allow for a probabilistic interpretation of our method, we integrate a Markov model in the representation space. This model uncovers the temporal transition structure, improves clustering performance even further and provides additional explanatory insights as well as a natural representation of uncertainty. We evaluate our model in terms of clustering performance and interpretability on static (Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST images, a chaotic Lorenz attractor system with two macro states, as well as on a challenging real world medical time series application on the eICU data set. Our learned representations compare favorably with competitor methods and facilitate downstream tasks on the real world data.Comment: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019

    How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?

    Full text link
    In numerous applicative contexts, data are too rich and too complex to be represented by numerical vectors. A general approach to extend machine learning and data mining techniques to such data is to really on a dissimilarity or on a kernel that measures how different or similar two objects are. This approach has been used to define several variants of the Self Organizing Map (SOM). This paper reviews those variants in using a common set of notations in order to outline differences and similarities between them. It discusses the advantages and drawbacks of the variants, as well as the actual relevance of the dissimilarity/kernel SOM for practical applications

    Self-Organizing Time Map: An Abstraction of Temporal Multivariate Patterns

    Full text link
    This paper adopts and adapts Kohonen's standard Self-Organizing Map (SOM) for exploratory temporal structure analysis. The Self-Organizing Time Map (SOTM) implements SOM-type learning to one-dimensional arrays for individual time units, preserves the orientation with short-term memory and arranges the arrays in an ascending order of time. The two-dimensional representation of the SOTM attempts thus twofold topology preservation, where the horizontal direction preserves time topology and the vertical direction data topology. This enables discovering the occurrence and exploring the properties of temporal structural changes in data. For representing qualities and properties of SOTMs, we adapt measures and visualizations from the standard SOM paradigm, as well as introduce a measure of temporal structural changes. The functioning of the SOTM, and its visualizations and quality and property measures, are illustrated on artificial toy data. The usefulness of the SOTM in a real-world setting is shown on poverty, welfare and development indicators

    Information maps: tools for document exploration

    Get PDF

    Evaluating a Self-Organizing Map for Clustering and Visualizing Optimum Currency Area Criteria

    Get PDF
    Optimum currency area (OCA) theory attempts to define the geographical region in which it would maximize economic efficiency to have a single currency. In this paper, the focus is on prospective and current members of the Economic and Monetary Union. For this task, a self-organizing neural network, the Self-organizing map (SOM), is combined with hierarchical clustering for a two-level approach to clustering and visualizing OCA criteria. The output of the SOM is a topologically preserved two-dimensional grid. The final models are evaluated based on both clustering tendencies and accuracy measures. Thereafter, the two-dimensional grid of the chosen model is used for visual assessment of the OCA criteria, while its clustering results are projected onto a geographic map.Self-organizing maps, Optimum Currency Area, projection, clustering, geospatial visualization
    • 

    corecore