563 research outputs found

    On-line relational SOM for dissimilarity data

    In some applications, and in order to better address real-world situations, data may be more complex than simple vectors. In some cases, they are known only through their pairwise dissimilarities. Several variants of the Self-Organizing Map (SOM) algorithm have been introduced to generalize the original algorithm to this framework. Whereas median SOM is based on a rough representation of the prototypes, relational SOM represents these prototypes as a virtual combination of all elements in the data set. However, this latter approach suffers from two main drawbacks. First, its complexity can be large. Second, only a batch version of this algorithm has been studied so far, and it often yields results with poor topographic organization. In this article, an on-line version of relational SOM is described and justified. The algorithm is tested on several datasets, including categorical data and graphs, and compared with the batch version and with other SOM algorithms for non-vector data.
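The "virtual combination" idea can be illustrated with a minimal sketch. It assumes the usual relational formulation, in which a prototype is written as a combination of the observations with coefficients `alpha` summing to 1, and its squared distance to observation j is computed from the dissimilarity matrix alone as (D alpha)_j - 0.5 * alpha^T D alpha. The function name and the toy data are hypothetical, and this is not claimed to be the exact procedure of the article.

```python
def relational_sq_distances(D, alpha):
    """Squared distances from every observation to a virtual prototype.

    D     : symmetric matrix (list of lists) of squared pairwise dissimilarities
    alpha : coefficients of the prototype over the observations, summing to 1
    """
    n = len(D)
    # (D alpha)_j for each observation j
    d_alpha = [sum(D[j][i] * alpha[i] for i in range(n)) for j in range(n)]
    # alpha^T D alpha, shared by all observations
    quad = sum(alpha[j] * d_alpha[j] for j in range(n))
    return [d - 0.5 * quad for d in d_alpha]


# Toy check: points 0, 2 and 4 on a line, given only by squared distances.
D = [[0, 4, 16],
     [4, 0, 4],
     [16, 4, 0]]
alpha = [0.5, 0.5, 0.0]  # prototype halfway between the first two points
distances = relational_sq_distances(D, alpha)
```

In this toy case the prototype sits at position 1, so the squared distances to the three points are 1, 1 and 9, and no coordinates were needed to obtain them.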

    Self-Organizing Maps for clustering and visualization of bipartite graphs

    Graphs (also frequently called networks) have attracted a burst of attention in recent years, with applications in social science, biology, computer science, and other fields. The present paper proposes a data mining method for visualizing and clustering the nodes of a particular class of graphs: bipartite graphs. The method is based on a self-organizing map algorithm and relies on an extension of this approach to data described by a dissimilarity matrix.

    Which dissimilarity is to be used when extracting typologies in sequence analysis? A comparative study

    Originally developed in bioinformatics, sequence analysis is increasingly used in the social sciences to study life-course processes. The methodology generally consists of computing dissimilarities between the trajectories and, if typologies are sought, clustering the trajectories according to their similarities or dissimilarities. The choice of an appropriate dissimilarity measure is a major issue in sequence analysis of life sequences. Several dissimilarities are available in the literature, but none of them has emerged as an undisputed choice. In this paper, instead of deciding upon one dissimilarity measure, we propose to use an optimal convex combination of different dissimilarities. The optimality is automatically determined by the clustering procedure and is defined with respect to the within-class variance.
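As a minimal sketch of what a convex combination of dissimilarities means, the snippet below mixes several dissimilarity matrices with non-negative weights that sum to 1. The function name and the toy matrices are hypothetical, and the weights are fixed by hand here, whereas the paper describes them as being tuned by the clustering procedure.

```python
def combine_dissimilarities(matrices, weights, tol=1e-9):
    """Return sum_k weights[k] * matrices[k] for a convex set of weights."""
    if len(matrices) != len(weights):
        raise ValueError("one weight per dissimilarity matrix is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(matrices[0])
    return [
        [sum(w * M[i][j] for w, M in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


# Two toy dissimilarity matrices on the same two trajectories.
D1 = [[0, 2], [2, 0]]
D2 = [[0, 4], [4, 0]]
D = combine_dissimilarities([D1, D2], [0.25, 0.75])
```

With these weights the combined dissimilarity between the two trajectories is 0.25 * 2 + 0.75 * 4 = 3.5, and the diagonal stays at zero.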

    Multiple dissimilarity SOM for clustering and visualizing graphs with node and edge attributes

    To understand how a graph G is structured and how the relations it models organize groups of entities, clustering and visualization can be combined to give the user a global overview of the graph in the form of a projected graph: a simplified graph is visualized in which each node corresponds to a cluster of nodes in the original graph G (with a size proportional to the number of nodes classified in this cluster) and the edge between two nodes has a width proportional to the number of links between the nodes of G classified in the two corresponding clusters. This approach can be trickier when additional attributes (numerical variables or factors) describe the nodes of G, or when the edges of G are of different types and should be treated separately: the simplified representation should then represent similarities for all sets of information. In this proposal, we present a variant of Self-Organizing Maps (SOM) that is adapted to data described by one or several (dis)similarities or kernels, recently published in (Olteanu & Villa-Vialaneix, 2015), and that is able to combine clustering and visualization for this kind of graph.

    Relational data clustering algorithms with biomedical applications


    Multiple kernel self-organizing maps

    In a number of real-life applications, the user is interested in analyzing several sources of information together: a graph combined with additional information known about its nodes, numerical variables measured on individuals, factors describing these individuals, and so on. Combining all sources of information can help the user better understand the dataset as a whole. The present article addresses this issue using self-organizing maps. The use of a kernel version of the algorithm allows us to combine various types of information and to automatically tune the data combination. This approach is illustrated on a simulated example.

    Optimizing an Organized Modularity Measure for Topographic Graph Clustering: a Deterministic Annealing Approach

    This paper proposes an organized generalization of Newman and Girvan's modularity measure for graph clustering. Optimized via a deterministic annealing scheme, this measure produces topologically ordered graph clusterings that lead to faithful and readable graph representations based on clustering-induced graphs. Topographic graph clustering provides an alternative to more classical solutions in which a standard graph clustering method is applied to build a simpler graph that is then represented with a graph layout algorithm. A comparative study on four real-world graphs ranging from 34 to 1,133 vertices shows the benefit of the proposed approach compared with classical solutions and with self-organizing maps for graphs.
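For reference, the standard (non-organized) Newman and Girvan modularity that this abstract builds on can be sketched as follows, using Q = (1 / 2m) * sum_ij (A_ij - k_i k_j / 2m) * [c_i = c_j] for an undirected graph. The function name and the toy graph are hypothetical, and the organized generalization and the deterministic annealing optimization from the paper are not reproduced here.

```python
def modularity(adjacency, communities):
    """Newman-Girvan modularity of a partition of an undirected graph.

    adjacency   : symmetric 0/1 (or weighted) adjacency matrix as list of lists
    communities : communities[i] is the cluster label of vertex i
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    two_m = sum(degrees)  # twice the number of edges (or total weight)
    if two_m == 0:
        raise ValueError("the graph has no edges")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                # observed weight minus the weight expected under the null model
                total += adjacency[i][j] - degrees[i] * degrees[j] / two_m
    return total / two_m


# Toy graph: two disconnected edges, 0-1 and 2-3.
A = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
q_split = modularity(A, [0, 0, 1, 1])   # each edge in its own cluster
q_single = modularity(A, [0, 0, 0, 0])  # everything in one cluster
```

On this toy graph the two-cluster partition scores 0.5, while putting every vertex in a single cluster scores 0, which is the expected behavior of the measure.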

    SOMbrero: Stochastic self-organizing maps for integrating data described by dissimilarity matrices

    In many real-world situations, individuals are described by multiple datasets that are not necessarily simple numerical tables but may be complex data (graphs, categorical variables, text...). A typical case is that of labeled graphs, in which the individuals (the vertices of the graph) are described both by their relations to one another and by attributes of various kinds. In (Villa-Vialaneix et al., 2013; Olteanu et al., 2013), we proposed using self-organizing maps (Kohonen, 2011) to combine clustering and visualization by projecting the individuals under study onto a low-dimensional grid. Our approach handles non-numerical data through kernels or dissimilarities and is based on a stochastic version of self-organizing map training. The different dissimilarities are combined, and the combination is optimized while the map is trained.