35 research outputs found
Using SOMbrero for clustering and visualizing graphs (Utiliser SOMbrero pour la classification et la visualisation de graphes)
Graphs have attracted a burst of attention in recent years, with applications in social science, biology, and computer science, among others. In the present paper, we illustrate how self-organizing maps (SOM) can be used to shed light on the structure of a graph by clustering its vertices and visualizing a simplified graph. In particular, we present the R package SOMbrero, which implements a stochastic version of the so-called relational algorithm: the method can process any dissimilarity data, and several dissimilarities adapted to graphs are described and compared. The use of the package is illustrated on two real-world datasets. The first, included in the package itself, is small enough to allow a full investigation of how the choice of dissimilarity used to measure the proximity between vertices influences the results. The second comes from an application in biology and is based on a large bipartite graph of chemical reactions with several thousand vertices.
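As an illustration of what a dissimilarity adapted to graphs can look like, the sketch below computes the shortest-path length between every pair of vertices with a breadth-first search. This is only one possible choice, picked here as an assumption; the abstract does not name the dissimilarities compared in the paper, and the function name `shortest_path_dissimilarity` and the dictionary-based graph format are hypothetical.

```python
from collections import deque


def shortest_path_dissimilarity(adjacency):
    """Return (nodes, D) where D[i][j] is the number of edges on a
    shortest path between nodes[i] and nodes[j].

    `adjacency` maps each vertex to an iterable of its neighbours
    (undirected, unweighted graph). Unreachable pairs keep the value inf.
    """
    nodes = sorted(adjacency)
    index = {v: k for k, v in enumerate(nodes)}
    n = len(nodes)
    inf = float("inf")
    D = [[inf] * n for _ in range(n)]
    for source in nodes:
        s = index[source]
        D[s][s] = 0.0
        queue = deque([source])
        # Breadth-first search: vertices are reached in order of distance.
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if D[s][index[w]] == inf:
                    D[s][index[w]] = D[s][index[v]] + 1.0
                    queue.append(w)
    return nodes, D


# Toy path graph a - b - c
nodes, D = shortest_path_dissimilarity({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

The resulting matrix is symmetric with a zero diagonal, which is the kind of input a dissimilarity-based method expects.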
On-line relational and multiple relational SOM
In some applications, and in order to better address real-world situations, data may be more complex than simple numerical vectors. In some cases, data are known only through their pairwise dissimilarities or through multiple dissimilarities, each of them describing a particular feature of the data set. Several variants of the self-organizing map (SOM) algorithm were introduced to generalize the original algorithm to the framework of dissimilarity data. Whereas median SOM is based on a rough representation of the prototypes, relational SOM represents these prototypes by a virtual linear combination of all elements in the data set, referring to a pseudo-Euclidean framework. In the present article, an on-line version of relational SOM is introduced and studied. As in the Euclidean framework, this on-line algorithm provides a better organization and is much less sensitive to prototype initialization than standard (batch) relational SOM. In a more general case, this stochastic version makes it possible to integrate an additional stochastic gradient descent step in the algorithm, which can tune the respective weights of several dissimilarities in an optimal way: the resulting multiple relational SOM can thus integrate several sources of data of different types, or form a consensus between several dissimilarities describing the same data. The algorithms introduced in this manuscript are tested on several data sets, including categorical data and graphs. On-line relational SOM is currently available in the R package SOMbrero, which can be downloaded at http://sombrero.r-forge.r-project.org or tested directly through its Web User Interface at http://shiny.nathalievilla.org/sombrero
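The idea of representing each prototype as a combination of all elements in the data set can be sketched as follows. This is a minimal illustration and not the SOMbrero implementation: it assumes a symmetric dissimilarity matrix `D` that is treated like squared distances, convex coefficients `beta` (non-negative, each row summing to one), a rectangular grid with a Gaussian neighbourhood, and a linearly decreasing learning rate. All function and parameter names are hypothetical.

```python
import numpy as np


def relational_distances(D, beta):
    """Distance from every observation to every prototype.

    Prototype u is the virtual combination sum_i beta[u, i] * x_i, so its
    distance to x_i is (D beta_u)_i - 0.5 * beta_u^T D beta_u.
    Returns an array of shape (n_observations, n_units).
    """
    Db = beta @ D  # row u holds D @ beta_u because D is symmetric
    quad = np.einsum("un,un->u", Db, beta)  # beta_u^T D beta_u for each u
    return Db.T - 0.5 * quad[None, :]


def online_relational_som(D, grid_shape=(3, 3), n_iter=500, seed=0):
    """On-line (stochastic) relational SOM sketch; returns the coefficients beta."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    rows, cols = grid_shape
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)

    # Random convex coefficients: one row per map unit.
    beta = rng.random((rows * cols, n))
    beta /= beta.sum(axis=1, keepdims=True)

    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01              # decreasing learning rate
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5
        i = rng.integers(n)                          # draw one observation
        bmu = np.argmin(relational_distances(D, beta)[i])
        h = np.exp(-grid_d2[bmu] / (2.0 * radius ** 2))
        one_hot = np.zeros(n)
        one_hot[i] = 1.0
        # Move each prototype towards x_i; rows of beta stay convex.
        beta += (lr * h)[:, None] * (one_hot - beta)
    return beta


def assign(D, beta):
    """Index of the closest map unit for every observation."""
    return relational_distances(D, beta).argmin(axis=1)
```

Because each update is a convex combination of the current coefficients and an indicator vector, the rows of `beta` remain non-negative and sum to one throughout training, so only `D` is ever needed and no vector representation of the data is required.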
Self-organizing maps (Mapas autoorganizados)
In data analysis, dealing with high-dimensional datasets can be complicated,
as it is sometimes challenging to find structures in them or to visualize them.
Self-organizing maps are a simple yet effective type of neural network based
on unsupervised and competitive learning that represents these data
in simple maps through nonlinear projections onto low-dimensional spaces, usually
2D, preserving the topology of the original space. This method, which
was developed by Teuvo Kohonen, offers, on the one hand, a visual representation
of the data that preserves the similarities among them and, on the other
hand, clusters of similar observations in the original space.
This work develops the most important theoretical aspects of self-organizing
maps, explains how they work, and studies different possibilities for their implementation.
Finally, two applications of the technique are presented, one
of them dealing with COVID-19 pandemic data.
Grado en Estadística (degree in Statistics)
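The competitive learning scheme described in this abstract can be sketched in a few lines. This is a generic on-line Kohonen-style map written for illustration, not the implementation studied in the thesis; the grid size, the Gaussian neighbourhood, the decay schedules, and the names `train_som` and `project` are all assumptions.

```python
import numpy as np


def train_som(X, grid_shape=(4, 4), n_iter=1000, seed=0):
    """Train a small rectangular SOM; returns (prototypes, grid coordinates)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)

    # Initialise prototypes with randomly chosen observations.
    W = X[rng.integers(len(X), size=rows * cols)].astype(float)

    for t in range(n_iter):
        frac = t / n_iter
        lr = 0.5 * (1.0 - frac) + 0.01              # decreasing learning rate
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5
        x = X[rng.integers(len(X))]
        # Competition: the best matching unit is the closest prototype.
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))
        # Cooperation: units near the winner on the grid move as well,
        # which is what preserves the topology of the input space.
        h = np.exp(-grid_d2[bmu] / (2.0 * radius ** 2))
        W += (lr * h)[:, None] * (x - W)
    return W, grid


def project(X, W, grid):
    """Map each observation to the 2D grid position of its closest unit."""
    bmus = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
    return grid[bmus]
```

Calling `project` after training gives both outputs mentioned above: the 2D coordinates serve as the visual representation, and observations sharing a unit form a cluster.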
Proceedings of Monterey Workshop 2001 Engineering Automation for Software Intensive System Integration
The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, the Air Force Office of Scientific Research, the Army Research Office, and the Defense Advanced Research Projects Agency. It is our pleasure to thank the workshop's advisors and sponsors for their vision of a principled engineering solution for software and for their tireless, many-year effort in supporting a series of workshops to bring everyone together. This workshop is the 8th in a series of international workshops. It was held at the Monterey Beach Hotel, Monterey, California, on June 18-22, 2001. The general theme of the workshop series has been to present and discuss research that aims to increase the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops focused on issues including "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario".
Approved for public release, distribution unlimited.