
    On-line relational and multiple relational SOM

    In some applications, and in order to better address real-world situations, data may be more complex than simple numerical vectors. In some cases, data may be known only through their pairwise dissimilarities, or through multiple dissimilarities, each describing a particular feature of the data set. Several variants of the Self Organizing Map (SOM) algorithm have been introduced to generalize the original algorithm to the framework of dissimilarity data. Whereas median SOM is based on a rough representation of the prototypes, relational SOM represents these prototypes as virtual linear combinations of all elements in the data set, in a pseudo-Euclidean framework. In the present article, an on-line version of relational SOM is introduced and studied. As in the Euclidean framework, this on-line algorithm provides a better organization and is much less sensitive to prototype initialization than standard (batch) relational SOM. More generally, this stochastic version makes it possible to integrate an additional stochastic gradient descent step into the algorithm, which can tune the respective weights of several dissimilarities in an optimal way: the resulting multiple relational SOM is thus able to integrate several sources of data of different types, or to build a consensus between several dissimilarities describing the same data. The algorithms introduced in this manuscript are tested on several data sets, including categorical data and graphs. On-line relational SOM is currently available in the R package SOMbrero, which can be downloaded at http://sombrero.r-forge.r-project.org or tested directly through its Web User Interface at http://shiny.nathalievilla.org/sombrero.
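    Since the abstract above describes the algorithm only in prose, here is a minimal illustrative sketch of the on-line multiple relational SOM idea in Python (not the authors' R implementation): prototypes are maintained as convex combinations of observations, distances are computed from the dissimilarity matrices alone, and an extra stochastic gradient step retunes the dissimilarity weights. All names are invented, and the clip-and-renormalise treatment of the weights is a deliberate simplification of the paper's tuning step.

        import numpy as np

        def multiple_relational_som(Ds, grid=(5, 5), n_iter=5000, eps=0.1, nu=0.01, seed=0):
            """On-line multiple relational SOM sketch.

            Ds   : list of symmetric (n, n) dissimilarity matrices, one per view.
            grid : dimensions of the rectangular map.
            Returns (beta, alpha): prototype coefficients and view weights.
            """
            rng = np.random.default_rng(seed)
            n = Ds[0].shape[0]
            K = grid[0] * grid[1]
            beta = rng.dirichlet(np.ones(n), size=K)      # rows sum to 1: virtual prototypes
            alpha = np.full(len(Ds), 1.0 / len(Ds))       # convex weights of the views
            coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])])

            def proto_dist(D, j):
                # (D beta_k)_j - 1/2 beta_k^T D beta_k, computed for every unit k
                return beta @ D[j] - 0.5 * np.einsum('ki,ij,kj->k', beta, D, beta)

            for t in range(n_iter):
                j = rng.integers(n)                        # random observation
                D = sum(a * Dd for a, Dd in zip(alpha, Ds))
                b = int(np.argmin(proto_dist(D, j)))       # best matching unit
                sigma = max(0.5, (grid[0] / 2) * (1 - t / n_iter))
                h = np.exp(-((coords - coords[b]) ** 2).sum(axis=1) / (2 * sigma ** 2))
                e_j = np.zeros(n); e_j[j] = 1.0
                beta += (eps * h)[:, None] * (e_j - beta)  # move winners toward 1_j
                # Simplified gradient step on the view weights, kept on the simplex.
                grad = np.array([h @ proto_dist(Dd, j) for Dd in Ds])
                alpha = np.clip(alpha - nu * grad, 1e-9, None)
                alpha /= alpha.sum()
            return beta, alpha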

    Methods of representing the structure of complex industrial products on computer files, to facilitate planning, costing and related management tasks : a thesis presented in fulfilment of the requirements for the degree of Master of Technology in Manufacturing and Industrial Technology at Massey University

    When the original concepts for the computerisation of product structures were developed in the late 1960s, the available computer power was very limited. A modularisation technique was developed to address the situation in which a number of similar products were being manufactured. This technique tried to rationalise these products into family groups. Each member of the family differed from the others through the possession of different features or options, yet retained a high degree of commonality that gave the product membership of the family. Modularisation involved identifying the options and features providing the variability; the parts remaining tended to be common to all members of the family and became known as the common parts. Separate Bills of Material (BOMs) were set up for each of the identified options or features, and another BOM was set up for the common parts. The simple combination of the required option and/or feature BOMs with the common-parts BOM specified a product. Computer storage requirements and redundancy were reduced to a minimum. The Materials Requirements Planning (MRP) system could manipulate these option and feature BOMs to over-plan product variability without over-planning the parts common to all members. The modularisation philosophy gained wide acceptance and is the foundation of MRP training. Modularisation, developed for MRP, is generally parts-oriented. An unfortunate side effect tends to be the loss of product structure information. Most commercial software would list 6 resistors, Part No. 123, in the common-parts BOM without concern as to where the resistors are fitted. This loss of product structure information can hide the fact that two products using these 6 resistors 'in common' are in fact different because they do not use the resistors in the same 6 places. Additional information must be consulted to enable correct assembly of the 'common' portion of these products. The electronics industry is especially affected by this situation, and it has changed considerably since the late 1960s. Product variability can be very high, and changes and enhancements are a constant factor in products with a relatively short life span. The modularisation technique has no good mechanism for the situation where an option itself has options or features, a situation that can exist down a number of layers of the family tree structure of an electronics product. Maintenance of these BOMs is difficult. Where there are options within options, designers and production staff need to know the inter-relationship of parts between options to ensure accuracy and compatibility and to plan assembly functions. The advent of computerised spreadsheets has made maintaining this type of product structure information, as a matrix, easier; however, such a matrix is yet another separate document that must be maintained and cross-checked, and it will inevitably drift out of step with the BOMs.
    This thesis develops a product structure Relational BOM based on the matrix for the family of products. The processing power of the 1990s computer is fully utilised to derive the common parts for any or all of the selected products of the family. All product structure information is retained and the inter-relationship of parts is highly visible. Physical maintenance of the BOMs is simple, and the BOM serves all purposes without the need for supplementary information. It is fully integrated into the Sales Order Entry, MRP, Costing, Engineering Design and Computer Aided Manufacturing (CAM) systems. The technique has been proven as the only BOM system used in one electronics design and manufacturing organisation for over a year without any major problems. As described in Section 1.6, user satisfaction has been high: the users' response to the suggestion "let's buy an off-the-shelf package" is very negative, as such a package would not incorporate this BOM system, and users have expressed the opinion that EXICOM could not continue, with present staffing levels, using the traditional BOM structure.
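    To make the matrix idea concrete, here is a tiny illustrative sketch in Python (the part numbers, reference designators and product names are invented; the thesis's actual system is far richer). Rows of the matrix identify a part at a specific location, so common parts can be derived for any selection of family members without losing placement information:

        # (part_no, location) -> quantity used in each product of the family
        FAMILY = ["PROD-A", "PROD-B", "PROD-C"]
        BOM_MATRIX = {
            ("123", "R1-R6"):  {"PROD-A": 6, "PROD-B": 6, "PROD-C": 6},  # truly common
            ("123", "R7-R12"): {"PROD-A": 6, "PROD-B": 0, "PROD-C": 6},  # option-dependent
            ("456", "C1"):     {"PROD-A": 1, "PROD-B": 1, "PROD-C": 0},
        }

        def common_parts(products):
            """Parts (with locations) used identically by every selected product."""
            return [(part, loc) for (part, loc), use in BOM_MATRIX.items()
                    if len({use[p] for p in products}) == 1 and any(use[p] for p in products)]

        def product_bom(product):
            """Flatten the matrix into a conventional single-product parts list."""
            return [(part, loc, use[product])
                    for (part, loc), use in BOM_MATRIX.items() if use[product]]

        print(common_parts(["PROD-A", "PROD-C"]))  # [('123', 'R1-R6'), ('123', 'R7-R12')]
        print(common_parts(FAMILY))                # [('123', 'R1-R6')]
        print(product_bom("PROD-B"))               # [('123', 'R1-R6', 6), ('456', 'C1', 1)]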

    Neural Networks for Complex Data

    Artificial neural networks are simple and efficient machine learning tools. Originally defined in the traditional setting of simple vector data, neural network models have evolved to address more and more of the difficulties raised by complex real-world problems, ranging from time-evolving data to sophisticated data structures such as graphs and functions. This paper summarizes advances on these themes over the last decade, with a focus on results obtained by members of the SAMM team of Université Paris.

    On-line relational SOM for dissimilarity data

    In some applications, and in order to better address real-world situations, data may be more complex than simple vectors; in some cases, they may be known only through their pairwise dissimilarities. Several variants of the Self Organizing Map algorithm have been introduced to generalize the original algorithm to this framework. Whereas median SOM is based on a rough representation of the prototypes, relational SOM represents these prototypes as virtual combinations of all elements in the data set. This latter approach, however, suffers from two main drawbacks. First, its complexity can be large. Second, only a batch version of this algorithm has been studied so far, and it often yields results with poor topographic organization. In this article, an on-line version of relational SOM is described and justified. The algorithm is tested on several data sets, including categorical data and graphs, and compared with the batch version and with other SOM algorithms for non-vector data.
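    For reference, the identity behind the relational representation (standard in the relational SOM literature; exact when the dissimilarities are squared Euclidean distances and taken as a definition otherwise) and the on-line update can be written, in LaTeX notation, as

        p_k = \sum_i \beta_{ki} x_i, \qquad \sum_i \beta_{ki} = 1,

        \| x_j - p_k \|^2 = (D \beta_k)_j - \tfrac{1}{2}\, \beta_k^\top D \beta_k,

        \beta_k^{t+1} = \beta_k^{t} + \varepsilon(t)\, h^{t}\big(k, b(x_j)\big)\, \big(\mathbf{1}_j - \beta_k^{t}\big),

    where D is the dissimilarity matrix, b(x_j) the best matching unit of the current observation, h^t the neighborhood kernel on the map, and \mathbf{1}_j the indicator vector of observation j.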

    Curbing domestic violence: instantiating C-K theory with formal concept analysis and emergent self organizing maps.

    In this paper we propose a human-centered process for knowledge discovery from unstructured text that makes use of Formal Concept Analysis and Emergent Self Organizing Maps. The knowledge discovery process is conceptualized and interpreted as successive iterations through the Concept-Knowledge (C-K) theory design square. To illustrate its effectiveness, we report on a real-life case study of using the process at the Amsterdam-Amstelland police in the Netherlands, aimed at distilling concepts for identifying domestic violence from the unstructured text of actual police reports. The case study allows us to show how the process was not only able to uncover the nature of a phenomenon such as domestic violence, but also enabled analysts to identify many types of anomalies in the practice of policing. We illustrate how the insights obtained from this exercise resulted in major improvements in the management of domestic violence cases.
    Keywords: Formal concept analysis; Emergent self-organizing map; C-K theory; Text mining; Actionable knowledge discovery; Domestic violence
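    As a concrete illustration of the Formal Concept Analysis ingredient, the sketch below (Python, on a made-up toy incidence relation between reports and indicator terms; the case study worked on real police report text) enumerates all formal concepts of a small context by closing attribute subsets under the two derivation operators:

        from itertools import combinations

        # Toy incidence relation: which (hypothetical) indicator terms appear
        # in which police reports; real attributes would come from report text.
        REPORTS = {
            "rpt1": {"violence", "partner", "home"},
            "rpt2": {"violence", "partner"},
            "rpt3": {"violence", "stranger", "street"},
        }
        ATTRS = sorted(set().union(*REPORTS.values()))

        def extent(terms):
            # All reports containing every term in `terms`.
            return {r for r, attrs in REPORTS.items() if terms <= attrs}

        def intent(reports):
            # All terms shared by every report in `reports`.
            if not reports:
                return set(ATTRS)
            return set.intersection(*(REPORTS[r] for r in reports))

        # A formal concept is a pair (extent, intent) closed under both
        # derivations; every intent is the closure of some attribute subset,
        # so enumerating closures of all subsets recovers the concept lattice.
        concepts = set()
        for size in range(len(ATTRS) + 1):
            for subset in combinations(ATTRS, size):
                e = extent(set(subset))
                concepts.add((frozenset(e), frozenset(intent(e))))

        for e, i in sorted(concepts, key=lambda c: -len(c[0])):
            print(sorted(e), "<->", sorted(i))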

    Using SOMbrero for graph clustering and visualization

    Graphs have attracted a great deal of attention in recent years, with applications in the social sciences, biology, and computer science. In the present paper, we illustrate how self-organizing maps (SOM) can be used to shed light on the structure of a graph, clustering its vertices while producing a simplified visualization of the graph. In particular, we present the R package SOMbrero, which implements a stochastic version of the so-called relational algorithm: the method can process any dissimilarity data, and several dissimilarities adapted to graphs are described and compared. The use of the package is illustrated on two real-world data sets: the first, included in the package itself, is small enough to allow a full investigation of how the choice of the dissimilarity measuring proximity between vertices influences the results; the second comes from an application in biology and is based on a large bipartite graph of chemical reactions with several thousand vertices.
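    As an example of a dissimilarity adapted to graphs, the sketch below (Python; the example graph is invented) computes the shortest-path, i.e. hop-count, dissimilarity matrix of an unweighted graph by breadth-first search. A matrix of this kind is the sort of input a relational SOM, such as the relational algorithm implemented in SOMbrero, can consume:

        from collections import deque
        import numpy as np

        def shortest_path_dissimilarity(adj):
            """Hop-count dissimilarity matrix of an unweighted graph.

            adj maps each vertex index 0..n-1 to a list of neighbour indices.
            Unreachable pairs keep the value inf.
            """
            n = len(adj)
            D = np.full((n, n), np.inf)
            for s in range(n):                 # one BFS per source vertex
                D[s, s] = 0.0
                queue = deque([s])
                while queue:
                    u = queue.popleft()
                    for v in adj[u]:
                        if np.isinf(D[s, v]):
                            D[s, v] = D[s, u] + 1.0
                            queue.append(v)
            return D

        # Tiny example: a path 0-1-2 with a pendant vertex 3 attached to 2.
        adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
        print(shortest_path_dissimilarity(adj))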

    Which dissimilarity is to be used when extracting typologies in sequence analysis? A comparative study

    Originally developed in bioinformatics, sequence analysis is increasingly used in the social sciences to study life-course processes. The methodology generally employed consists of computing dissimilarities between trajectories and, when typologies are sought, clustering the trajectories according to their similarities. The choice of an appropriate dissimilarity measure is a major issue in sequence analysis for life sequences. Several dissimilarities are available in the literature, but none of them has emerged as the undisputed standard. In this paper, instead of settling on a single dissimilarity measure, we propose to use an optimal convex combination of different dissimilarities. The optimality is determined automatically by the clustering procedure and is defined with respect to the within-class variance.
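    The within-class variance criterion can be computed from dissimilarities alone. Below is a small Python sketch (the toy data and the coarse grid over weight vectors are purely illustrative; the paper tunes the combination automatically within the clustering procedure) using the standard identity that the within-class sum equals, per cluster, the sum of pairwise dissimilarities divided by twice the cluster size, which is exact for squared Euclidean distances:

        import numpy as np

        def within_class_variance(D, labels):
            # Sum over clusters C of (1 / (2|C|)) * sum_{i,j in C} D[i, j].
            # Exact when D holds squared Euclidean distances; a definition otherwise.
            total = 0.0
            for c in np.unique(labels):
                idx = np.flatnonzero(labels == c)
                total += D[np.ix_(idx, idx)].sum() / (2 * len(idx))
            return total

        def combined(Ds, alpha):
            # Convex combination of the candidate dissimilarity matrices.
            return sum(a * Dd for a, Dd in zip(alpha, Ds))

        # Toy comparison of candidate weightings on synthetic data.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(10, 2))
        D1 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared Euclidean
        D2 = np.abs(X[:, None, 0] - X[None, :, 0])           # first-coordinate gap
        labels = (X[:, 0] > 0).astype(int)
        for a in (0.0, 0.5, 1.0):
            w = within_class_variance(combined([D1, D2], [a, 1.0 - a]), labels)
            print(f"alpha = ({a:.1f}, {1 - a:.1f}) -> within-class variance {w:.3f}")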