11,874 research outputs found

    Relational visual cluster validity

    Get PDF
    The assessment of cluster validity plays a very important role in cluster analysis. Most commonly used cluster validity methods are based on statistical hypothesis testing or finding the best clustering scheme by computing a number of different cluster validity indices. A number of visual methods of cluster validity have been produced to display directly the validity of clusters by mapping data into two- or three-dimensional space. However, these methods may lose too much information to correctly estimate the results of clustering algorithms. Although the visual cluster validity (VCV) method of Hathaway and Bezdek can successfully solve this problem, it can only be applied for object data, i.e. feature measurements. There are very few validity methods that can be used to analyze the validity of data where only a similarity or dissimilarity relation exists – relational data. To tackle this problem, this paper presents a relational visual cluster validity (RVCV) method to assess the validity of clustering relational data. This is done by combining the results of the non-Euclidean relational fuzzy c-means (NERFCM) algorithm with a modification of the VCV method to produce a visual representation of cluster validity. RVCV can cluster complete and incomplete relational data and adds to the visual cluster validity theory. Numeric examples using synthetic and real data are presente

    Machine-Part cell formation through visual decipherable clustering of Self Organizing Map

    Full text link
    Machine-part cell formation is used in cellular manufacturing in order to process a large variety, quality, lower work in process levels, reducing manufacturing lead-time and customer response time while retaining flexibility for new products. This paper presents a new and novel approach for obtaining machine cells and part families. In the cellular manufacturing the fundamental problem is the formation of part families and machine cells. The present paper deals with the Self Organising Map (SOM) method an unsupervised learning algorithm in Artificial Intelligence, and has been used as a visually decipherable clustering tool of machine-part cell formation. The objective of the paper is to cluster the binary machine-part matrix through visually decipherable cluster of SOM color-coding and labelling via the SOM map nodes in such a way that the part families are processed in that machine cells. The Umatrix, component plane, principal component projection, scatter plot and histogram of SOM have been reported in the present work for the successful visualization of the machine-part cell formation. Computational result with the proposed algorithm on a set of group technology problems available in the literature is also presented. The proposed SOM approach produced solutions with a grouping efficacy that is at least as good as any results earlier reported in the literature and improved the grouping efficacy for 70% of the problems and found immensely useful to both industry practitioners and researchers.Comment: 18 pages,3 table, 4 figure

    Intelligent computational sketching support for conceptual design

    Get PDF
    Sketches, with their flexibility and suggestiveness, are in many ways ideal for expressing emerging design concepts. This can be seen from the fact that the process of representing early designs by free-hand drawings was used as far back as in the early 15th century [1]. On the other hand, CAD systems have become widely accepted as an essential design tool in recent years, not least because they provide a base on which design analysis can be carried out. Efficient transfer of sketches into a CAD representation, therefore, is a powerful addition to the designers' armoury.It has been pointed out by many that a pen-on-paper system is the best tool for sketching. One of the crucial requirements of a computer aided sketching system is its ability to recognise and interpret the elements of sketches. 'Sketch recognition', as it has come to be known, has been widely studied by people working in such fields: as artificial intelligence to human-computer interaction and robotic vision. Despite the continuing efforts to solve the problem of appropriate conceptual design modelling, it is difficult to achieve completely accurate recognition of sketches because usually sketches implicate vague information, and the idiosyncratic expression and understanding differ from each designer

    Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant Self-Organising Maps

    Full text link
    With the advent of large scale surveys the manual analysis and classification of individual radio source morphologies is rendered impossible as existing approaches do not scale. The analysis of complex morphological features in the spatial domain is a particularly important task. Here we discuss the challenges of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project and introduce a proper transfer mechanism via quantile random forest regression. By using parallelized rotation and flipping invariant Kohonen-maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST radio continuum and WISE infrared all sky surveys are first projected down to a two-dimensional embedding in an unsupervised way. This embedding can be seen as a discretised space of shapes with the coordinates reflecting morphological features as expressed by the automatically derived prototypes. We find that these prototypes have reconstructed physically meaningful processes across two channel images at radio and infrared wavelengths in an unsupervised manner. In the second step, images are compared with those prototypes to create a heat-map, which is the morphological fingerprint of each object and the basis for transferring the user generated labels. These heat-maps have reduced the feature space by a factor of 248 and are able to be used as the basis for subsequent ML methods. Using an ensemble of decision trees we achieve upwards of 85.7% and 80.7% accuracy when predicting the number of components and peaks in an image, respectively, using these heat-maps. We also question the currently used discrete classification schema and introduce a continuous scale that better reflects the uncertainty in transition between two classes, caused by sensitivity and resolution limits

    Local feature weighting in nearest prototype classification

    Get PDF
    The distance metric is the corner stone of nearest neighbor (NN)-based methods, and therefore, of nearest prototype (NP) algorithms. That is because they classify depending on the similarity of the data. When the data is characterized by a set of features which may contribute to the classification task in different levels, feature weighting or selection is required, sometimes in a local sense. However, local weighting is typically restricted to NN approaches. In this paper, we introduce local feature weighting (LFW) in NP classification. LFW provides each prototype its own weight vector, opposite to typical global weighting methods found in the NP literature, where all the prototypes share the same one. Providing each prototype its own weight vector has a novel effect in the borders of the Voronoi regions generated: They become nonlinear. We have integrated LFW with a previously developed evolutionary nearest prototype classifier (ENPC). The experiments performed both in artificial and real data sets demonstrate that the resulting algorithm that we call LFW in nearest prototype classification (LFW-NPC) avoids overfitting on training data in domains where the features may have different contribution to the classification task in different areas of the feature space. This generalization capability is also reflected in automatically obtaining an accurate and reduced set of prototypes.Publicad

    A systematic review of data quality issues in knowledge discovery tasks

    Get PDF
    Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust
    corecore