128,592 research outputs found

    Experiments in Clustering Homogeneous XML Documents to Validate an Existing Typology

    Get PDF
    This paper presents some experiments in clustering homogeneous XMLdocuments to validate an existing classification or more generally anorganisational structure. Our approach integrates techniques for extracting knowledge from documents with unsupervised classification (clustering) of documents. We focus on the feature selection used for representing documents and its impact on the emerging classification. We mix the selection of structured features with fine textual selection based on syntactic characteristics.We illustrate and evaluate this approach with a collection of Inria activity reports for the year 2003. The objective is to cluster projects into larger groups (Themes), based on the keywords or different chapters of these activity reports. We then compare the results of clustering using different feature selections, with the official theme structure used by Inria.Comment: (postprint); This version corrects a couple of errors in authors' names in the bibliograph

    Identifying smart design attributes for Industry 4.0 customization using a clustering Genetic Algorithm

    Get PDF
    Industry 4.0 aims at achieving mass customization at a mass production cost. A key component to realizing this is accurate prediction of customer needs and wants, which is however a challenging issue due to the lack of smart analytics tools. This paper investigates this issue in depth and then develops a predictive analytic framework for integrating cloud computing, big data analysis, business informatics, communication technologies, and digital industrial production systems. Computational intelligence in the form of a cluster k-means approach is used to manage relevant big data for feeding potential customer needs and wants to smart designs for targeted productivity and customized mass production. The identification of patterns from big data is achieved with cluster k-means and with the selection of optimal attributes using genetic algorithms. A car customization case study shows how it may be applied and where to assign new clusters with growing knowledge of customer needs and wants. This approach offer a number of features suitable to smart design in realizing Industry 4.0

    Multicriteria mapping manual: version 1.0

    Get PDF
    This Manual offers basic advice on how to do multicriteria mapping (MCM). It suggests how to: go about designing and building a typical MCM project; engage with participants and analyse results – and get the most out of the online MCM tool. Key terms are shown in bold italics and defined and explained in a final Annex. The online MCM software tool provides its own operational help. So this Manual is more focused on the general approach. There are no rigid rules. MCM is structured, but very flexible. It allows many more detailed features than can be covered here. MCM users are encouraged to think for themselves and be responsible and creative

    Ant colony optimization approach for the capacitated vehicle routing problem with simultaneous delivery and pick-up

    Get PDF
    We propose an Ant Colony Optimization (ACO) algorithm to the NPhard Vehicle Routing Problem with Simultaneous Delivery and Pick-up (VRPSDP). In VRPSDP, commodities are delivered to customers from a single depot utilizing a fleet of identical vehicles and empty packages are collected from the customers and transported back to the depot. The objective is to minimize the total distance traveled. The algorithm is tested with the well-known benchmark problems from the literature. The experimental study indicates that our approach produces comparable results to those of the benchmark problems in the literature

    Bag-Level Aggregation for Multiple Instance Active Learning in Instance Classification Problems

    Full text link
    A growing number of applications, e.g. video surveillance and medical image analysis, require training recognition systems from large amounts of weakly annotated data while some targeted interactions with a domain expert are allowed to improve the training process. In such cases, active learning (AL) can reduce labeling costs for training a classifier by querying the expert to provide the labels of most informative instances. This paper focuses on AL methods for instance classification problems in multiple instance learning (MIL), where data is arranged into sets, called bags, that are weakly labeled. Most AL methods focus on single instance learning problems. These methods are not suitable for MIL problems because they cannot account for the bag structure of data. In this paper, new methods for bag-level aggregation of instance informativeness are proposed for multiple instance active learning (MIAL). The \textit{aggregated informativeness} method identifies the most informative instances based on classifier uncertainty, and queries bags incorporating the most information. The other proposed method, called \textit{cluster-based aggregative sampling}, clusters data hierarchically in the instance space. The informativeness of instances is assessed by considering bag labels, inferred instance labels, and the proportion of labels that remain to be discovered in clusters. Both proposed methods significantly outperform reference methods in extensive experiments using benchmark data from several application domains. Results indicate that using an appropriate strategy to address MIAL problems yields a significant reduction in the number of queries needed to achieve the same level of performance as single instance AL methods
    • …
    corecore