10,812 research outputs found

    Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering

    Because common clustering algorithms use the vector space model (VSM) to represent documents, they ignore the conceptual relationships between related terms that do not co-occur literally. To overcome this problem, this article proposes a genetic algorithm-based clustering technique, named GA clustering, used in conjunction with an ontology. In general, ontology-based similarity measures fall into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and the broad-coverage taxonomy of WordNet as the thesaurus-based ontology. The corpus-based method, however, is rather complicated to handle in practical applications, so we propose a transformed latent semantic analysis (LSA) model as the corpus-based method. Moreover, two hybrid strategies that combine the various similarity measures are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based methods, outperforms the same algorithm with other similarity measures. The improvements in clustering performance also demonstrate the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA.

    Multimodel Approaches for Plasma Glucose Estimation in Continuous Glucose Monitoring. Development of New Calibration Algorithms

    Diabetes Mellitus (DM) embraces a group of metabolic diseases whose main characteristic is the presence of high glucose levels in the blood. It is one of the diseases with the greatest social and health impact, both for its prevalence and for the consequences of the chronic complications it implies. One way to improve the quality of life of people with diabetes is technical. It involves several lines of research, including the development and improvement of devices that estimate plasma glucose "online": continuous glucose monitoring systems (CGMS), both invasive and non-invasive. These devices estimate plasma glucose from sensor measurements taken in compartments other than blood. Current commercially available CGMS are minimally invasive and estimate plasma glucose from measurements in the interstitial fluid. CGMS is a key component of the technical approach to building the artificial pancreas, which aims at closing the loop in combination with an insulin pump. Yet the accuracy of current CGMS is still poor, and this may partly depend on the low performance of the implemented Calibration Algorithm (CA). In addition, the sensor-to-patient sensitivity differs between patients and also varies over time for the same patient. The development of new, efficient calibration algorithms for CGMS is therefore an interesting and challenging problem. The indirect measurement of plasma glucose through interstitial glucose is a main confounder of CGMS accuracy. Many components take part in the glucose transport dynamics. Indeed, physiology might suggest the existence of different local behaviors in the glucose transport process. For this reason, local modeling techniques may be the best option for the structure of the desired CA: similar input samples are represented by the same local model, and the final model of the whole data set is the integration of all local models, each weighted over the input region where it is valid. Clustering is t…
    Barceló Rico, F. (2012). Multimodel Approaches for Plasma Glucose Estimation in Continuous Glucose Monitoring. Development of New Calibration Algorithms [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17173

    Preliminary evidence on machine learning approaches for clusterizing students’ cognitive profile

    Assessing the cognitive abilities of students in academic contexts can give teachers valuable insights for identifying each student's cognitive profile and creating personalized teaching strategies. While numerous studies have demonstrated promising outcomes in clustering students based on their cognitive profiles, effective comparisons between clustering methods are lacking in the current literature. In this study, we compare the effectiveness of two clustering techniques for grouping students based on their cognitive abilities, including general intelligence, attention, visual perception, working memory, and phonological awareness. A total of 292 students, aged 11–15 years, participated in the study. A two-level approach based on the joint use of Kohonen's Self-Organizing Maps (SOMs) and the k-means clustering algorithm was compared with an approach based on the k-means clustering algorithm only. The resulting profiles were then predicted via the AdaBoost and ANN supervised algorithms. The results showed that the two-level approach provides the best clustering solution, while the ANN algorithm performed best on the classification problem. These results lay the foundations for developing a useful instrument for predicting students' cognitive profiles.

    Comparison of Direct Multiobjective Optimization Methods for the Design of Electric Vehicles

    "System design oriented methodologies" are discussed in this paper through the comparison of multiobjective optimization methods applied to heterogeneous devices in electrical engineering. Direct optimization algorithms are used, which avoids computing derivatives of the criteria functions. In particular, deterministic geometric methods such as the Hooke & Jeeves heuristic approach are compared with stochastic evolutionary algorithms (Pareto genetic algorithms). Different issues relating to convergence speed and robustness on mixed (continuous/discrete), constrained and multiobjective problems are discussed. A typical heterogeneous and multidisciplinary electrical engineering system is considered as a case study: the motor drive of an electric vehicle. Some results emphasize the capacity of each approach to facilitate system analysis and, in particular, to display couplings between optimization parameters, constraints, objectives and the driving mission.

    Dynamic Clustering of Histogram Data Based on Adaptive Squared Wasserstein Distances

    This paper deals with clustering methods based on adaptive distances for histogram data using a dynamic clustering algorithm. Histogram data describe individuals in terms of empirical distributions. This kind of data can be considered as a complex description of phenomena observed on complex objects: images, groups of individuals, spatially or temporally varying data, results of queries, environmental data, and so on. The Wasserstein distance is used to compare two histograms. The Wasserstein distance between histograms consists of two components: the first is based on the means, and the second on the internal dispersions (standard deviation, skewness, kurtosis, and so on) of the histograms. To cluster sets of histogram data, we propose to use the Dynamic Clustering Algorithm (based on adaptive squared Wasserstein distances), a k-means-like algorithm for clustering a set of individuals into K classes that are fixed a priori. The main aim of this research is to provide a tool for clustering histograms that emphasizes the different contributions of the histogram variables, and their components, to the definition of the clusters. We demonstrate that this can be achieved using adaptive distances. Two kinds of adaptive distances are considered: the first takes into account the variability of each component of each descriptor for the whole set of individuals; the second takes into account the variability of each component of each descriptor in each cluster. We furnish tools for interpreting the obtained partition, based on an extension of the classical measures (indexes) to the use of adaptive distances in the clustering criterion function. Applications on synthetic and real-world data corroborate the proposed procedure.
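
    The two-component reading of the squared Wasserstein distance described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each "histogram" is given as a raw sample of equal size (so the quantile functions can be compared by sorting), uses the L2 Wasserstein distance, and the function names `squared_wasserstein` and `decompose` are hypothetical.

```python
import numpy as np


def squared_wasserstein(x, y):
    """Squared L2 Wasserstein distance between two equal-size 1-D samples.

    For equal-size samples the empirical quantile functions are the sorted
    values, so the distance is the mean squared gap between them.
    """
    a = np.sort(np.asarray(x, dtype=float))
    b = np.sort(np.asarray(y, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean((a - b) ** 2))


def decompose(x, y):
    """Split the squared distance into a mean part and a dispersion part."""
    a = np.sort(np.asarray(x, dtype=float))
    b = np.sort(np.asarray(y, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes samples of equal size")
    # Component 1: squared difference between the two means.
    location = float((a.mean() - b.mean()) ** 2)
    # Component 2: distance between the centered quantile functions,
    # which captures differences in spread and shape only.
    dispersion = float(np.mean(((a - a.mean()) - (b - b.mean())) ** 2))
    return location, dispersion


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.0, scale=1.0, size=500)
    y = rng.normal(loc=2.0, scale=3.0, size=500)
    loc, disp = decompose(x, y)
    print("total:", squared_wasserstein(x, y))
    print("mean component:", loc, "dispersion component:", disp)
```

    The two components add up to the total distance, so an adaptive scheme of the kind the abstract describes could, in principle, weight them separately; how the weights are chosen is not specified here.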

    Multi-objective evolutionary fuzzy clustering for high-dimensional problems

    This paper deals with the application of unsupervised fuzzy clustering to high-dimensional data. Two problems are addressed: discovering the number of groups (clusters) and selecting features without loss of performance. In particular, we analyze the potential of a genetic fuzzy system, that is, the integration of a multi-objective evolutionary algorithm with a fuzzy clustering algorithm. The main characteristic of the integrated approach is its ability to handle the two problems at the same time, suggesting a Pareto set of trade-off solutions that could have a better chance of matching the real needs. We demonstrate high-quality clustering and feature selection results by applying our approach to a real-world data set.
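
    The notion of a Pareto set of trade-off solutions can be made concrete with a small sketch. This is only an illustration of Pareto dominance, not the evolutionary algorithm from the paper: it assumes every objective is to be minimized, and the example objectives (number of selected features, clustering error) and the function names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """Return True if solution a Pareto-dominates b (all objectives minimized).

    a dominates b when a is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical candidates: (number of selected features, clustering error).
    candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)]
    print(pareto_front(candidates))
```

    Each surviving point is a different compromise between the two objectives; picking one of them is left to whoever knows the real needs of the application.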