6,119 research outputs found

    Fuzzy clustering of univariate and multivariate time series by genetic multiobjective optimization

    Given a set of time series, it is of interest to discover subsets that share similar properties. For instance, this may be useful for identifying and estimating a single model that conveniently fits several time series, instead of performing the usual identification and estimation steps for each one. On the other hand, time series in the same cluster are related with respect to the measures assumed for cluster analysis and are suitable for building multivariate time series models. Though many approaches to clustering time series exist, in this view the most effective method seems to rely on choosing some features relevant to the problem at hand and seeking clusters according to their measurements, for instance the autoregressive coefficients, spectral measures or the eigenvectors of the covariance matrix. Some new indexes based on goodness-of-fit criteria are proposed in this paper for fuzzy clustering of multivariate time series. A general purpose fuzzy clustering algorithm may be used to estimate the proper cluster structure according to some internal criteria of cluster validity. Such indexes are known to measure distinct, often conflicting cluster properties: compactness or connectedness, for instance, or distribution, orientation, size and shape. It is argued that multiobjective optimization supported by genetic algorithms is a most effective choice in such a difficult context. In this paper we use the Xie-Beni index and the C-means functional as objective functions to evaluate the cluster validity in a multiobjective optimization framework. The concept of Pareto optimality in multiobjective genetic algorithms is used to evolve a set of potential solutions towards a set of optimal non-dominated solutions. Genetic algorithms are well suited for tackling difficult optimization problems where objective functions do not usually have good mathematical properties such as continuity, differentiability or convexity. In addition, genetic algorithms, as population-based methods, may yield a complete Pareto front at each step of the iterative evolutionary procedure. The method is illustrated by means of a set of real data and an artificial multivariate time series data set.
    Keywords: Fuzzy clustering, Internal criteria of cluster validity, Genetic algorithms, Multiobjective optimization, Time series, Pareto optimality
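As an illustration of the two objective functions named in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distances on feature vectors, a membership matrix `U` of shape (clusters, points), and the common generalization of the Xie-Beni index that reuses the fuzzifier `m` (the original index corresponds to `m = 2`). The function name `fcm_objectives` is hypothetical.

```python
import numpy as np

def fcm_objectives(X, centers, U, m=2.0):
    """Return (J_m, XB) for data X (n, d), centers (c, d), memberships U (c, n).

    J_m is the fuzzy c-means functional; XB is the Xie-Beni index.
    Both are to be minimized. Assumes at least two distinct centers.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    U = np.asarray(U, dtype=float)
    # Squared Euclidean distance from every center to every point: shape (c, n)
    d2 = ((centers[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Fuzzy c-means functional: sum over clusters i and points k of u_ik^m * d_ik^2
    j_m = float((U ** m * d2).sum())
    # Smallest squared distance between two distinct centers (separation)
    c2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(c2, np.inf)
    separation = c2.min()
    # Xie-Beni index: compactness divided by (n * separation)
    xb = j_m / (X.shape[0] * separation)
    return j_m, float(xb)
```

A multiobjective genetic algorithm would evaluate both values for each candidate partition and retain the non-dominated ones; how candidates are encoded and evolved is not specified in the abstract and is left out here.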

    A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

    Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proven appropriate for uncovering relationships between features that characterize objects in structural data. However, typical conceptual clustering approaches normally recover the most obvious relations, but fail to discover the less frequent but more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.
    Funding: Ministerio de Ciencia y Tecnología TIC-2003-00877; Ministerio de Ciencia y Tecnología BIO2004-0270E; Ministerio de Ciencia y Tecnología TIN2006-1287

    An Empirical Study of Cohesion and Coupling: Balancing Optimisation and Disruption

    Search-based software engineering has been extensively applied to the problem of finding improved modular structures that maximise cohesion and minimise coupling. However, there has, hitherto, been no longitudinal study of developers’ implementations over a series of sequential releases. Moreover, results validating whether developers respect the fitness functions are scarce, and the potentially disruptive effect of search-based remodularisation is usually overlooked. We present an empirical study of 233 sequential releases of 10 different systems; the largest empirical study reported in the literature so far, and the first longitudinal study. Our results provide evidence that developers do, indeed, respect the fitness functions used to optimise cohesion/coupling (they are statistically significantly better than arbitrary choices with p << 0.01), yet they also leave considerable room for further improvement (cohesion/coupling can be improved by 25% on average). However, we also report that optimising the structure is highly disruptive (on average more than 57% of the structure must change), while our results reveal that developers tend to avoid such disruption. Therefore, we introduce and evaluate a multi-objective evolutionary approach that minimises disruption while maximising cohesion/coupling improvement. This allows developers to balance reticence to disrupt existing modular structure against their competing need to improve cohesion and coupling. The multi-objective approach is able to find modular structures that improve the cohesion of developers’ implementations by 22.52%, while causing an acceptably low level of disruption (within that already tolerated by developers).

    Comparison of Direct Multiobjective Optimization Methods for the Design of Electric Vehicles

    "System design oriented methodologies" are discussed in this paper through the comparison of multiobjective optimization methods applied to heterogeneous devices in electrical engineering. To avoid computing derivatives of the criteria functions, direct optimization algorithms are used. In particular, deterministic geometric methods such as the Hooke & Jeeves heuristic approach are compared with stochastic evolutionary algorithms (Pareto genetic algorithms). Different issues relating to convergence speed and robustness on mixed (continuous/discrete), constrained and multiobjective problems are discussed. A typical heterogeneous and multidisciplinary electrical engineering system is considered as a case study: the motor drive of an electric vehicle. Some results emphasize the capacity of each approach to facilitate system analysis and particularly to display couplings between optimization parameters, constraints, objectives and the driving mission.
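The abstract above names the Hooke & Jeeves method without describing it. The following is a textbook-style sketch of that derivative-free pattern search for a single unconstrained objective; it is not the paper's implementation, and it does not cover the mixed-variable, constrained or multiobjective handling the paper discusses. The function name `hooke_jeeves` and all parameter defaults are assumptions made for illustration.

```python
def hooke_jeeves(f, x0, step=1.0, shrink=0.5, tol=1e-6, max_iter=10_000):
    """Minimize f by exploratory and pattern moves, using no derivatives."""

    def explore(start, f_start, h):
        # Try +h and -h along each coordinate, keeping any improvement.
        x, fx = list(start), f_start
        for i in range(len(x)):
            for delta in (h, -h):
                trial = x.copy()
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx = trial, f_trial
                    break
        return x, fx

    base = [float(v) for v in x0]
    f_base = f(base)
    for _ in range(max_iter):
        if step < tol:
            break
        x, fx = explore(base, f_base, step)
        if fx >= f_base:
            # No coordinate move helped: refine the step size.
            step *= shrink
            continue
        # Pattern moves: keep extrapolating along the improving direction.
        while fx < f_base:
            pattern = [2 * xi - bi for xi, bi in zip(x, base)]
            base, f_base = x, fx
            x, fx = explore(pattern, f(pattern), step)
    return base, f_base
```

For example, `hooke_jeeves(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, [0.0, 0.0])` moves the base point to the minimizer of that quadratic using function evaluations only.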

    A hierarchical Mamdani-type fuzzy modelling approach with new training data selection and multi-objective optimisation mechanisms: A special application for the prediction of mechanical properties of alloy steels

    In this paper, a systematic data-driven fuzzy modelling methodology is proposed, which allows the construction of Mamdani fuzzy models considering both accuracy (precision) and transparency (interpretability) of fuzzy systems. The new methodology employs a fast hierarchical clustering algorithm to generate an initial fuzzy model efficiently; a training data selection mechanism is developed to identify appropriate and efficient data as learning samples; a high-performance Particle Swarm Optimisation (PSO) based multi-objective optimisation mechanism is developed to further improve the fuzzy model in terms of both the structure and the parameters; and a new tolerance analysis method is proposed to derive the confidence bands relating to the final elicited models. This proposed modelling approach is evaluated using two benchmark problems and is shown to outperform other modelling approaches. Furthermore, the proposed approach is successfully applied to complex high-dimensional modelling problems for manufacturing of alloy steels, using ‘real’ industrial data. These problems concern the prediction of the mechanical properties of alloy steels by correlating them with the heat treatment process conditions as well as the weight percentages of the chemical compositions.

    Recombination and Self-Adaptation in Multi-objective Genetic Algorithms

    This paper investigates the influence of recombination and self-adaptation in real-encoded Multi-Objective Genetic Algorithms (MOGAs). NSGA-II and SPEA2 are used as examples to characterize the efficiency of MOGAs in relation to various recombination operators. The blend crossover, the simulated binary crossover and the breeder genetic crossover are compared for both MOGAs on multi-objective problems from the literature. Finally, a self-adaptive recombination scheme is proposed to improve the robustness of MOGAs.
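Two of the recombination operators named in the abstract above can be sketched in their commonly published forms. This is an illustrative sketch only: the parameter defaults (`alpha=0.5`, `eta=15.0`), the function names, and the absence of variable bounds are assumptions, and the paper's own settings, the breeder genetic crossover and the self-adaptive scheme are not reproduced.

```python
import random

def blend_crossover(p1, p2, alpha=0.5, rng=random):
    """BLX-alpha: sample each gene uniformly from the parents' interval,
    widened on both sides by alpha times its length."""
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        span = hi - lo
        child.append(rng.uniform(lo - alpha * span, hi + alpha * span))
    return child

def sbx_crossover(p1, p2, eta=15.0, rng=random):
    """Simulated binary crossover (unbounded form): two children placed
    symmetrically around the parents' midpoint; larger eta keeps them closer."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        u = rng.random()  # u lies in [0, 1)
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        c1.append(0.5 * ((1 + beta) * a + (1 - beta) * b))
        c2.append(0.5 * ((1 - beta) * a + (1 + beta) * b))
    return c1, c2
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible. In the unbounded form shown, each pair of SBX children has the same gene-wise mean as the parents.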

    Improved sampling of the pareto-front in multiobjective genetic optimizations by steady-state evolution: a Pareto converging genetic algorithm

    Previous work on multiobjective genetic algorithms has focused on preventing genetic drift, and the issue of convergence has been given little attention. In this paper, we present a simple steady-state strategy, Pareto Converging Genetic Algorithm (PCGA), which naturally samples the solution space and ensures population advancement towards the Pareto-front. PCGA eliminates the need for sharing/niching and thus minimizes heuristically chosen parameters and procedures. A systematic approach based on histograms of rank is introduced for assessing convergence to the Pareto-front, which, by definition, is unknown in most real search problems. We argue that there is always a certain inheritance of genetic material belonging to a population, and there is unlikely to be any significant gain beyond some point; a stopping criterion for terminating the computation is suggested accordingly. To further encourage diversity and competition, a nonmigrating island model may optionally be used; this approach is particularly suited to many difficult (real-world) problems, which have a tendency to get stuck at (unknown) local minima. Results on three benchmark problems are presented and compared with those of earlier approaches. PCGA is found to produce diverse sampling of the Pareto-front without niching and with significantly less computational effort.
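The abstract above refers to Pareto ranks and histograms of rank without giving a definition. The following minimal sketch assumes minimization and a common ranking rule (one plus the number of individuals that dominate a solution); whether PCGA uses exactly this rule is not stated in the abstract, and the function names are hypothetical.

```python
from collections import Counter

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_ranks(objs):
    """Rank of each solution: 1 + number of solutions that dominate it.
    Non-dominated solutions therefore have rank 1."""
    return [1 + sum(dominates(other, p) for other in objs) for p in objs]

def rank_histogram(objs):
    """Map each rank to the number of solutions holding it, sorted by rank."""
    return dict(sorted(Counter(pareto_ranks(objs)).items()))
```

Under this rule, a population whose histogram is concentrated at rank 1 consists mostly of mutually non-dominated solutions, which is the kind of signal a rank-based convergence check would monitor.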