    Self-tune linear adaptive-genetic algorithm for feature selection

    Genetic algorithms (GAs) are an established machine learning technique for heuristic optimisation. However, this natural-selection-based technique is prone to premature convergence, especially to local optima. The stagnant performance stems from low population diversity and fixed genetic-operator settings. Therefore, an adaptive algorithm, the Self-Tune Linear Adaptive GA (STLA-GA), is presented to avoid suboptimal solutions in feature selection case studies. STLA-GA tunes the mutation probability rate, population size, maximum generation number and a novel convergence threshold, while simultaneously updating the stopping criteria through an exploration-exploitation cycle. The exploration-exploitation cycle embedded in STLA-GA is a function of the latest classifier performance. Compared to standard feature selection practice, the proposed STLA-GA delivers multi-fold benefits: it escapes local optima, yields higher feature-subset reduction rates, removes manual parameter tuning, eliminates premature convergence, and avoids the excessive computational cost caused by unstable parameter-tuning feedback.
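The abstract gives no pseudocode; purely as an illustration of the exploration-exploitation idea it describes, here is a minimal sketch of a GA for feature selection whose mutation rate is raised when fitness stagnates and lowered when it improves. The toy fitness function, thresholds and multipliers are assumptions for the sketch, not the authors' STLA-GA.

```python
import random

def fitness(mask):
    # Toy objective: reward matching a hidden "relevant feature" pattern.
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(1 for m, t in zip(mask, target) if m == t)

def adaptive_ga(n_features=8, pop_size=20, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    mut_rate, best, stagnant = 0.05, None, 0
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best, stagnant = pop[0][:], 0
            mut_rate = max(0.05, mut_rate * 0.9)   # improving: exploit, cool mutation down
        else:
            stagnant += 1
            if stagnant >= 3:                      # stagnating: explore, heat mutation up
                mut_rate = min(0.5, mut_rate * 2)
        # Elitist selection: keep the top half, breed the rest by one-point crossover.
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mut_rate) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return best

best = adaptive_ga()
print(best, fitness(best))
```

The adaptive step replaces the fixed mutation probability that the abstract identifies as a cause of premature convergence.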

    A lexicographic multi-objective genetic algorithm for multi-label correlation-based feature selection

    This paper proposes a new Lexicographic multi-objective Genetic Algorithm for Multi-Label Correlation-based Feature Selection (LexGA-ML-CFS), an extension of the previous single-objective Genetic Algorithm for Multi-Label Correlation-based Feature Selection (GA-ML-CFS). The extension uses a LexGA as a global search method for generating candidate feature subsets. In our experiments, we compare the results obtained by LexGA-ML-CFS with those obtained by the original hill-climbing-based ML-CFS, the single-objective GA-ML-CFS and a baseline Binary Relevance method, using ML-kNN as the multi-label classifier. The results show that LexGA-ML-CFS improved predictive accuracy over the other methods in some cases, but in general there was no statistically significant difference between the results of LexGA-ML-CFS and the other methods.
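A lexicographic multi-objective approach ranks objectives by priority instead of aggregating them into one score: a lower-priority objective only breaks ties on higher-priority ones. A minimal sketch of such a comparison follows; the tolerance value and the objective names in the usage example are illustrative assumptions, not the paper's exact criteria.

```python
def lex_better(a, b, tol=1e-6):
    """Return True if candidate a beats b lexicographically.

    Each candidate is a tuple of objective values in priority order,
    all to be maximised. A lower-priority objective is consulted only
    when the higher-priority values are within tol of each other.
    """
    for va, vb in zip(a, b):
        if va > vb + tol:
            return True
        if vb > va + tol:
            return False
    return False  # equal on every objective

# Hypothetical candidates: (predictive accuracy, feature-reduction rate)
assert lex_better((0.90, 0.2), (0.85, 0.9))   # accuracy dominates reduction
assert lex_better((0.90, 0.6), (0.90, 0.2))   # accuracy tied -> reduction decides
```

The tolerance matters in practice: without it, negligible accuracy differences would prevent the secondary objective from ever being used.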

    Multi-objective variable subset selection using heterogeneous surrogate modeling and sequential design


    The detection of globular clusters in galaxies as a data mining problem

    We present an application of self-adaptive supervised learning classifiers derived from the Machine Learning paradigm to the identification of candidate Globular Clusters in deep, wide-field, single-band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application were tested and compared on the NGC1399 HST data described in Paolillo 2011. The best results were obtained using a Multi Layer Perceptron with a Quasi-Newton learning rule, which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and 1.6% contamination. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by 5%. It is also shown that the method is capable of retrieving extreme sources (for instance, very extended objects) which are missed by more traditional approaches.
    Comment: Accepted 2011 December 12; Received 2011 November 28; in original form 2011 October 1
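Completeness and contamination as quoted above are standard source-catalogue metrics derived from a confusion matrix. A minimal sketch of how they are computed; the counts below are made-up illustrations, not the NGC1399 results.

```python
def catalogue_metrics(tp, fp, fn, tn):
    """Catalogue quality from confusion-matrix counts.

    completeness:  fraction of true clusters that were recovered, TP/(TP+FN)
    contamination: fraction of the output catalogue that is spurious, FP/(TP+FP)
    accuracy:      fraction of all objects classified correctly
    """
    completeness = tp / (tp + fn)
    contamination = fp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return completeness, contamination, accuracy

# Hypothetical counts for a 1000-object field
c, k, a = catalogue_metrics(tp=489, fp=8, fn=11, tn=492)
print(round(c, 3), round(k, 3), round(a, 3))  # 0.978 0.016 0.981
```

Note that completeness and contamination answer different questions: a classifier can reach high completeness simply by accepting everything, at the cost of high contamination, which is why the two are reported together.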