97,837 research outputs found

    A Preliminary Study on the Use of Fuzzy Rough Set Based Feature Selection for Improving Evolutionary Instance Selection Algorithms

    In recent years, the increasing interest in fuzzy rough set theory has allowed the definition of novel, accurate methods for feature selection. Although their stand-alone application can lead to the construction of high quality classifiers, they can be improved even more if other preprocessing techniques, such as instance selection, are considered. With the aim of enhancing the nearest neighbor classifier, we present a hybrid algorithm for instance and feature selection, where evolutionary search in the instances’ space is combined with a fuzzy rough set based feature selection procedure. The preliminary results, contrasted through nonparametric statistical tests, suggest that our proposal can greatly improve on the performance of the preprocessing techniques applied in isolation. Project TIN2008-06681-C06-01. Spanish Ministry of Education. Research Foundation - Flander

    Feature technology and its applications in computer integrated manufacturing

    A Thesis submitted for the degree of Doctor of Philosophy of University of Luton. Computer aided design and manufacturing (CAD/CAM) has been a focal research area for the manufacturing industry. Genuine CAD/CAM integration is necessary to make products of higher quality with lower cost and shorter lead times. Although CAD and CAM have been extensively used in industry, effective CAD/CAM integration has not been implemented. The major obstacles to CAD/CAM integration are the representation of design and process knowledge and the adaptive ability of computer aided process planning (CAPP). This research aims to develop a feature-based CAD/CAM integration methodology. Artificial intelligence techniques such as neural networks, heuristic algorithms, genetic algorithms and fuzzy logic are used to tackle these problems. The activities considered include: 1) Component design based on a number of standard feature classes with validity check. A feature classification for machining application is defined adopting ISO 10303-STEP AP224 from a multi-viewpoint of design and manufacture. 2) Search of interacting features and identification of feature relationships. A heuristic algorithm has been proposed in order to resolve interacting features. The algorithm analyses the interacting entity between each feature pair, making the process simpler and more efficient. 3) Recognition of new features formed by interacting features. A novel neural network-based technique for feature recognition has been designed, which solves the problems of ambiguity and overlaps. 4) Production of a feature based model for the component. 5) Generation of a suitable process plan covering selection of machining operations, grouping of machining operations and process sequencing. A hybrid feature-based CAPP has been developed using neural network, genetic algorithm and fuzzy evaluating techniques.

    Two-stage hybrid feature selection algorithms for diagnosing erythemato-squamous diseases

    This paper proposes two-stage hybrid feature selection algorithms to build stable and efficient diagnostic models, and introduces a new accuracy measure to assess the models. The two-stage hybrid algorithms adopt Support Vector Machines (SVM) as a classification tool, and the extended Sequential Forward Search (SFS), Sequential Forward Floating Search (SFFS), and Sequential Backward Floating Search (SBFS), respectively, as search strategies, and the generalized F-score (GF) to evaluate the importance of each feature. The new accuracy measure is used as the criterion to evaluate the performance of a temporary SVM that directs the feature selection algorithms. These hybrid methods combine the advantages of filters and wrappers to select the optimal feature subset from the original feature set and so build stable and efficient classifiers. To get stable, statistical and optimal classifiers, we conduct 10-fold cross validation experiments in the first stage; then, for each algorithm, we merge the 10 feature subsets selected in the 10-fold cross validation experiments into the new full feature set on which feature selection is performed in the second stage. We repeat each hybrid feature selection algorithm in the second stage on the one fold that obtained the best result in the first stage. Experimental results show that our proposed two-stage hybrid feature selection algorithms can construct efficient diagnostic models that achieve better accuracy than those built by the corresponding hybrid feature selection algorithms without the second stage feature selection procedure. Furthermore, our methods achieve better classification accuracy than the available algorithms for diagnosing erythemato-squamous diseases.
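
The wrapper search named in the abstract above can be illustrated with a minimal sketch of plain Sequential Forward Search. This is not the authors' extended SFS: the function name `sequential_forward_search` and the pluggable `score` callable are hypothetical, and the toy scoring function in the usage lines stands in for the cross-validated SVM accuracy that the paper uses as its criterion.

```python
from typing import Callable, List, Optional


def sequential_forward_search(
    n_features: int,
    score: Callable[[List[int]], float],
    max_features: Optional[int] = None,
) -> List[int]:
    """Greedy forward selection over feature indices 0..n_features-1.

    `score` is an assumed callback that rates a candidate subset
    (higher is better), e.g. the validation accuracy of a classifier
    trained on those features only.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    limit = n_features if max_features is None else max_features

    while remaining and len(selected) < limit:
        best_candidate = None
        # Try adding each unused feature and remember the best strict improvement.
        for f in sorted(remaining):
            s = score(selected + [f])
            if s > best_score:
                best_score, best_candidate = s, f
        if best_candidate is None:
            break  # no single feature improves the score any further
        selected.append(best_candidate)
        remaining.remove(best_candidate)
    return selected


# Toy criterion (an assumption for illustration): features 1 and 3 are
# informative, and every selected feature carries a small cost.
def toy_score(subset: List[int]) -> float:
    return len(set(subset) & {1, 3}) - 0.1 * len(subset)


chosen = sequential_forward_search(4, toy_score)
```

In a real wrapper, `score` would train and validate a classifier on each candidate subset, which is what makes wrapper methods more expensive than filter rankings such as the F-score.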

    A novel approach to data mining using simplified swarm optimization

    Data mining has become an increasingly important approach to deal with the rapid growth of data collected and stored in databases. In data mining, data classification and feature selection are considered the two main factors that drive people when making decisions. However, existing traditional data classification and feature selection techniques used in data management are no longer enough for such massive data. This deficiency has prompted the need for a new intelligent data mining technique based on stochastic population-based optimization that could discover useful information from data. In this thesis, a novel Simplified Swarm Optimization (SSO) algorithm is proposed as a rule-based classifier and for feature selection. SSO is a simplified Particle Swarm Optimization (PSO) that has a self-organising ability to emerge in highly distributed control problem space, and is flexible, robust and cost effective to solve complex computing environments. The proposed SSO classifier has been implemented to classify audio data. To the author’s knowledge, this is the first time that SSO and PSO have been applied for audio classification. Furthermore, two local search strategies, named Exchange Local Search (ELS) and Weighted Local Search (WLS), have been proposed to improve SSO performance. SSO-ELS has been implemented to classify the 13 benchmark datasets obtained from the UCI repository database. Meanwhile, SSO-WLS has been implemented in Anomaly-based Network Intrusion Detection System (A-NIDS). In A-NIDS, a novel hybrid SSO-based Rough Set (SSORS) for feature selection has also been proposed. The empirical analysis showed promising results with high classification accuracy rate achieved by all proposed techniques over audio data, UCI data and KDDCup 99 datasets. 
    Therefore, the proposed SSO rule-based classifier with local search strategies offers a new paradigm for solving complex data mining problems that other benchmark classifiers may not be able to solve.

    Hybrid Algorithms Based on Integer Programming for the Search of Prioritized Test Data in Software Product Lines

    In Software Product Lines (SPLs) it is not possible, in general, to test all products of the family. The number of products denoted by an SPL is very high due to the combinatorial explosion of features. For this reason, some coverage criteria have been proposed which try to test at least all feature interactions without the necessity to test all products, e.g., all pairs of features (pairwise coverage). In addition, it is desirable to first test products composed of a set of priority features. This problem is known as the Prioritized Pairwise Test Data Generation Problem. In this work we propose two hybrid algorithms using Integer Programming (IP) to generate a prioritized test suite. The first one is based on an integer linear formulation and the second one is based on an integer quadratic (nonlinear) formulation. We compare these techniques with two state-of-the-art algorithms, the Parallel Prioritized Genetic Solver (PPGS) and a greedy algorithm called prioritized-ICPL. Our study reveals that our hybrid nonlinear approach is clearly the best in both solution quality and computation time. Moreover, the nonlinear variant (the fastest one) is 27 and 42 times faster than PPGS in the two groups of instances analyzed in this work. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. Partially funded by the Spanish Ministry of Economy and Competitiveness and FEDER under contract TIN2014-57341-R, the University of Málaga, Andalucía Tech and the Spanish Network TIN2015-71841-REDT (SEBASENet)
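
The pairwise coverage criterion mentioned in the abstract above can be made concrete with a small sketch. It is a simplification under stated assumptions, not the paper's formulation: it ignores feature-model constraints and priorities, treats every selected/unselected combination of every feature pair as something to cover, and the names `covered_pairs` and `pairwise_coverage` are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Sequence, Set, Tuple

# A covered interaction: (feature_a, a_selected, feature_b, b_selected)
Interaction = Tuple[str, bool, str, bool]


def covered_pairs(products: Iterable[Set[str]], features: Sequence[str]) -> Set[Interaction]:
    """Collect every pairwise feature interaction exercised by the products.

    Each product is the set of features it selects; a feature absent
    from the set counts as unselected.
    """
    covered: Set[Interaction] = set()
    for product in products:
        for a, b in combinations(sorted(features), 2):
            covered.add((a, a in product, b, b in product))
    return covered


def pairwise_coverage(products: Iterable[Set[str]], features: Sequence[str]) -> float:
    """Fraction of all 4 * C(n, 2) selected/unselected pair combinations covered."""
    n = len(features)
    total = 4 * (n * (n - 1) // 2)
    if total == 0:
        return 1.0  # nothing to cover with fewer than two features
    return len(covered_pairs(products, features)) / total


features = ["A", "B", "C"]
suite = [{"A", "B"}, {"C"}]
coverage = pairwise_coverage(suite, features)
```

With three features there are 12 pair combinations; the two products in `suite` each cover 3 distinct ones, so `coverage` is 0.5. A test-suite generator would add products until this value reaches the required level, restricted to the combinations that the feature model actually allows.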