625 research outputs found

    MIMR-DGSA: unsupervised hyperspectral band selection based on information theory and a modified discrete gravitational search algorithm

    Get PDF
    Band selection plays an important role in hyperspectral data analysis as it can improve the performance of data analysis without losing information about the constitution of the underlying data. We propose a MIMR-DGSA algorithm for band selection by following the Maximum-Information-Minimum-Redundancy (MIMR) criterion that maximises the information carried by individual features of a subset and minimises redundant information between them. Subsets are generated with a modified Discrete Gravitational Search Algorithm (DGSA) where we definine a neighbourhood concept for feature subsets. A fast algorithm for pairwise mutual information calculation that incorporates variable bandwidths of hyperspectral bands called VarBWFastMI is also developed. Classification results on three hyperspectral remote sensing datasets show that the proposed MIMR-DGSA performs similar to the original MIMR with Clonal Selection Algorithm (CSA) but is computationally more efficient and easier to handle as it has fewer parameters for tuning

    Feature Selection Approach based on Firefly Algorithm and Chi-square

    Get PDF
    Dimensionality problem is a well-known challenging issue for most classifiers in which datasets have unbalanced number of samples and features. Features may contain unreliable data which may lead the classification process to produce undesirable results. Feature selection approach is considered a solution for this kind of problems. In this paperan enhanced firefly algorithm is proposed to serve as a feature selection solution for reducing dimensionality and picking the most informative features to be used in classification. The main purpose of the proposedmodel is to improve the classification accuracy through using the selected features produced from the model, thus classification errors will decrease. Modeling firefly in this research appears through simulating firefly position by cell chi-square value which is changed after every move, and simulating firefly intensity by calculating a set of different fitness functionsas a weight for each feature. K-nearest neighbor and Discriminant analysis are used as classifiers to test the proposed firefly algorithm in selecting features. Experimental results showed that the proposed enhanced algorithmbased on firefly algorithm with chi-square and different fitness functions can provide better results than others. Results showed that reduction of dataset is useful for gaining higher accuracy in classification

    Integrated bio-search approaches with multi-objective algorithms for optimization and classification problem

    Get PDF
    Optimal selection of features is very difficult and crucial to achieve, particularly for the task of classification. It is due to the traditional method of selecting features that function independently and generated the collection of irrelevant features, which therefore affects the quality of the accuracy of the classification. The goal of this paper is to leverage the potential of bio-inspired search algorithms, together with wrapper, in optimizing multi-objective algorithms, namely ENORA and NSGA-II to generate an optimal set of features. The main steps are to idealize the combination of ENORA and NSGA-II with suitable bio-search algorithms where multiple subset generation has been implemented. The next step is to validate the optimum feature set by conducting a subset evaluation. Eight (8) comparison datasets of various sizes have been deliberately selected to be checked. Results shown that the ideal combination of multi-objective algorithms, namely ENORA and NSGA-II, with the selected bio-inspired search algorithm is promising to achieve a better optimal solution (i.e. a best features with higher classification accuracy) for the selected datasets. This discovery implies that the ability of bio-inspired wrapper/filtered system algorithms will boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features

    Feature Selection for Document Classification : Case Study of Meta-heuristic Intelligence and Traditional Approaches

    Get PDF
    Doctor of Philosophy (Computer Engineering), 2020Nowadays, the culture for accessing news around the world is changed from paper to electronic format and the rate of publication for newspapers and magazines on website are increased dramatically. Meanwhile, text feature selection for the automatic document classification (ADC) is becoming a big challenge because of the unstructured nature of text feature, which is called “multi-dimension feature problem”. On the other hand, various powerful schemes dealing with text feature selection are being developed continuously nowadays, but there still exists a research gap for “optimization of feature selection problem (OFSP)”, which can be looked for the global optimal features. Meanwhile, the capacity of meta-heuristic intelligence for knowledge discovery process (KDP) is also become the critical role to overcome NP-hard problem of OFSP by providing effective performance and efficient computation time. Therefore, the idea of meta-heuristic based approach for optimization of feature selection is proposed in this research to search the global optimal features for ADC. In this thesis, case study of meta-heuristic intelligence and traditional approaches for feature selection optimization process in document classification is observed. It includes eleven meta-heuristic algorithms such as Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search, for searching the optimal feature subset for document classification. Then, the results of proposed model are compared with three traditional search algorithms like Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). In addition, the framework of data mining is applied. It involves data preprocessing, feature engineering, building learning model and evaluating the performance of proposed meta-heuristic intelligence-based feature selection using various performance and computation complexity evaluation schemes. In data processing, tokenization, stop-words handling, stemming and lemmatizing, and normalization are applied. In feature engineering process, n-gram TF-IDF feature extraction is used for implementing feature vector and both filter and wrapper approach are applied for observing different cases. In addition, three different classifiers like J48, NaÃŊve Bayes, and Support Vector Machine, are used for building the document classification model. According to the results, the proposed system can reduce the number of selected features dramatically that can deteriorate learning model performance. In addition, the selected global subset features can yield better performance than traditional search according to single objective function of proposed model

    Applied (Meta)-Heuristic in Intelligent Systems

    Get PDF
    Engineering and business problems are becoming increasingly difficult to solve due to the new economics triggered by big data, artificial intelligence, and the internet of things. Exact algorithms and heuristics are insufficient for solving such large and unstructured problems; instead, metaheuristic algorithms have emerged as the prevailing methods. A generic metaheuristic framework guides the course of search trajectories beyond local optimality, thus overcoming the limitations of traditional computation methods. The application of modern metaheuristics ranges from unmanned aerial and ground surface vehicles, unmanned factories, resource-constrained production, and humanoids to green logistics, renewable energy, circular economy, agricultural technology, environmental protection, finance technology, and the entertainment industry. This Special Issue presents high-quality papers proposing modern metaheuristics in intelligent systems

    Community Detection in Networks using Bio-inspired Optimization: Latest Developments, New Results and Perspectives with a Selection of Recent Meta-Heuristics

    Get PDF
    Detecting groups within a set of interconnected nodes is a widely addressed prob- lem that can model a diversity of applications. Unfortunately, detecting the opti- mal partition of a network is a computationally demanding task, usually conducted by means of optimization methods. Among them, randomized search heuristics have been proven to be efficient approaches. This manuscript is devoted to pro- viding an overview of community detection problems from the perspective of bio-inspired computation. To this end, we first review the recent history of this research area, placing emphasis on milestone studies contributed in the last five years. Next, we present an extensive experimental study to assess the performance of a selection of modern heuristics over weighted directed network instances. Specifically, we combine seven global search heuristics based on two different similarity metrics and eight heterogeneous search operators designed ad-hoc. We compare our methods with six different community detection techniques over a benchmark of 17 Lancichinetti-Fortunato-Radicchi network instances. Ranking statistics of the tested algorithms reveal that the proposed methods perform com- petitively, but the high variability of the rankings leads to the main conclusion: no clear winner can be declared. This finding aligns with community detection tools available in the literature that hinge on a sequential application of different algorithms in search for the best performing counterpart. We end our research by sharing our envisioned status of this area, for which we identify challenges and opportunities which should stimulate research efforts in years to come

    Anomaly Identification Model for Telecom Users Based on Machine Learning Model Fusion

    Get PDF
    With the development of economic globalization and modern information and communication technology, the situation of communication fraud is becoming more and more serious. How to identify fraudulent calls accurately and effectively has become an urgent task in current telecommunications operations. Affected by the sample set and the current state of the art, the current machine learning methods used to identify the imbalanced distribution dataset of positive and negative samples have low recognition accuracy. Therefore, in this paper, we propose a new hybrid model solution that uses feature construction, feature selection and imbalanced classes handling. A stacking model fusion algorithm composed of a two-layer stacking framework with several state-of-the-art machine learning classifiers is adopted. The results show that the risk user identification model based on mobile network communication behavior established by our stacking model fusion algorithm can accurately predict the category labels of telecom users and improve the risk of telecom users. The generalization performance of the identification is high, which provides a certain reference for the telecommunications industry to identify risk users based on mobile network communication behaviors
    • â€Ķ
    corecore