23 research outputs found

    Bi-Directional Feature Fixation-Based Particle Swarm Optimization for Large-Scale Feature Selection

    Feature selection, which aims to improve the classification accuracy and reduce the size of the selected feature subset, is an important but challenging optimization problem in data mining. Particle swarm optimization (PSO) has shown promising performance in tackling feature selection problems, but still faces challenges in dealing with large-scale feature selection in Big Data environments because of the large search space. Hence, this article proposes a bi-directional feature fixation (BDFF) framework for PSO and provides a novel idea to reduce the search space in large-scale feature selection. BDFF uses two opposite search directions to guide particles to adequately search for feature subsets with different sizes. Based on the two different search directions, BDFF can fix the selection states of some features and then focus on the others when updating particles, thus narrowing the large search space. Besides, a self-adaptive strategy is designed to help the swarm concentrate on a more promising direction for search in different stages of evolution and achieve a balance between exploration and exploitation. Experimental results on 12 widely-used public datasets show that BDFF can improve the performance of PSO on large-scale feature selection and obtain smaller feature subsets with higher classification accuracy.
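
The idea of fixing some selection states and searching only the rest can be sketched as follows. This is a minimal illustration, not the BDFF algorithm: the abstract does not say how features are chosen for fixation, so the `fixed` mask, the `select_prob` values, and the function name `update_with_fixation` are all hypothetical.

```python
import random


def update_with_fixation(position, select_prob, fixed, rng):
    """Resample only the unfixed bits of a binary feature mask.

    position    : list of 0/1 flags, one per feature
    select_prob : per-feature probability of selecting the feature (assumed given)
    fixed       : list of booleans; True means the selection state is frozen
    rng         : random.Random instance
    """
    new_position = []
    for bit, prob, is_fixed in zip(position, select_prob, fixed):
        if is_fixed:
            # Frozen feature: keep its current selection state.
            new_position.append(bit)
        else:
            # Free feature: select it with the given probability.
            new_position.append(1 if rng.random() < prob else 0)
    return new_position


rng = random.Random(0)
position = [1, 0, 1, 0, 1, 0]
fixed = [True, True, False, False, False, False]  # hypothetical fixation mask
select_prob = [0.5] * 6
updated = update_with_fixation(position, select_prob, fixed, rng)
```

With two of six features frozen, each update explores 2^4 rather than 2^6 candidate subsets, which is the sense in which fixation narrows the search space.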

    An intelligent decision support system for acute lymphoblastic leukaemia detection

    The morphological analysis of blood smear slides by haematologists or haematopathologists is one of the diagnostic procedures available to evaluate the presence of acute leukaemia. This operation is a complex and costly process, and often lacks standardized accuracy owing to a variety of factors, including insufficient expertise and operator fatigue. This research proposes an intelligent decision support system for automatic detection of acute lymphoblastic leukaemia (ALL) using microscopic blood smear images to overcome these barriers. The work has four main stages. (1) Firstly, a modified marker-controlled watershed algorithm integrated with the morphological operations is proposed for the segmentation of the membrane of the lymphocyte and lymphoblast cell images. The aim of this stage is to isolate a lymphocyte/lymphoblast cell membrane from touching and overlapping red blood cells, platelets and artefacts of the microscopic peripheral blood smear sub-images. (2) Secondly, a novel clustering algorithm with stimulating discriminant measure (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of the nucleus and cytoplasm of lymphocytic cell membranes. The SDM measures are used in conjunction with a Genetic Algorithm for the clustering of nucleus, cytoplasm, and background regions. (3) Thirdly, a total of eighty features consisting of shape, texture, and colour information from the nucleus and cytoplasm of the identified lymphocyte/lymphoblast images are extracted. (4) Finally, the proposed feature optimisation algorithm, namely a variant of Bare-Bones Particle Swarm Optimisation (BBPSO), is presented to identify the most significant discriminative characteristics of the nucleus and cytoplasm segmented by the SDM-based clustering algorithm. 
The proposed BBPSO variant algorithm incorporates Cuckoo Search, Dragonfly Algorithm, BBPSO, and local and global random walk operations of uniform combination, and Lévy flights to diversify the search and mitigate the premature convergence problem of the conventional BBPSO. In addition, it also employs subswarm concepts, self-adaptive parameters, and convergence degree monitoring mechanisms to enable fast convergence. The optimal feature subsets identified by the proposed algorithm are subsequently used for ALL detection and classification. The proposed system achieves the highest classification accuracy of 96.04% and significantly outperforms related meta-heuristic search methods and related research for ALL detection.
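
For orientation, the conventional bare-bones PSO that the abstract builds on replaces the velocity update with Gaussian sampling around the personal and global bests. The sketch below shows only that baseline step, not the proposed variant with its Cuckoo Search, Dragonfly, and Lévy flight operators; the function name `bare_bones_update` is an assumption made for illustration.

```python
import random


def bare_bones_update(pbest, gbest, rng):
    """Sample a new position as in conventional bare-bones PSO.

    Each dimension is drawn from a Gaussian whose mean is the midpoint of
    the personal best and global best, and whose standard deviation is
    their absolute difference.
    """
    new_position = []
    for p, g in zip(pbest, gbest):
        mean = (p + g) / 2.0
        std = abs(p - g)
        new_position.append(rng.gauss(mean, std))
    return new_position


rng = random.Random(42)
sample = bare_bones_update([0.0, 4.0], [2.0, 4.0], rng)
```

Because the standard deviation shrinks to zero as the personal and global bests coincide, the baseline can stall early, which is the premature convergence problem the abstract refers to.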

    Cooperative Particle Swarm Optimization for Combinatorial Problems

    A particularly successful line of research for numerical optimization is the well-known computational paradigm particle swarm optimization (PSO). In the PSO framework, candidate solutions are represented as particles that have a position and a velocity in a multidimensional search space. The direct representation of a candidate solution as a point that flies through hyperspace (i.e., R^n) seems to strongly predispose the PSO toward continuous optimization. However, while some attempts have been made towards developing PSO algorithms for combinatorial problems, these techniques usually encode candidate solutions as permutations instead of points in search space and rely on additional local search algorithms. In this dissertation, I present extensions to PSO that, by incorporating a cooperative strategy, allow the PSO to solve combinatorial problems. The central hypothesis is that by allowing a set of particles, rather than one single particle, to represent a candidate solution, combinatorial problems can be solved by collectively constructing solutions. The cooperative strategy partitions the problem into components where each component is optimized by an individual particle. Particles move in continuous space and communicate through a feedback mechanism. This feedback mechanism guides them in the assessment of their individual contribution to the overall solution. Three new PSO-based algorithms are proposed. Shared-space CCPSO and multispace CCPSO provide two new cooperative strategies to split the combinatorial problem, and both models are tested on proven NP-hard problems. Multimodal CCPSO extends these combinatorial PSO algorithms to efficiently sample the search space in problems with multiple global optima. Shared-space CCPSO was evaluated on an abductive problem-solving task: the construction of a parsimonious set of independent hypotheses in diagnostic problems with direct causal links between disorders and manifestations. 
Multi-space CCPSO was used to solve a protein structure prediction subproblem, sidechain packing. Both models are evaluated against the provable optimal solutions and results show that both proposed PSO algorithms are able to find optimal or near-optimal solutions. The exploratory ability of multimodal CCPSO is assessed by evaluating both the quality and diversity of the solutions obtained in a protein sequence design problem, a highly multimodal problem. These results provide evidence that extended PSO algorithms are capable of dealing with combinatorial problems without having to hybridize the PSO with other local search techniques or sacrifice the concept of particles moving throughout a continuous search space.
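
The position-and-velocity representation described above is easiest to see in the canonical continuous update. The sketch below is the standard inertia-weight PSO step, not any of the cooperative CCPSO algorithms from the dissertation; the parameter values `w`, `c1`, and `c2` are common defaults chosen here as assumptions.

```python
import random


def pso_step(positions, velocities, pbests, gbest, rng, w=0.7, c1=1.5, c2=1.5):
    """Apply one in-place velocity and position update to every particle.

    w  : inertia weight
    c1 : pull toward each particle's personal best
    c2 : pull toward the swarm's global best
    """
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()
            velocities[i][d] = (
                w * velocities[i][d]
                + c1 * r1 * (pbests[i][d] - positions[i][d])
                + c2 * r2 * (gbest[d] - positions[i][d])
            )
            positions[i][d] += velocities[i][d]


rng = random.Random(7)
positions = [[0.0, 0.0], [3.0, -1.0]]
velocities = [[0.0, 0.0], [0.0, 0.0]]
pbests = [[0.0, 0.0], [3.0, -1.0]]
gbest = [0.0, 0.0]
pso_step(positions, velocities, pbests, gbest, rng)
```

Every quantity here is a real-valued vector, which illustrates why the plain formulation leans toward continuous problems and why combinatorial variants need an additional encoding or, as in this dissertation, a cooperative construction.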

    Derating NichePSO

    The search for multiple solutions is applicable to many fields (Engineering [54][67], Science [75][80][79][84][86], Economics [13][59], and others [51]). Multiple solutions allow for human judgement to select the best solution from a group of solutions that best match the search criteria. Finding multiple solutions to an optimisation problem has proven difficult. Evolutionary computation (EC) and more recently Particle Swarm Optimisation (PSO) algorithms have been used in this field to locate and maintain multiple solutions with fair success. This thesis develops and empirically analyses a new method to find multiple solutions within a convoluted search space. The method is a hybrid of the NichePSO [14] and the sequential niche technique (SNT) [8]. The original SNT was developed using a Genetic Algorithm (GA). It included restrictions such as knowing or approximating the number of solutions that exist. A further pitfall of the SNT is that it introduces false optima after modifying the search space, thereby reducing the accuracy of the solutions. However, this can be resolved with a local search in the unmodified search space. Other sequential niching algorithms require that the search be repeated sequentially until all solutions are found without considering what was learned in previous iterations, resulting in a blind and wasteful search. The NichePSO has been shown to be more accurate than GA-based algorithms [14][15]. It does not require knowledge of the number of solutions in the search space prior to the search process. However, the NichePSO does not scale well for problems with many optima [16]. The method developed in this thesis, referred to as the derating NichePSO, combines SNT with the NichePSO. The main objective of the derating NichePSO is to eliminate the inaccuracy of SNT and to improve the scalability of the NichePSO. 
The derating NichePSO is compared to the NichePSO, deterministic crowding [23] and the original SNT using various multimodal functions. The performance of the derating NichePSO is analysed and it is shown that the derating NichePSO is more accurate than SNT and more scalable than the NichePSO. Dissertation (MSc), University of Pretoria, 2007. Computer Science.
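
The "modifying the search space" step of the sequential niche technique is usually described as multiplying the raw fitness by a derating function around each optimum already found. The sketch below uses the commonly cited power-law form and assumes a maximisation problem with non-negative fitness; the niche `radius`, the exponent `alpha`, and the function names are illustrative assumptions, and the thesis's own hybrid may differ.

```python
import math


def derating(x, found, radius, alpha=2.0):
    """Power-law derating: 0 at a found optimum, rising to 1 at the niche radius."""
    d = math.dist(x, found)
    if d < radius:
        return (d / radius) ** alpha
    return 1.0


def derated_fitness(raw_fitness, x, found_optima, radius, alpha=2.0):
    """Suppress the raw fitness near every optimum located in earlier runs."""
    value = raw_fitness(x)
    for s in found_optima:
        value *= derating(x, s, radius, alpha)
    return value


def flat(x):
    # Hypothetical raw fitness used only to make the derating effect visible.
    return 10.0


at_optimum = derated_fitness(flat, (0.0,), [(0.0,)], radius=1.0)
inside_niche = derated_fitness(flat, (0.5,), [(0.0,)], radius=1.0)
outside_niche = derated_fitness(flat, (2.0,), [(0.0,)], radius=1.0)
```

The reshaped landscape near the niche boundary is where false optima can appear, which is why the abstract suggests a final local search on the unmodified function.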

    DEFEG: deep ensemble with weighted feature generation.

    With the significant breakthrough of Deep Neural Networks in recent years, multi-layer architecture has influenced other sub-fields of machine learning including ensemble learning. In 2017, Zhou and Feng introduced a deep random forest called gcForest that involves several layers of Random Forest-based classifiers. Although gcForest has outperformed several benchmark algorithms on specific datasets in terms of classification accuracy and model complexity, its input features do not ensure better performance when going deeper through the layer-by-layer architecture. We address this limitation by introducing a deep ensemble model with a novel feature generation module. Unlike gcForest where the original features are concatenated to the outputs of classifiers to generate the input features for the subsequent layer, we integrate weights on the classifiers’ outputs as augmented features to grow the deep model. The usage of weights in the feature generation process can adjust the input data of each layer, leading to better results for the deep model. We encode the weights using variable-length encoding and develop a variable-length Particle Swarm Optimisation method to search for the optimal values of the weights by maximizing the classification accuracy on the validation data. Experiments on a number of UCI datasets confirm the benefit of the proposed method compared to some well-known benchmark algorithms.

    Adaptive techniques for enhancing the robustness and performance of speciated PSOs in multimodal environments

    This thesis proposes several new techniques to improve the performance of speciated particle swarms in multimodal environments. We investigate how these algorithms can become more robust and adaptive, easier to use and able to solve a wider variety of optimisation problems. We then develop a technique that uses regression to vastly improve an algorithm's convergence speed without requiring extra evaluations. Speciation techniques play an important role in particle swarms. They allow an algorithm to locate multiple optima, providing the user with a choice of solutions. Speciation also provides diversity preservation, which can be critical for dynamic optimisation. By increasing diversity and tracking multiple peaks simultaneously, speciated algorithms are better able to handle the changes inherent in dynamic environments. Speciation algorithms often require a user to specify a parameter that controls how species form. This is a major drawback since the knowledge may not be available a priori. If the parameter is incorrectly set, the algorithm's performance is likely to be highly degraded. We propose using a time-based measure to control the speciation, allowing the algorithm to define species far more adaptively, using the population's characteristics and behaviour to control membership. Two new techniques presented in this thesis, ANPSO and ESPSO, use time-based convergence measures to define species. These methods are shown to be robust while still providing highly competitive performance. Both algorithms effectively optimised all of our test functions without requiring any tuning. Speciated algorithms are ideally suited to optimising dynamic environments; however, the complexity of these environments makes designing algorithms for them far more difficult. To increase an algorithm's performance it is necessary to determine in what ways it should be improved. 
While all performance metrics allow optimisation techniques to be compared, they cannot show how to improve an algorithm. Until now this has been done largely by trial and error. This is extremely inefficient, in the same way it is inefficient trying to improve a program's speed without profiling it first. This thesis proposes a new metric that exclusively measures convergence speed. We show that an algorithm can be profiled by correlating the performance as measured by multiple metrics. By combining these two techniques, we can obtain far better insight into how best to improve an algorithm. Using this information, we then propose a local convergence enhancement that greatly increases performance by actively estimating the location of an optimum. The enhancement uses regression to fit a surface to the peak, guiding the search by estimating the peak's true location. By incorporating this technique, the algorithm is able to use the information contained within the fitness landscape far more effectively. We show that by combining the regression with an existing speciated algorithm, we are able to vastly improve the algorithm's performance. This technique will greatly enhance the utility of PSO on problems where fitness evaluations are expensive, or that require fast reaction to change.
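
Estimating a peak's location from points that have already been evaluated can be illustrated in one dimension. The sketch below is a deliberately simplified stand-in: it fits an exact parabola through three samples rather than performing the regression-based surface fit the thesis describes, and the function name `parabola_vertex` and the test function are assumptions.

```python
def parabola_vertex(p1, p2, p3):
    """Return the x-coordinate of the vertex of the parabola through three points.

    Each argument is an (x, fitness) pair taken from evaluations the
    swarm has already made, so no extra fitness evaluations are needed.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if den == 0:
        raise ValueError("points are collinear; no unique vertex")
    return x2 - 0.5 * num / den


def peak(x):
    # Hypothetical fitness with a single peak at x = 2.
    return 5.0 - (x - 2.0) ** 2


samples = [(x, peak(x)) for x in (0.0, 1.0, 4.0)]
estimate = parabola_vertex(*samples)
```

On a truly quadratic peak the estimate is exact; on real landscapes it is only a guess to steer the next move, and a least-squares fit over more than three points would be less sensitive to noise.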

    Evolving Ensemble Models for Image Segmentation Using Enhanced Particle Swarm Optimization

    In this paper, we propose particle swarm optimization (PSO)-enhanced ensemble deep neural networks and hybrid clustering models for skin lesion segmentation. A PSO variant is proposed, which embeds diverse search actions including simulated annealing, Lévy flight, helix behavior, modified PSO, and differential evolution operations with spiral search coefficients. These search actions work in a cascade manner to not only equip each individual with different search operations throughout the search process but also assign distinctive search actions to different particles simultaneously in every single iteration. The proposed PSO variant is used to optimize the learning hyper-parameters of convolutional neural networks (CNNs) and the cluster centroids of classical Fuzzy C-Means clustering respectively to overcome performance barriers. Ensemble deep networks and hybrid clustering models are subsequently constructed based on the optimized CNN and hybrid clustering segmenters for lesion segmentation. We evaluate the proposed ensemble models using three skin lesion databases, i.e., PH2, ISIC 2017, and Dermofit Image Library, and a blood cancer data set, i.e., ALL-IDB2. The empirical results indicate that our models outperform other hybrid ensemble clustering models combined with advanced PSO variants, as well as state-of-the-art deep networks in the literature for diverse challenging image segmentation tasks.

    Multi objective particle swarm optimization: algorithms and applications

    Ph.D. (Doctor of Philosophy)

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning, among others. This is partly because the classic exact methods are constrained by prior assumptions, and partly due to the heuristics being problem-dependent and lacking generalization. AMC, by contrast, guides the course of low-level heuristics to search beyond local optimality, which limits the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.