604 research outputs found

    An Oversampling Mechanism for Multimajority Datasets using SMOTE and Darwinian Particle Swarm Optimisation

    Get PDF
    Data skewness continues to be one of the leading factors which adversely impacts the machine learning algorithms performance. An approach to reduce this negative effect of the data variance is to pre-process the former dataset with data level resampling strategies. Resampling strategies have been seen in two forms, oversampling and undersampling. An oversampling strategy is proposed in this article for tackling multiclass imbalanced datasets. This proposed approach optimises the state-of-the-art oversampling technique SMOTE with the Darwinian Particle Swarm Optimization technique. This proposed method DOSMOTE generates synthetic optimised samples for balancing the datasets. This strategy will be more effective on multimajority datasets.  An experimental study is performed on peculiar multimajority datasets to measure the effectiveness of the proposed approach. As a result, the proposed method produces promising results when compared to the conventional oversampling strategies

    A Novel Evolutionary Swarm Fuzzy Clustering Approach for Hyperspectral Imagery

    Get PDF
    In land cover assessment, classes often gradually change from one to another. Therefore, it is difficult to allocate sharp boundaries between different classes of interest. To overcome this issue and model such conditions, fuzzy techniques that resemble human reasoning have been proposed as alternatives. Fuzzy C-means is the most common fuzzy clustering technique, but its concept is based on a local search mechanism and its convergence rate is rather slow, especially considering high-dimensional problems (e.g., in processing of hyperspectral images). Here, in order to address those shortcomings of hard approaches, a new approach is proposed, i.e., fuzzy C-means which is optimized by fractional order Darwinian particle swarm optimization. In addition, to speed up the clustering process, the histogram of image intensities is used during the clustering process instead of the raw image data. Furthermore, the proposed clustering approach is combined with support vector machine classification to accurately classify hyperspectral images. The new classification framework is applied on two well-known hyperspectral data sets; Indian Pines and Salinas. Experimental results confirm that the proposed swarm-based clustering approach can group hyperspectral images accurately in a time-efficient manner compared to other existing clustering techniques.PostPrin

    Crafting Adversarial Examples using Particle Swarm Optimization

    Get PDF
    Machine learning models have been found to be vulnerable to adversarial attacks that apply small perturbations to input samples to get them misclassified. Attacks that search for and apply the perturbations are performed in both white-box and black-box settings, depending on the information available to the attacker about the target. For black-box attacks, the attacker can only query the target with specially crafted inputs and observing the outputs returned by the model. These outputs are used to guide the perturbations and create adversarial examples that are then misclassified. Current black-box attacks on API-based malware classifiers rely solely on feature insertion when applying perturbations. This restriction is set in place to ensure that no changes are introduced to the malware\u27s originally intended functionality. Additionally, the API calls being inserted in the malware are null or no-op APIs that have no functional affect to avoid any unintentional impact on malware behavior. Due to the nature of these API calls, they can be easily detected through non-ML techniques by analyzing their arguments and return values. In this dissertation, we explore other attacks on API-based malware detection models that are not restricted to feature addition. Specifically, we explore feature replacement as a possible avenue for creating adversarial malware examples. To retain the malware\u27s original functionality, we replace API calls with other functionally equivalent API calls. We find the API alternatives by using a hierarchical unsupervised learning approach on the API\u27s documentation. Our attack, which we call AdversarialPSO, uses Particle Swarm Optimization to guide the perturbations according to available function alternatives. Results show that creating adversarial malware examples by feature replacement is possible even under the more restrictive search space of limited function alternatives. Unlike the malware domain, which lacks benchmark datasets and publicly available classification models, image classification has multiple benchmarks to test new attacks. Therefore, to evaluate the efficacy and wide-applicability of AdversarialPSO, we re-implement the attack in the image classification domain, where we create adversarial examples from images by adding small often unrecognizable perturbations to the inputs. As a result of these perturbations, highly-accurate models misclassify the inputs resulting in a drastic drop in their accuracy. We evaluate this attack against both defended and undefended models and show that AdversarialPSO performs comparably to state-of-the-art adversarial attacks

    Dynamic optimization of classification systems for adaptive incremental learning.

    Get PDF
    Tese de Doutorado, defendida na Université Du Québec, Canadian. 2010An incremental learning system updates itself in response to incoming data without reexamining all the old data. Since classification systems capable of incrementally storing, filtering, and classifying data are economical, in terms of both space and time, which makes them immensely useful for industrial, military, and commercial purposes, interest in designing them is growing. However, the challenge with incremental learning is that classification tasks can no longer be seen as unvarying, since they can actually change with the evolution of the data. These changes in turn cause dynamic changes to occur in the classification system’s parameters If such variations are neglected, the overall performance of these systems will be compromised in the future. In this thesis, on the development of a system capable of incrementally accommodating new data and dynamically tracking new optimum system parameters for self-adaptation, we first address the optimum selection of classifiers over time. We propose a framework which combines the power of Swarm Intelligence Theory and the conventional grid-search method to progressively identify potential solutions for gradually updating training datasets. The key here is to consider the adjustment of classifier parameters as a dynamic optimization problem that depends on the data available. Specifically, it has been shown that, if the intention is to build efficient Support Vector Machine (SVM) classifiers from sources that provide data gradually and serially, then the best way to do this is to consider model selection as a dynamic process which can evolve and change over time. This means that a number of solutions are required, depending on the knowledge available about the problem and uncertainties in the data. We also investigate measures for evaluating and selecting classifier ensembles composed of SVM classifiers. The measures employed are based on two different theories (diversity and margin) commonly used to understand the success of ensembles. This study has given us valuable insights and helped us to establish confidence-based measures as a tool for the selection of classifier ensembles. The main contribution of this thesis is a dynamic optimization approach that performs incremental learning in an adaptive fashion by tracking, evolving, and combining optimum hypotheses over time. The approach incorporates various theories, such as dynamic Particle Swarm Optimization, incremental Support Vector Machine classifiers, change detection, and dynamic ensemble selection based on classifier confidence levels. Experiments carried out on synthetic and real-world databases demonstrate that the proposed approach outperforms the classification methods often used in incremental learning scenarios

    Multidimensional Particle Swarm Optimization for Machine Learning

    Get PDF
    Particle Swarm Optimization (PSO) is a stochastic nature-inspired optimization method. It has been successfully used in several application domains since it was introduced in 1995. It has been especially successful when applied to complicated multimodal problems, where simpler optimization methods, e.g., gradient descent, are not able to find satisfactory results. Multidimensional Particle Swarm Optimization (MD-PSO) and Fractional Global Best Formation (FGBF) are extensions of the basic PSO. MD-PSO allows searching for an optimum also when the solution dimensionality is unknown. With a dedicated dimensional PSO process, MD-PSO can search for optimal solution dimensionality. An interleaved positional PSO process simultaneously searches for the optimal solution in that dimensionality. Both the basic PSO and its multidimensional extension MD-PSO are susceptible to premature convergence. FGBF is a plug-in to (MD-)PSO that can help avoid premature convergence and find desired solutions faster. This thesis focuses on applications of MD-PSO and FGBF in different machine learning tasks.Multiswarm versions of MD-PSO and FGBF are introduced to perform dynamic optimization tasks. In dynamic optimization, the search space slowly changes. The locations of optima move and a former local optimum may transform into a global optimum and vice versa. We exploit multiple swarms to track different optima.In order to apply MD-PSO for clustering tasks, two key questions need to be answered: 1) How to encode the particles to represent different data partitions? 2) How to evaluate the fitness of the particles to evaluate the quality of the solutions proposed by the particle positions? The second question is considered especially carefully in this thesis. An extensive comparison of Clustering Validity Indices (CVIs) commonly used as fitness functions in Particle Swarm Clustering (PSC) is conducted. Furthermore, a novel approach to carry out fitness evaluation, namely Fitness Evaluation with Computational Centroids (FECC) is introduced. FECC gives the same fitness to any particle positions that lead to the same data partition. Therefore, it may save some computational efforts and, above all, it can significantly improve the results obtained by using any of the best performing CVIs as the PSC fitness function.MD-PSO can also be used to evolve different neural networks. The results of training Multilayer Perceptrons (MLPs) using the common Backpropagation (BP) algorithm and a global technique based on PSO are compared. The pros and cons of BP and (MD-)PSO in MLP training are discussed. For training Radial Basis Function Neural Networks (RBFNNs), a novel technique based on class-specific clustering of the training samples is introduced. The proposed approach is compared to the common input and input-output clustering approaches and the benefits of using the class-specific approach are experimentally demonstrated. With the class-specific approach, the training complexity is reduced, while the classification performance of the trained RBFNNs may be improved.Collective Network of Binary Classifiers (CNBC) is an evolutionary semantic classifier consisting of several Networks of Binary Classifiers (NBCs) trained to recognize a certain semantic class. NBCs in turn consist of several Binary Classifiers (BCs), which are trained for a certain feature type. Thanks to its topology and the use of MD-PSO as its evolution technique, incremental training can be easily applied to add new training items, classes, and/or features.In feature synthesis, the objective is to exploit ground truth information to transform the original low-level features into more discriminative ones. To learn an efficient synthesis for a dataset, only a fraction of the data needs to be labeled. The learned synthesis can then be applied on unlabeled data to improve classification or retrieval results. In this thesis, two different feature synthesis techniques are introduced. In the first one, MD-PSO is directly used to find proper arithmetic operations to be applied on the elements of the original low-level feature vectors. In the second approach, feature synthesis is carried out using one-against-all perceptrons. In the latter technique, the best results were obtained when MD-PSO was used to train the perceptrons.In all the mentioned applications excluding MLP training, MD-PSO is used together with FGBF. Overall, MD-PSO and FGBF are indeed versatile tools in machine learning. However, computational limitations constrain their use in currently emerging machine learning systems operating on Big Data. Therefore, in the future, it is necessary to divide complex tasks into smaller subproblems and to conquer the large problems via solving the subproblems where the use of MD-PSO and FGBF becomes feasible. Several applications discussed in this thesis already exploit the divide-and-conquer operation model

    Amplifying the Prediction of Team Performance through Swarm Intelligence and Machine Learning

    Get PDF
    Modern companies are increasingly relying on groups of individuals to reach organizational goals and objectives, however many organizations struggle to cultivate optimal teams that can maximize performance. Fortunately, existing research has established that group personality composition (GPC), across five dimensions of personality, is a promising indicator of team effectiveness. Additionally, recent advances in technology have enabled groups of humans to form real-time, closed-loop systems that are modeled after natural swarms, like flocks of birds and colonies of bees. These Artificial Swarm Intelligences (ASI) have been shown to amplify performance in a wide range of tasks, from forecasting financial markets to prioritizing conflicting objectives. The present research examines the effects of group personality composition on team performance and investigates the impact of measuring GPC through ASI systems. 541 participants, across 111 groups, were administered a set of well-accepted and vetted psychometric assessments to capture the personality configurations and social sensitivities of teams. While group-level personality averages explained 10% of the variance in team performance, when group personality composition was measured through human swarms, it was able to explain 29% of the variance, representing a 19% amplification in predictive capacity. Finally, a series of machine learning models were applied and trained to predict group effectiveness. Multivariate Linear Regression and Logistic Regression achieved the highest performance exhibiting 0.19 mean squared error and 81.8% classification accuracy

    Pixel Classification of SAR ice images using ANFIS-PSO Classifier

    Get PDF
    Synthetic Aperture Radar (SAR) is playing a vital role in taking extremely high resolution radar images. It is greatly used to monitor the ice covered ocean regions. Sea monitoring is important for various purposes which includes global climate systems and ship navigation. Classification on the ice infested area gives important features which will be further useful for various monitoring process around the ice regions. Main objective of this paper is to classify the SAR ice image that helps in identifying the regions around the ice infested areas. In this paper three stages are considered in classification of SAR ice images. It starts with preprocessing in which the speckled SAR ice images are denoised using various speckle removal filters; comparison is made on all these filters to find the best filter in speckle removal. Second stage includes segmentation in which different regions are segmented using K-means and watershed segmentation algorithms; comparison is made between these two algorithms to find the best in segmenting SAR ice images. The last stage includes pixel based classification which identifies and classifies the segmented regions using various supervised learning classifiers. The algorithms includes Back propagation neural networks (BPN), Fuzzy Classifier, Adaptive Neuro Fuzzy Inference Classifier (ANFIS) classifier and proposed ANFIS with Particle Swarm Optimization (PSO) classifier; comparison is made on all these classifiers to propose which classifier is best suitable for classifying the SAR ice image. Various evaluation metrics are performed separately at all these three stages

    Self-Selective Correlation Ship Tracking Method for Smart Ocean System

    Full text link
    In recent years, with the development of the marine industry, navigation environment becomes more complicated. Some artificial intelligence technologies, such as computer vision, can recognize, track and count the sailing ships to ensure the maritime security and facilitates the management for Smart Ocean System. Aiming at the scaling problem and boundary effect problem of traditional correlation filtering methods, we propose a self-selective correlation filtering method based on box regression (BRCF). The proposed method mainly include: 1) A self-selective model with negative samples mining method which effectively reduces the boundary effect in strengthening the classification ability of classifier at the same time; 2) A bounding box regression method combined with a key points matching method for the scale prediction, leading to a fast and efficient calculation. The experimental results show that the proposed method can effectively deal with the problem of ship size changes and background interference. The success rates and precisions were higher than Discriminative Scale Space Tracking (DSST) by over 8 percentage points on the marine traffic dataset of our laboratory. In terms of processing speed, the proposed method is higher than DSST by nearly 22 Frames Per Second (FPS)
    corecore