93 research outputs found

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    The term big data characterizes the massive amounts of data generated by advanced technologies in different domains, using the 4Vs (volume, velocity, variety, and veracity) to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, complex correlations may exist between their features, and many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
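
The divide-and-conquer idea behind cooperative co-evolution can be illustrated with a minimal sketch. This is not the method from the article: the function name `cc_feature_selection`, the interleaved grouping, the bit-flip mutation rate, and the toy fitness are all assumptions made for illustration, and the MapReduce parallelization is not shown. Each group of features is optimized in turn while the remaining bits are taken from a shared context vector.

```python
import random

def cc_feature_selection(n_features, fitness, n_groups=3, generations=30,
                         pop_size=8, seed=0):
    """Toy cooperative co-evolution over a binary feature mask (illustrative only)."""
    rng = random.Random(seed)
    # Decompose: split the feature indices into interleaved groups (sub-problems).
    groups = [list(range(g, n_features, n_groups)) for g in range(n_groups)]
    # Context vector: the current best full mask, shared by all groups.
    context = [rng.random() < 0.5 for _ in range(n_features)]
    best = fitness(context)
    for _ in range(generations):
        for group in groups:
            # Evolve only this group's bits; all other bits come from the context.
            for _ in range(pop_size):
                candidate = list(context)
                for i in group:
                    if rng.random() < 0.3:  # bit-flip mutation (assumed rate)
                        candidate[i] = not candidate[i]
                score = fitness(candidate)
                if score > best:
                    context, best = candidate, score
    return context, best

# Hypothetical fitness: reward three "relevant" features, penalize subset size.
RELEVANT = {0, 3, 7}

def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)
```

In a real setting the fitness would be a classifier's validation accuracy on the selected features, and the per-group evaluations are the part that could be distributed across workers.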

    Stochastic local search: a state-of-the-art review

    The main objective of this paper is to provide a state-of-the-art review of stochastic local search techniques used for solving hard combinatorial problems, and to analyze and discuss them. It begins with a short introduction, motivation, and some basic notation on combinatorial problems, search paradigms, and other relevant features of search techniques as needed for background. A brief overview of stochastic local search methods follows, along with an analysis of state-of-the-art stochastic local search algorithms. Finally, the last part of the paper presents and discusses some of the latest trends in the application of stochastic local search algorithms in machine learning, data mining, and other areas of science and engineering. We conclude with a discussion of the capabilities and limitations of stochastic local search algorithms.
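
As a rough illustration of what a stochastic local search looks like, the sketch below applies randomized iterative improvement to a small MAX-SAT instance. It is not taken from the paper: the clause encoding (a positive or negative integer per literal), the `noise` parameter, and the function names are assumptions chosen for brevity.

```python
import random

def unsatisfied(clauses, assign):
    """Count clauses with no true literal; literal k means variable |k| equals (k > 0)."""
    return sum(not any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def stochastic_local_search(clauses, n_vars, max_steps=1000, noise=0.2, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    best, best_cost = dict(assign), unsatisfied(clauses, assign)

    def cost_after_flip(v):
        # Evaluate a flip without committing to it.
        assign[v] = not assign[v]
        cost = unsatisfied(clauses, assign)
        assign[v] = not assign[v]
        return cost

    for _ in range(max_steps):
        if best_cost == 0:
            break
        if rng.random() < noise:
            var = rng.randint(1, n_vars)  # random-walk step escapes local optima
        else:
            # Greedy step: flip the variable that leaves the fewest unsatisfied clauses.
            var = min(range(1, n_vars + 1), key=cost_after_flip)
        assign[var] = not assign[var]
        cost = unsatisfied(clauses, assign)
        if cost < best_cost:
            best, best_cost = dict(assign), cost
    return best, best_cost
```

The mix of greedy and random moves is the essential design choice: the greedy step drives improvement, while the random step keeps the search from stalling in a local optimum.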

    Metaheuristic Design Patterns: New Perspectives for Larger-Scale Search Architectures

    Design patterns capture the essentials of recurring best practice in an abstract form. Their merits are well established in domains as diverse as architecture and software development. They offer significant benefits, not least a common conceptual vocabulary for designers, enabling greater communication of high-level concerns and increased software reuse. Inspired by the success of software design patterns, this chapter seeks to promote the merits of a pattern-based approach to the development of metaheuristic search software components. To achieve this, a catalog of patterns is presented, organized into the families of structural, behavioral, methodological, and component-based patterns. As an alternative to the increasing specialization associated with individual metaheuristic search components, the authors encourage computer scientists to embrace the 'cross-cutting' benefits of a pattern-based perspective on optimization algorithms. Some ways in which the patterns might form the basis of further, larger-scale automation of metaheuristic component design are also discussed.

    Business intelligence framework using ant colony optimization for feature selection in higher education institution

    Recently, business intelligence (BI) has become an important tool for effective decision-making. BI is a mathematical framework for gaining information and knowledge through the process of extracting, transforming, managing, and analyzing data. The demand for accurate knowledge in the higher education sector calls for a sound technique to extract the exact information needed for decision-making. However, current BI frameworks and systems lack the ability to transform data into information, which leaves users unable to fully utilize the BI outcome. This research developed a BI framework for higher education that is able to explore, analyse, and visualize the relevant data as information for use by top management. The framework identifies the best set of attributes and evaluates the performance of the model using 27 input features. In this case study, the framework used the Ant Colony Optimization (ACO) technique mainly to identify the best set of attributes, and the performance was validated using a Support Vector Machine (SVM). The framework consists of four layers: data source, data integration, analytic, and access. Each layer contributes to decision-making through data processing, selection of significant features, and data visualization. In this study, 46,658 input data records were processed to identify the Graduate on Time (GOT) decision, which in the context of higher education refers to Master's and Doctor of Philosophy (PhD) postgraduates who completed their studies within a specified period. The performance evaluation achieved accuracies of 86.44% for PhD and 96.2% for Master's. The results showed that the BI dashboard produced by the framework is capable of serving as a good decision-making tool for education management.
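
A hedged sketch of how ant colony optimization can drive feature selection is given below. It does not reproduce the framework in the abstract: the function name, the fixed `subset_size`, the pheromone update rule, and the toy `evaluate` function are assumptions. In practice `evaluate` would return a classifier's validation accuracy (for example from an SVM) and is assumed here to return a value in [0, 1].

```python
import random

def aco_feature_selection(n_features, evaluate, n_ants=10, n_iter=20,
                          subset_size=3, evaporation=0.1, seed=0):
    """Toy ACO: ants sample feature subsets with probability proportional to pheromone."""
    rng = random.Random(seed)
    pheromone = [1.0] * n_features
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            # Each ant builds a subset without replacement, weighted by pheromone.
            remaining = list(range(n_features))
            subset = []
            for _ in range(subset_size):
                weights = [pheromone[f] for f in remaining]
                pick = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(pick)
                remaining.remove(pick)
            score = evaluate(subset)
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate everywhere, then reinforce the best subset found so far.
        pheromone = [(1 - evaporation) * p for p in pheromone]
        for f in best_subset:
            pheromone[f] += max(best_score, 0.0)
    return best_subset, best_score

# Hypothetical evaluator: fraction of three "informative" features recovered.
INFORMATIVE = {1, 4, 6}

def toy_evaluate(subset):
    return len(set(subset) & INFORMATIVE) / len(INFORMATIVE)
```

Features that appear in high-scoring subsets accumulate pheromone and are sampled more often in later iterations, which is the mechanism that concentrates the search.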

    Computational Optimizations for Machine Learning

    The present book contains the 10 articles accepted for publication in the Special Issue “Computational Optimizations for Machine Learning” of the MDPI journal Mathematics, which cover a wide range of topics connected to the theory and applications of machine learning, neural networks, and artificial intelligence. These topics include, among others, various types of machine learning classes, such as supervised, unsupervised, and reinforcement learning, deep neural networks, convolutional neural networks, GANs, decision trees, linear regression, SVM, K-means clustering, Q-learning, temporal difference, deep adversarial networks, and more. It is hoped that the book will be interesting and useful to those developing mathematical algorithms and applications in the domain of artificial intelligence and machine learning, as well as to those who have the appropriate mathematical background and wish to become familiar with recent advances in the computational optimization mathematics of machine learning, which has nowadays permeated almost all sectors of human life and activity.

    Water filtration by using apple and banana peels as activated carbon

    A water filter is an important device for reducing the contaminants in raw water. Activated carbon from charcoal is used to absorb the contaminants. Fruit peels are a suitable alternative source of carbon to substitute for charcoal. The main goal of this study is to determine the role of apple and banana peel powder as activated carbon in a water filter. The peels were dried and blended into powder so that they could absorb the contaminants. The raw water was compared before and after filtering. After filtering, the pH reading was 6.8, which is within the normal range, and the recorded turbidity was 658 NTU. As for colour, the water became clearer than the raw water. This study found that fruit peels such as banana and apple are an effective substitute for charcoal as a natural absorbent.

    Feature Selection for Document Classification : Case Study of Meta-heuristic Intelligence and Traditional Approaches

    Doctor of Philosophy (Computer Engineering), 2020. Nowadays, the way people access news around the world has shifted from paper to electronic formats, and the rate at which newspapers and magazines publish on websites has increased dramatically. Meanwhile, text feature selection for automatic document classification (ADC) is becoming a big challenge because of the unstructured nature of text features, which is called the “multi-dimension feature problem”. Various powerful schemes for text feature selection are being developed continuously, but there still exists a research gap in the “optimization of feature selection problem (OFSP)”, which seeks the globally optimal features. Meanwhile, the capacity of meta-heuristic intelligence in the knowledge discovery process (KDP) has also come to play a critical role in overcoming the NP-hard OFSP by providing effective performance and efficient computation time. Therefore, a meta-heuristic based approach to the optimization of feature selection is proposed in this research to search for the globally optimal features for ADC. In this thesis, a case study of meta-heuristic intelligence and traditional approaches to the feature selection optimization process in document classification is presented. It includes eleven meta-heuristic algorithms, namely Ant Colony search, Artificial Bee Colony search, Bat search, Cuckoo search, Evolutionary search, Elephant search, Firefly search, Flower search, Genetic search, Rhinoceros search, and Wolf search, for finding the optimal feature subset for document classification. The results of the proposed model are then compared with three traditional search algorithms: Best First search (BFS), Greedy Stepwise (GS), and Ranker search (RS). In addition, a data mining framework is applied. It involves data preprocessing, feature engineering, building the learning model, and evaluating the performance of the proposed meta-heuristic intelligence-based feature selection using various performance and computational complexity evaluation schemes. In data preprocessing, tokenization, stop-word handling, stemming and lemmatizing, and normalization are applied. In the feature engineering process, n-gram TF-IDF feature extraction is used to build the feature vector, and both filter and wrapper approaches are applied to observe different cases. In addition, three different classifiers, J48, Naïve Bayes, and Support Vector Machine, are used for building the document classification model. According to the results, the proposed system can dramatically reduce the number of selected features that can deteriorate learning model performance. In addition, the selected global feature subset can yield better performance than traditional search under the single objective function of the proposed model.
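
The n-gram TF-IDF step mentioned in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: whitespace tokenization, unigrams plus bigrams, term frequency normalized by document length, and an unsmoothed idf of log(N / df). The thesis may use a different weighting or toolkit, and the function names are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n_max=2):
    """All contiguous n-grams for n = 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document."""
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(t for terms in term_lists for t in set(terms))
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        tf = Counter(terms)
        vectors.append({t: (c / len(terms)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return vectors
```

With this unsmoothed idf, a term that occurs in every document receives weight zero, so only terms that discriminate between documents contribute to the feature vector passed on to feature selection.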

    Improvements on the bees algorithm for continuous optimisation problems

    This work focuses on improvements to the Bees Algorithm that enhance the algorithm's performance, especially in terms of convergence rate. For the first enhancement, a pseudo-gradient Bees Algorithm (PG-BA) compares the fitness as well as the position of previous and current bees so that the best bees in each patch are guided towards a better search direction after each consecutive cycle. Unlike the typical gradient search method, this eliminates the need to differentiate the objective function. The improved algorithm is subjected to several numerical benchmark test functions as well as the training of a neural network. The results of the experiments are then compared to the standard variant of the Bees Algorithm and other swarm intelligence procedures. The data analysis generally confirmed that the PG-BA is effective at speeding up convergence to the optimum. Next, an approach to avoid the formation of overlapping patches is proposed. The Patch Overlap Avoidance Bees Algorithm (POA-BA) is designed to avoid redundancy in the search area, especially if the site is deemed unprofitable. This method is quite similar to Tabu Search (TS) in that the POA-BA forbids the exact exploitation of previously visited solutions along with their corresponding neighbourhood. Patches are not allowed to intersect, not just in the next generation but also in the current cycle. This reduces the number of patches that materialise in the same peak (maximisation) or valley (minimisation), which ensures a thorough search of the problem landscape as bees are distributed around the scaled-down area. The same benchmark problems used for PG-BA were applied to this modified strategy with reasonable success. Finally, the Bees Algorithm is revised to be capable of locating all of the global optima as well as the substantial local peaks in a single run. These multiple solutions of comparable fitness offer alternatives for decision makers to choose from. In this so-called Extended Bees Algorithm (EBA), patches are formed only if the bees are the fittest from different peaks, as determined by a hill-valley mechanism. This permits the maintenance of diversified solutions throughout the search process, in addition to minimising the chances of getting trapped. This version proved beneficial when tested on numerous multimodal optimisation problems.
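
For orientation, the sketch below shows a simplified basic Bees Algorithm for minimisation, not the PG-BA, POA-BA, or EBA variants described in the abstract. The parameter names, the single tier of best sites (no separate elite sites), and the geometric neighbourhood shrinking are assumptions made to keep the example short.

```python
import random

def bees_algorithm(objective, dim, bounds, n_scouts=20, n_best=5,
                   n_recruits=10, patch=0.5, shrink=0.9, n_iter=50, seed=0):
    """Simplified basic Bees Algorithm (minimisation), for illustration only."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_point():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def neighbour(x, radius):
        # Sample inside the patch around x, clipped to the search bounds.
        return [min(hi, max(lo, xi + rng.uniform(-radius, radius))) for xi in x]

    scouts = sorted((random_point() for _ in range(n_scouts)), key=objective)
    for _ in range(n_iter):
        new_sites = []
        for site in scouts[:n_best]:
            # Local search: recruited bees forage inside the patch around a good site.
            foragers = [neighbour(site, patch) for _ in range(n_recruits)]
            new_sites.append(min(foragers + [site], key=objective))
        # Global search: the remaining bees scout random positions.
        new_sites += [random_point() for _ in range(n_scouts - n_best)]
        scouts = sorted(new_sites, key=objective)
        patch *= shrink  # neighbourhood shrinking
    return scouts[0], objective(scouts[0])

def sphere(x):
    """Standard benchmark: minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Because each best site is kept unless a forager improves on it, the best objective value never gets worse from one cycle to the next; the variants in the abstract change how foragers are directed and where patches may form.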