
    Trust-region based adaptive radial basis function algorithm for global optimization of expensive constrained black-box problems

    It has been a very challenging task to develop efficient and robust techniques to solve real-world engineering optimization problems due to unknown function properties, complex constraints and severely limited computational budgets. To address this issue, this paper proposes the TARBF algorithm (trust-region-based adaptive radial basis function interpolation) for solving expensive constrained black-box optimization problems. The approach decomposes the original optimization problem into a sequence of sub-problems approximated by radial basis functions in a series of trust regions. The solution of each sub-problem then becomes the starting point for the next iteration. Based on the values of the objective and constraint functions, an effective online normalization technique is further developed to adaptively improve the model accuracy in the trust region, where the surrogate is updated iteratively. On average, TARBF robustly solves the 21 G-problems (CEC'2006) and 4 engineering problems within 535.69 and 234.44 function evaluations, respectively. Comparison results with other state-of-the-art metamodel-based algorithms show that TARBF is a convergent, efficient and accurate paradigm. Moreover, the sophisticated trust-region strategy developed in TARBF, a major contribution to the field of efficient constrained optimization, facilitates an effective balance of exploration and exploitation for solving constrained black-box optimization problems.
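
    To make the surrogate loop concrete, here is a minimal Python sketch of a single trust-region step for the unconstrained case. The use of SciPy's RBFInterpolator, the 0.25/0.75 acceptance thresholds and the grow/shrink factors are illustrative assumptions, not the authors' implementation (which also handles constraints and online normalization):

        import numpy as np
        from scipy.interpolate import RBFInterpolator
        from scipy.optimize import minimize

        def trust_region_rbf_step(f, X, y, center, radius, bounds):
            # Fit an RBF surrogate to all expensive samples gathered so far.
            surrogate = RBFInterpolator(X, y)

            # Trust region = box around the current center, clipped to the
            # global bounds; bounds is an array of shape (dim, 2).
            lo = np.maximum(bounds[:, 0], center - radius)
            hi = np.minimum(bounds[:, 1], center + radius)

            # Minimize the cheap surrogate inside the trust region.
            res = minimize(lambda x: surrogate(x[None, :])[0], x0=center,
                           bounds=list(zip(lo, hi)))

            f_new = f(res.x)               # one expensive black-box evaluation
            predicted = y.min() - res.fun  # decrease promised by the surrogate
            actual = y.min() - f_new       # decrease actually achieved
            rho = actual / predicted if predicted > 1e-12 else 0.0

            # Classic trust-region update; thresholds/factors are assumptions.
            if rho > 0.75:
                radius *= 2.0
            elif rho < 0.25:
                radius *= 0.5
            if actual > 0:
                center = res.x
            return res.x, f_new, center, radius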

    Bat Q-learning Algorithm

    The cooperative Q-learning approach allows multiple learners to learn independently and then share their Q-values with each other using a Q-value sharing strategy. One problem with this approach is that the learners' solutions may not converge to optimality because the optimal Q-values may not be found. Another problem is that some cooperative algorithms perform very well on single-task problems but quite poorly on multi-task problems. This paper proposes a new cooperative Q-learning algorithm called the Bat Q-learning algorithm (BQ-learning) that implements a Q-value sharing strategy based on the Bat algorithm. The Bat algorithm is a powerful optimization algorithm that increases the possibility of finding the optimal Q-values by balancing the exploration and exploitation of actions through its tunable parameters. The BQ-learning algorithm was tested using two problems: the shortest path problem (a single-task problem) and the taxi problem (a multi-task problem). The experimental results suggest that BQ-learning performs better than single-agent Q-learning and some well-known cooperative Q-learning algorithms.
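
    As a rough illustration of the idea (the paper's exact update rule is not reproduced here), the following Python sketch shows a sharing step in which each learner pulls its Q-table toward that of the best-performing learner, echoing how bats move toward the current best solution in the Bat algorithm. The pull factor and the use of episode returns as a quality score are assumptions:

        import numpy as np

        def share_q_values(q_tables, returns, pull=0.5):
            # Pick the Q-table of the learner with the best episode return,
            # then move every learner's Q-table part-way toward it.
            best = q_tables[int(np.argmax(returns))]
            return [q + pull * (best - q) for q in q_tables]

        # Usage: three learners on a 5-state, 2-action task.
        q_tables = [np.random.rand(5, 2) for _ in range(3)]
        returns = [10.0, 14.0, 12.0]
        q_tables = share_q_values(q_tables, returns)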

    Cooperative reinforcement learning for independent learners

    Research Doctorate - Doctor of Philosophy (PhD)

    Machine learning in multi-agent domains poses several research challenges. One challenge is how to model cooperation between reinforcement learners. Cooperation between independent reinforcement learners is known to accelerate convergence to optimal solutions. In large state-space problems, independent reinforcement learners normally cooperate to accelerate the learning process using decomposition techniques or knowledge-sharing strategies. This thesis presents two techniques for multi-agent reinforcement learning and a comparison study. The first technique is a formal decomposition model and an algorithm for distributed systems. The second technique is a cooperative Q-learning algorithm for multi-goal decomposable systems. The comparison study compares the performance of some of the best-known cooperative Q-learning algorithms for independent learners.

    Distributed systems are normally organised into two levels: the system level and the subsystem level. This thesis presents a formal solution for the decomposition of Markov Decision Processes (MDPs) in distributed systems that takes advantage of the organisation of distributed systems and provides support for the migration of learners. This is accomplished by two proposals: a Distributed, Hierarchical Learning Model (DHLM) and an Intelligent Distributed Q-Learning algorithm (IDQL), which are based on three specialisations of agents: workers, tutors and consultants. Worker agents are the actual learners and performers of tasks, while tutor agents and consultant agents are coordinators at the subsystem level and the system level, respectively. A main duty of consultant and tutor agents is the assignment of problem space to worker agents. The experimental results in a distributed hunter-prey problem suggest that IDQL converges to a solution faster than the single-agent Q-learning approach. An important feature of DHLM is that it provides a solution for the migration of agents. This feature supports the IDQL algorithm, where the problem space of each worker agent can change dynamically. Other hierarchical RL models do not cover this issue.

    Problems that have multiple goal states can be decomposed into sub-problems by taking advantage of the loosely coupled bonds among the goal states. In such problems, each goal state and its problem space form a sub-problem. This thesis introduces the Q-learning with Aggregation algorithm (QA-learning), an algorithm for problems with multiple goal states that is based on two roles: learner and tutor. A learner is an agent that learns and uses the knowledge of its neighbours (tutors) to construct its Q-table. A tutor is a learner that is ready to share its Q-table with its neighbours (learners). These roles are based on the concept of learners reusing tutors' sub-solutions. In this algorithm, each learner incorporates its tutors' knowledge into its own Q-table calculations; a comprehensive solution can then be obtained by combining these partial solutions. The experimental results in an instance of the shortest path problem suggest that the output of QA-learning is comparable to that of a single Q-learner whose problem space is the whole system, but QA-learning converges to a solution faster than the single-learner approach.

    Cooperative Q-learning algorithms for independent learners accelerate the learning process of individual learners. In this type of Q-learning, independent learners share and update their Q-values by following a sharing strategy after learning independently for a number of episodes. This thesis presents a comparison study of the performance of several well-known cooperative Q-learning algorithms (BEST-Q, AVE-Q, PSO-Q and WSS), as well as an algorithm that aggregates their results. These algorithms are compared in two cases: the equal-experience case and the different-experiences case. In the first case, the learners have equal learning time, while in the second case, the learners have different learning times. The comparison study also examines the effect of the frequency of Q-value sharing on the learning speed of independent learners. The experimental results in the equal-experience case indicate that sharing Q-values is not beneficial and produces results similar to single-agent Q-learning. The experimental results in the different-experiences case suggest that the cooperative Q-learning algorithms perform similarly to one another, and better than single-agent Q-learning. In both cases, high-frequency sharing of Q-values accelerates convergence to optimal solutions compared to low-frequency sharing. Low-frequency Q-value sharing degrades the performance of the cooperative Q-learning algorithms in both the equal-experience and different-experiences cases.
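
    For context, the following Python sketch shows BEST-Q and AVE-Q sharing steps as they are commonly described in the cooperative Q-learning literature (element-wise best and own/best averaging); the thesis's exact formulations may differ:

        import numpy as np

        def best_q(q_tables):
            # BEST-Q style sharing (one common reading): every learner adopts,
            # for each state-action pair, the maximum Q-value found by any learner.
            best = np.maximum.reduce(q_tables)
            return [best.copy() for _ in q_tables]

        def ave_q(q_tables):
            # AVE-Q style sharing: each learner averages its own Q-table with
            # the element-wise best Q-table across all learners.
            best = np.maximum.reduce(q_tables)
            return [(q + best) / 2.0 for q in q_tables]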

    DISTRIBUTED GREY WOLF OPTIMIZER FOR NUMERICAL OPTIMIZATION PROBLEMS

    The Grey Wolf Optimizer (GWO) algorithm is an interesting swarm-based optimization algorithm for global optimization. It was inspired by the hunting strategy and leadership hierarchy of grey wolves. The GWO algorithm has been successfully tailored to solve various continuous and discrete optimization problems. However, the main drawback of GWO is that it may converge to sub-optimal solutions in the early stages of its simulation process due to the loss of diversity in its population. This paper introduces a distributed variation of GWO (DGWO) that attempts to enhance the diversity of GWO by organizing its population into small independent groups (islands) based on a well-known distributed model called the island model. DGWO applies the original GWO to each island and then allows selected solutions to be exchanged among the islands based on a random ring topology and the best-worst migration policy. The island model in DGWO provides a better environment for unfit candidate solutions in each island to evolve into better solutions, which increases the likelihood of finding globally optimal solutions. Another interesting feature of DGWO is that it can run on parallel devices, which means its computational cost can be reduced compared to that of existing variations of GWO. DGWO was evaluated and compared to well-known swarm-based optimization algorithms using 30 CEC 2014 functions. In addition, the sensitivity of DGWO to its parameters was evaluated using 15 standard test functions. The comparative study and the sensitivity analysis indicate that DGWO provides competitive performance compared to the other tested algorithms. The source code of DGWO is available at: https://www.dropbox.com/s/2d16t46598u03y0/DistributedGreyWolfOptimizer.zip?dl=
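
    The migration step described above might look like the following Python sketch; minimization, copy (rather than move) semantics and a synchronous exchange are assumptions:

        import numpy as np

        def migrate_best_worst(islands, fitness):
            # Each island sends a copy of its best wolf to the next island in
            # a randomly ordered ring, replacing that island's worst wolf.
            order = np.random.permutation(len(islands))  # random ring topology
            for i, src in enumerate(order):
                dst = order[(i + 1) % len(order)]
                best = int(np.argmin(fitness[src]))   # best on source island
                worst = int(np.argmax(fitness[dst]))  # worst on destination
                islands[dst][worst] = islands[src][best].copy()
                fitness[dst][worst] = fitness[src][best]
            return islands, fitness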

    Island-based Cuckoo Search with elite opposition-based learning and multiple mutation methods for solving optimization problems

    The island Cuckoo Search (iCSPM) algorithm is a variation of Cuckoo Search that uses the island model and highly disruptive polynomial mutation to solve optimization problems. This article introduces an improved iCSPM algorithm called iCSPM with elite opposition-based learning and multiple mutation methods (iCSPM2). iCSPM2 has three main characteristics. First, it separates candidate solutions into several islands (sub-populations) and then divides the islands among four improved Cuckoo Search algorithms: Cuckoo Search via Lévy flights, Cuckoo Search with highly disruptive polynomial mutation, Cuckoo Search with Jaya mutation and Cuckoo Search with pitch-adjustment mutation. Second, it uses elite opposition-based learning to improve its convergence rate and exploration ability. Finally, it makes continuous candidate solutions discrete using the smallest position value method. Results on a set of 15 popular benchmark functions indicate that iCSPM2 performs better than iCSPM, although a sensitivity analysis of both algorithms suggests that their convergence behavior is sensitive to the island model parameters. Further, the single-objective IEEE-CEC 2014 functions were used to evaluate and compare the performance of iCSPM2 to four well-known swarm optimization algorithms: the distributed grey wolf optimizer, distributed adaptive differential evolution with linear population size reduction, the memory-based hybrid dragonfly algorithm and the fireworks algorithm with differential mutation. Experimental and statistical results suggest that iCSPM2 performs better than the four other algorithms. iCSPM2's performance was also shown to be favorable compared to two powerful discrete optimization algorithms (generalized accelerations for insertion-based heuristics and a memetic algorithm with novel semi-constructive crossover and mutation operators) using a set of Taillard's benchmark instances for the permutation flow shop scheduling problem.
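
    The smallest position value method mentioned above has a compact expression: rank the components of a continuous vector to obtain a permutation, so the dimension holding the smallest value is scheduled first. A sketch:

        import numpy as np

        def smallest_position_value(x):
            # Map a continuous candidate solution to a permutation by ranking
            # its components from smallest to largest.
            return np.argsort(x)

        # Usage: a continuous vector becomes a job order for flow shop scheduling.
        print(smallest_position_value(np.array([0.7, -1.2, 0.1, 2.5])))  # [1 2 0 3]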

    Hybridizing the Cuckoo Search Algorithm with Different Mutation Operators for Numerical Optimization Problems

    The Cuckoo Search (CS) algorithm is an efficient evolutionary algorithm inspired by the nesting and parasitic reproduction behaviors of some cuckoo species. Mutation is an operator used in evolutionary algorithms to maintain the diversity of the population from one generation to the next. The original CS algorithm uses the Lévy flight method, which is a special mutation operator, for efficient exploration of the search space. The major goal of this paper is to experimentally evaluate the performance of the CS algorithm after replacing the Lévy flight method in the original CS algorithm with seven different mutation methods. The proposed variations of CS were evaluated using 14 standard benchmark functions in terms of the accuracy and reliability of the results obtained over multiple simulations. The experimental results suggest that CS with polynomial mutation provides more accurate results and is more reliable than the other CS variations.
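
    One of the evaluated operators, polynomial mutation, can be sketched as follows; this is the standard Deb-style formulation with distribution index eta, written for illustration rather than as the paper's exact code:

        import numpy as np

        def polynomial_mutation(x, lo, hi, eta=20.0, rng=None):
            # Mutate one continuous component within [lo, hi]; larger eta
            # concentrates offspring closer to the parent value.
            rng = np.random.default_rng() if rng is None else rng
            u = rng.random()
            d1 = (x - lo) / (hi - lo)  # normalized distance to lower bound
            d2 = (hi - x) / (hi - lo)  # normalized distance to upper bound
            if u <= 0.5:
                dq = (2 * u + (1 - 2 * u) * (1 - d1) ** (eta + 1)) ** (1 / (eta + 1)) - 1
            else:
                dq = 1 - (2 * (1 - u) + 2 * (u - 0.5) * (1 - d2) ** (eta + 1)) ** (1 / (eta + 1))
            return np.clip(x + dq * (hi - lo), lo, hi)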

    Improved Salp swarm algorithm for solving single-objective continuous optimization problems

    The Salp Swarm Algorithm (SSA) is an effective single-objective optimization algorithm that was inspired by the navigating and foraging behaviors of salps in their natural habitats. Although SSA has been successfully tailored and applied to solve various types of optimization problems, it often suffers from premature convergence and typically does not perform well on high-dimensional optimization problems. This paper introduces an Improved SSA (ISSA) algorithm to enhance the performance of SSA in solving single-objective continuous optimization problems. ISSA has four characteristics. First, it employs Gaussian perturbation to improve the diversity of the initial population. Second, it uses highly disruptive polynomial mutation (HDPM) to update the leader salp in the salp chain. Third, it uses the Laplace crossover operator to improve its exploration ability. Fourth, it uses a new opposition-based learning method called Mixed Opposition-Based Learning (MOBL) to improve its convergence rate and exploration ability. A set of 14 standard benchmark functions was used to evaluate the performance of ISSA and compare it to three variations of SSA: SSA, Hybrid SSA with Particle Swarm Optimization (HSSAPSO) Singh et al. (2020) and Enhanced SSA (ESSA) Zhang et al. (2020). The overall experimental and statistical results indicate that ISSA is a better optimization algorithm than the other SSA variations. Further, the single-objective IEEE CEC 2014 (IEEE Congress on Evolutionary Computation 2014) functions were used to evaluate and compare the performance of ISSA to 18 well-known and state-of-the-art optimization algorithms: Exploratory Cuckoo Search (ECS) Abed-alguni (2021), Grey Wolf Optimizer (GWO) Mirjalili and Mirjalili (Advances in Engineering Software, 69, 46–61, 2014), Distributed Grey Wolf Optimizer (DGWO) Abed-alguni and Barhoush (2018), Cuckoo Search (CS) Yang and Deb (2009), distributed adaptive differential evolution with linear population size reduction (L-SHADE) Tanabe and Fukunaga (2014), Memory-based Hybrid Dragonfly Algorithm (MHDA) KS and Murugan (Expert Syst Appl, 83, 63–78, 2017), Fireworks Algorithm with Differential Mutation (FWA-DM) Yu et al. (2014), Differential Evolution-based Salp Swarm Algorithm (DESSA) Dhabal et al. (Soft Comput, 25(3), 1941–1961, 2021), LSHADE with Fitness and Diversity Ranking-Based Mutation Operator (FD-LSHADE) Cheng et al. (Swarm and Evolutionary Computation, 61, 100816, 2021), Distance-based SHADE (Db-SHADE) Viktorin et al. (Swarm and Evolutionary Computation, 50, 100462, 2019) and Zeng et al. (Knowl-Based Syst, 226, 107150, 2021), Mean–Variance Mapping Optimization (MVMO) Iacca et al. (Expert Syst Appl, 165, 113902, 2021), Time-Varying strategy-based Differential Evolution (TVDE) Sun et al. (Soft Comput, 24(4), 2727–2747, 2020), Butterfly Optimization Algorithm with adaptive gbest-guided search strategy and Pinhole-Imaging-based Learning (PIL-BOA) Long et al. (Appl Soft Comput, 103, 107146, 2021), Memory-Guided Sine Cosine Algorithm (MGSCA) Gupta et al. (Eng Appl Artif Intell, 93, 103718, 2020), Lévy flight Jaya Algorithm (LJA) Iacca et al. (2021), Sine Cosine Algorithm (SCA) Dhabal et al. (2021), Covariance Matrix Adaptation Evolution Strategy (CMA-ES) Hansen et al. (Evolutionary Computation, 11(1), 1–18, 2003) and Coyote Optimization Algorithm (COA) Pierezan and Coelho (2018). The results indicate that ISSA performs better than the tested optimization algorithms.
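
    As a rough illustration of the opposition-based learning ingredient (MOBL's exact mixing scheme is not detailed here), the following sketch generates the opposite of each candidate within the search bounds and keeps the fitter of each pair; the toy sphere objective is an assumption:

        import numpy as np

        def opposition_based_candidates(pop, lo, hi):
            # For each candidate x, form its opposite lo + hi - x within the
            # search bounds; pairing each point with its opposite widens
            # exploration and can speed up convergence.
            return lo + hi - pop

        # Usage: keep the better of each (candidate, opposite) pair.
        rng = np.random.default_rng(0)
        lo, hi = -5.0, 5.0
        pop = rng.uniform(lo, hi, size=(4, 3))
        opp = opposition_based_candidates(pop, lo, hi)
        sphere = lambda X: (X ** 2).sum(axis=1)  # toy objective (minimize)
        keep = np.where((sphere(opp) < sphere(pop))[:, None], opp, pop)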