7 research outputs found
Optimistic Variants of Single-Objective Bilevel Optimization for Evolutionary Algorithms
Single-objective bilevel optimization is a specialized form of constrained optimization in which one of the constraints is itself an optimization problem. These problems are typically non-convex and strongly NP-hard. Recently, the evolutionary computation community has shown increased interest in modeling bilevel problems because of their applicability to real-world decision-making problems. In this work, a partial nested evolutionary approach with a local heuristic search is proposed to solve the benchmark problems, and it achieves outstanding results. This approach relies on the concept of intermarriage-crossover to search for feasible regions by exploiting information from the constraints. A new variant of the commonly used convergence approaches, i.e., optimistic and pessimistic, is also proposed; it is called the extreme optimistic approach. The experimental results demonstrate that the algorithm converges differently to known optimum solutions with the optimistic variants. The optimistic approach also outperforms the pessimistic approach. Comparative statistical analysis of our approach against other recently published partial to complete evolutionary approaches demonstrates very competitive results.
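For orientation, the standard optimistic and pessimistic formulations that the abstract refers to can be written as below. This is the textbook form, not taken from the paper itself; the symbols F, f, G, g and Ψ are notation assumed here, and the paper's "extreme optimistic" variant is not defined in the abstract, so it is not reproduced.

```latex
% Lower-level solution set for a fixed upper-level decision x
\Psi(x) = \operatorname*{arg\,min}_{y} \{\, f(x, y) : g(x, y) \le 0 \,\}

% Optimistic position: the follower breaks ties in the leader's favour
\min_{x,\, y} \; F(x, y)
\quad \text{s.t.} \quad G(x, y) \le 0, \;\; y \in \Psi(x)

% Pessimistic position: the follower breaks ties against the leader
\min_{x} \; \max_{y \in \Psi(x)} \; F(x, y)
\quad \text{s.t.} \quad G(x, y) \le 0 \;\; \text{for all } y \in \Psi(x)
```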
Bi-level Actor-Critic for Multi-agent Coordination
Coordination is one of the essential problems in multi-agent systems.
Typically multi-agent reinforcement learning (MARL) methods treat agents
equally and the goal is to solve the Markov game to an arbitrary Nash
equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE
selection. In this paper, we treat agents unequally and consider
Stackelberg equilibrium as a potentially better convergence point than Nash
equilibrium in terms of Pareto superiority, especially in cooperative
environments. Under Markov games, we formally define the bi-level reinforcement
learning problem in finding Stackelberg equilibrium. We propose a novel
bi-level actor-critic learning method that allows agents to have different
knowledge bases (thus intelligent), while their actions can still be executed
simultaneously and in a distributed manner. A convergence proof is given, and the
resulting learning algorithm is tested against the state of the art. We found
that the proposed bi-level actor-critic algorithm successfully converged to the
Stackelberg equilibria in matrix games and found an asymmetric solution in a
highway merge environment.
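The contrast between Nash and Stackelberg equilibria in a matrix game can be illustrated with a minimal sketch. It is not the paper's actor-critic method: it enumerates pure strategies by brute force, and the payoff matrices `A` (leader) and `B` (follower), the function names, and the first-index tie-breaking rule are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical common-payoff coordination game: both agents receive the same reward.
A = [[2, 0],
     [0, 1]]  # leader (row player) payoffs
B = [[2, 0],
     [0, 1]]  # follower (column player) payoffs


def pure_nash(A, B):
    """Return all pure-strategy Nash equilibria as (row, col) pairs."""
    rows, cols = len(A), len(A[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        # Neither player can gain by deviating unilaterally.
        row_best = all(A[i][j] >= A[k][j] for k in range(rows))
        col_best = all(B[i][j] >= B[i][l] for l in range(cols))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria


def stackelberg(A, B):
    """Leader commits to a row; follower best-responds (first index on ties)."""
    best = None
    for i in range(len(A)):
        j = max(range(len(B[i])), key=lambda l: B[i][l])  # follower's best response
        if best is None or A[i][j] > A[best[0]][best[1]]:
            best = (i, j)
    return best


nash = pure_nash(A, B)
leader_follower = stackelberg(A, B)
```

In this toy game there are two pure Nash equilibria, and the leader-follower ordering selects the one with the higher payoff for both agents, which is the kind of equilibrium selection the abstract describes.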
An efficient hybrid differential evolutionary algorithm for bilevel optimisation problems
Bilevel problems are widely used to describe decision problems with hierarchical upper-lower-level structures in many economic fields. The bilevel optimisation problem (BLOP) is intrinsically NP-hard when its objectives and constraints are complex and the decision variables are large in scale at both levels. An efficient hybrid differential evolutionary algorithm for BLOP (HDEAB) is proposed, in which the optimal lower-level value function mapping method, the differential evolutionary algorithm, k-nearest neighbours (KNN) and a nested local search are hybridised to improve computational accuracy and efficiency. To show the performance of the HDEAB, numerical studies were conducted on SMD (Sinha, Malo and Deb) instances and an application example of optimising a venture capital staged-financing contract. The results demonstrate that the HDEAB greatly outperforms the BLEAQ (bilevel evolutionary algorithm based on quadratic approximations) in solving BLOPs of different scales.
Evolutionary multi-objective worst-case robust optimisation
Many real-world problems are subject to uncertainty, and often solutions should not only be good, but also robust against environmental disturbances or deviations from the decision variables. While most papers dealing with robustness aim at finding solutions with a high expected performance given a distribution of the uncertainty, we examine the trade-off between the allowed deviations from the decision variables (tolerance level) and the worst-case performance given the allowed deviations. In this work, we suggest two multi-objective evolutionary algorithms to compute the available trade-offs between the allowed tolerance level and the worst-case quality of the solutions. Here the tolerance level is taken as the measure of robustness, and it could also cover variations in parameters. Both algorithms are two-level nested algorithms. The first algorithm is point-based, in the sense that the lower level computes a worst-case point for each upper-level solution; the second algorithm is envelope-based, in the sense that the lower level computes a whole trade-off curve between worst-case fitness and tolerance level for each upper-level solution.
Our problem can be considered a special case of bi-level optimisation, which is computationally expensive because each upper-level solution is evaluated by calling a lower-level optimiser. We propose and compare several strategies to improve the efficiency of both algorithms. Later, we also suggest surrogate-assisted algorithms to accelerate both algorithms.
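The nested, point-based evaluation described above can be sketched as follows. This is only an illustration of the structure, not the authors' algorithm: a fixed grid search stands in for the lower-level evolutionary optimiser, the upper level simply enumerates a candidate list, minimisation is assumed, and the names `worst_case`, `robust_tradeoff` and the test function `f` are hypothetical.

```python
def worst_case(f, x, delta, n=21):
    """Lower level: largest f over a grid of n deviations in [x - delta, x + delta]."""
    return max(f(x + delta * (-1 + 2 * k / (n - 1))) for k in range(n))


def robust_tradeoff(f, candidates, deltas):
    """Upper level: for each tolerance, keep the candidate with the smallest worst case.

    Every upper-level evaluation calls the lower level, which is why nested
    schemes of this kind become expensive.
    """
    result = []
    for delta in deltas:
        best_x = min(candidates, key=lambda x: worst_case(f, x, delta))
        result.append((delta, best_x, worst_case(f, best_x, delta)))
    return result


def f(x):
    """Hypothetical one-dimensional objective to be minimised."""
    return x * x


tradeoff = robust_tradeoff(f, candidates=[-1.0, 0.0, 1.0], deltas=[0.0, 0.5])
```

Each entry of `tradeoff` pairs a tolerance level with the best candidate and its worst-case value, giving one point on the tolerance versus worst-case trade-off.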