Deep Reinforcement Learning for Adaptive Parameter Control in Differential Evolution for Multi-Objective Optimization
Evolutionary algorithms (EAs) are efficient population-based stochastic algorithms for solving optimization problems. The performance of EAs largely depends on the values of the parameters that control their search. Previous works have studied how to configure EAs, but a general approach to tuning them effectively is still lacking. To fill this gap, this paper presents a consistent, automated approach for tuning and controlling the parameterized search of an EA. To this end, we propose a deep reinforcement learning (DRL) based approach, called 'DRL-APC-DE', for online control of the search parameter values of a multi-objective Differential Evolution algorithm. The proposed method is trained and evaluated on widely adopted multi-objective test problems. The experimental results show that the proposed approach performs competitively with a non-adaptive Differential Evolution algorithm tuned by grid search over the same range of possible parameter values. Subsequently, the trained algorithm has been applied to unseen multi-objective problems for adaptive parameter control. Results show that DRL-APC-DE successfully controls parameters when solving these problems, which has the potential to significantly reduce the dependency on parameter tuning for the successful application of EAs.
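The control loop described above can be sketched in miniature. This is a toy illustration, not the paper's DRL-APC-DE: an epsilon-greedy bandit stands in for the DRL policy, choosing Differential Evolution's scale factor F and crossover rate CR each generation, with the improvement in the best objective value as the reward; all settings are illustrative.

```python
import random

random.seed(0)

def sphere(x):
    # simple single-objective test function standing in for the benchmark
    return sum(v * v for v in x)

DIM, NP = 5, 20
ARMS = [(0.3, 0.5), (0.5, 0.9), (0.8, 0.7), (0.9, 0.9)]  # candidate (F, CR) pairs
q = [0.0] * len(ARMS)   # running reward estimate per parameter setting
n = [0] * len(ARMS)     # selection counts

pop = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(NP)]
fit = [sphere(ind) for ind in pop]

for gen in range(100):
    # epsilon-greedy choice of the parameter setting for this generation
    arm = random.randrange(len(ARMS)) if random.random() < 0.2 else q.index(max(q))
    F, CR = ARMS[arm]
    best_before = min(fit)
    for i in range(NP):
        # standard DE/rand/1/bin trial-vector construction
        a, b, c = random.sample([j for j in range(NP) if j != i], 3)
        jrand = random.randrange(DIM)
        trial = [pop[a][d] + F * (pop[b][d] - pop[c][d])
                 if (random.random() < CR or d == jrand) else pop[i][d]
                 for d in range(DIM)]
        tf = sphere(trial)
        if tf <= fit[i]:               # greedy replacement
            pop[i], fit[i] = trial, tf
    reward = best_before - min(fit)    # improvement as the reward signal
    n[arm] += 1
    q[arm] += (reward - q[arm]) / n[arm]  # incremental mean update

print(min(fit))
```

A DRL policy replaces the bandit by conditioning the parameter choice on a state describing the current population, which is what makes transfer to unseen problems possible.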
Efficient Meta Neural Heuristic for Multi-Objective Combinatorial Optimization
Recently, neural heuristics based on deep reinforcement learning have
exhibited promise in solving multi-objective combinatorial optimization
problems (MOCOPs). However, they still struggle to achieve high learning
efficiency and solution quality. To tackle this issue, we propose an efficient
meta neural heuristic (EMNH), in which a meta-model is first trained and then
fine-tuned with a few steps to solve corresponding single-objective
subproblems. Specifically, for the training process, a (partial)
architecture-shared multi-task model is leveraged to achieve parallel learning
for the meta-model, so as to speed up the training; meanwhile, a scaled
symmetric sampling method with respect to the weight vectors is designed to
stabilize the training. For the fine-tuning process, an efficient hierarchical
method is proposed to systematically tackle all the subproblems. Experimental
results on the multi-objective traveling salesman problem (MOTSP),
multi-objective capacitated vehicle routing problem (MOCVRP), and
multi-objective knapsack problem (MOKP) show that EMNH is able to outperform
the state-of-the-art neural heuristics in terms of solution quality and
learning efficiency, and to yield solutions competitive with strong traditional
heuristics while requiring much less time.
Comment: Accepted at NeurIPS 202
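The "single-objective subproblems" the abstract refers to come from the standard decomposition step: a multi-objective problem is split into scalar subproblems via uniformly spread weight vectors (a Das-Dennis simplex lattice) and a scalarization such as the weighted sum. The sketch below shows that step only; the function names are illustrative and not EMNH's API.

```python
from itertools import combinations

def simplex_lattice(m, h):
    """All m-dimensional weight vectors with components k/h summing to 1
    (stars-and-bars enumeration)."""
    vecs = []
    for cuts in combinations(range(h + m - 1), m - 1):
        parts, prev = [], -1
        for c in cuts:
            parts.append(c - prev - 1)  # gap between consecutive bars
            prev = c
        parts.append(h + m - 2 - prev)  # remainder after the last bar
        vecs.append([p / h for p in parts])
    return vecs

def scalarize(objectives, weights):
    """Weighted-sum value of one solution for one subproblem."""
    return sum(w * f for w, f in zip(weights, objectives))

weights = simplex_lattice(3, 4)   # 3 objectives, granularity 4 -> C(6,2) = 15 vectors
print(len(weights))               # 15
print(scalarize([1.0, 2.0, 3.0], weights[0]))  # 3.0
```

Each weight vector defines one subproblem; EMNH's meta-model is fine-tuned per subproblem, and its symmetric sampling operates on these weight vectors.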
Enhancing Exploration and Safety in Deep Reinforcement Learning
A Deep Reinforcement Learning (DRL) agent tries to learn a policy that maximizes a long-term objective by trial and error in large state spaces. However, this learning paradigm requires a non-trivial amount of interaction with the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria that must be considered when designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damage to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms that foster efficient exploration and safer behaviors in simulated and real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely on both standard benchmarks, such as SafetyGym, and robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial for assessing the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample-efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitate the development of further optimizations for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology for evaluating behaviors against desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free complementary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance in both single- and multi-agent applications.
For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, and propose an architecture that favors cooperation without affecting exploration.
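One common device for the safe-behavior requirement discussed above is an action "shield": before execution, a proposed action is projected back into a known safe set. The sketch below uses a simple box constraint as the safe set; real systems derive it from safety specifications, in the spirit of the Formal Verification angle the thesis takes. Names here are illustrative.

```python
def shield(action, low, high):
    """Project each action dimension into the safe interval [low_i, high_i]."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]

proposed = [1.7, -0.4, 3.2]  # e.g. raw output of a DRL policy
safe = shield(proposed, low=[-1.0, -1.0, -1.0], high=[1.0, 1.0, 1.0])
print(safe)  # [1.0, -0.4, 1.0]
```

The filter guarantees constraint satisfaction at every step, while the learning algorithm remains free to explore within the safe region.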
Pareto multi-task deep learning
Neuroevolution has been used to train Deep Neural Networks on reinforcement learning problems. A few attempts have been made to extend it to address either multi-task or multi-objective optimization problems. This research work presents the Multi-Task Multi-Objective Deep Neuroevolution method, a highly parallelizable algorithm that can be adopted for tackling both multi-task and multi-objective problems. In this method, prior knowledge of the tasks is used to explicitly define multiple utility functions, which are optimized simultaneously. Experimental results on some Atari 2600 games, a challenging testbed for deep reinforcement learning algorithms, show that a single neural network with a single set of parameters can outperform previous state-of-the-art techniques. In addition to the standard analysis, all results are also evaluated using the Hypervolume indicator and the Kullback-Leibler divergence to gain better insight into the underlying training dynamics. The experimental results show that a neural network trained with the proposed evolution strategy can outperform networks individually trained on each of the tasks.
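The Hypervolume indicator used for evaluation above measures the objective-space volume a solution set dominates relative to a reference point. A minimal two-objective version (maximization, reference point at the origin) is easy to sketch: sort the points by the first objective and sum the rectangle slices they dominate. This is a didactic sketch, not the paper's evaluation code.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Hypervolume of a 2-D maximization front relative to `ref`."""
    pts = sorted(points, key=lambda p: p[0], reverse=True)
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                        # dominated points add no slice
            hv += (x - ref[0]) * (y - prev_y)  # new rectangle slice
            prev_y = y
    return hv

front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
print(hypervolume_2d(front))  # 6.0
```

A larger hypervolume indicates a front that is both closer to the true Pareto front and better spread, which is why it is a common scalar summary of multi-objective training dynamics.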
Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES
Using generative deep learning models and reinforcement learning together can
effectively generate new molecules with desired properties. By employing a
multi-objective scoring function, thousands of high-scoring molecules can be
generated, making this approach useful for drug discovery and material science.
However, the application of these methods can be hindered by computationally
expensive or time-consuming scoring procedures, particularly when a large
number of function calls are required as feedback in the reinforcement learning
optimization. Here, we propose the use of double-loop reinforcement learning
with simplified molecular-input line-entry system (SMILES) augmentation to improve
the efficiency and speed of the optimization. By adding an inner loop that
augments the generated SMILES strings to non-canonical SMILES for use in
additional reinforcement learning rounds, we can both reuse the scoring
calculations at the molecular level, thereby speeding up the learning process,
and gain additional protection against mode collapse. We find that
employing between 5 and 10 augmentation repetitions is optimal for the scoring
functions tested and is further associated with an increased diversity in the
generated compounds, improved reproducibility of the sampling runs and the
generation of molecules of higher similarity to known ligands.
Comment: 25 pages and 18 figures. Supplementary material included.
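The score reuse behind the inner loop relies on one fact: augmented (non-canonical) SMILES are different strings for the same molecule, so a cache keyed by a canonical form lets every augmented copy reuse a single scoring call. In the sketch below, `canonical` and `expensive_score` are stand-ins; in practice a cheminformatics toolkit such as RDKit provides canonicalization, and the score is an expensive oracle.

```python
calls = 0  # counts how often the expensive oracle actually runs

def canonical(smiles):
    # placeholder canonicalizer: real code would round-trip the string
    # through a cheminformatics toolkit, not sort its characters
    return "".join(sorted(smiles))

def expensive_score(smiles):
    global calls
    calls += 1
    return len(smiles)  # dummy stand-in for a costly property prediction

cache = {}

def cached_score(smiles):
    key = canonical(smiles)
    if key not in cache:
        cache[key] = expensive_score(smiles)  # pay the cost once per molecule
    return cache[key]

# "CCO" and "OCC" are two renderings of the same key under the toy
# canonicalizer; "CCCO" is a distinct key
for s in ["CCO", "OCC", "CCO", "CCCO"]:
    cached_score(s)
print(calls)  # 2
```

Four augmented strings trigger only two scoring calls, which is the mechanism that lets the inner augmentation loop add extra learning rounds at almost no scoring cost.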
Learning Adaptive Evolutionary Computation for Solving Multi-Objective Optimization Problems
Multi-objective evolutionary algorithms (MOEAs) are widely used to solve multi-objective optimization problems. These algorithms rely on appropriately set parameters to find good solutions. However, this parameter tuning can be very computationally expensive when solving non-trivial (combinatorial) optimization problems. This paper proposes a framework that integrates MOEAs with adaptive parameter control using Deep Reinforcement Learning (DRL). The DRL policy is trained to adaptively set the values that dictate the intensity and probability of mutation for solutions during optimization. We test the proposed approach on a simple benchmark problem and on a real-world, complex warehouse design and control problem. The experimental results demonstrate the advantages of our method in terms of solution quality and the computation time needed to reach good solutions. In addition, we show that the learned policy is transferable, i.e., the policy trained on a simple benchmark problem can be applied directly and effectively to the complex warehouse optimization problem, without the need for retraining.
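The transfer claim can be illustrated in miniature: a controller is trained to pick mutation step sizes on one problem, then applied frozen, without retraining, to a different problem. This toy uses a bandit over step sizes inside a (1+1)-style hill climber rather than the paper's DRL policy and MOEA; every name and setting is illustrative.

```python
import random

random.seed(1)
STEPS = [0.01, 0.1, 1.0]  # candidate mutation intensities (the "actions")

def optimize(f, dim, policy=None, iters=300):
    """Hill-climb f; learn step-size values when policy is None,
    otherwise reuse the passed estimates greedily (frozen transfer)."""
    q, n = ([0.0] * len(STEPS), [0] * len(STEPS)) if policy is None else (policy, None)
    x = [random.uniform(-3, 3) for _ in range(dim)]
    fx = f(x)
    for _ in range(iters):
        if n is None:
            arm = q.index(max(q))  # frozen policy: always greedy
        else:
            arm = random.randrange(len(STEPS)) if random.random() < 0.3 else q.index(max(q))
        cand = [v + random.gauss(0, STEPS[arm]) for v in x]
        fc = f(cand)
        if n is not None:
            n[arm] += 1
            q[arm] += ((fx - fc) - q[arm]) / n[arm]  # reward = improvement
        if fc < fx:
            x, fx = cand, fc
    return fx, q

sphere = lambda x: sum(v * v for v in x)
shifted = lambda x: sum((v - 1.0) ** 2 for v in x)  # a different, unseen problem

_, learned = optimize(sphere, dim=4)                # train the controller
best, _ = optimize(shifted, dim=4, policy=learned)  # transfer, no retraining
print(round(best, 4))
```

The point of the sketch is the interface, not the numbers: because the controller's state-to-action mapping is problem-independent, the same learned values can drive the search on the unseen objective.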