Search CORE

482 research outputs found

A Multi-Objective Deep Reinforcement Learning Framework

Author: Dazeley Richard
Lim Chee Peng
Nahavandi Saeid
Nguyen Ngoc Duy
Nguyen Thanh Thi
Vamplew Peter
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020
Field of study

This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark problems (two-objective deep sea treasure environment and three-objective Mountain Car problem) indicate that the proposed framework is able to find the Pareto-optimal solutions effectively. The proposed framework is generic and highly modularized, which allows the integration of different deep reinforcement learning algorithms in different complex problem domains. This therefore overcomes many disadvantages involved with standard multi-objective reinforcement learning methods in the current literature. The proposed framework acts as a testbed platform that accelerates the development of MODRL for solving increasingly complicated multi-objective problems.Comment: 21 page

arXiv.org e-Print Archive

Federation ResearchOnline

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation

Author: Parisi Simone
Pirotta Matteo
Restelli Marcello
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2016
Field of study

Archivio istituzionale della ricerca - Politecnico di Milano

A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning

Author: Brys Tim
Chandra Arjun
Esterle Lukas
Lewis Peter R.
Nowé Ann
van Moffaert Kristof
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, while acceptable system wide behaviour results. However, the non-linear relationship between weighting parameters of the scalarization function and the learned policy makes the discovery of system wide trade-offs time consuming. Our first contribution is a thorough analysis of well known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight-space in order to find a wider range of system trade-offs. In our second contribution, we propose a novel adaptive weight algorithm which interacts with the underlying local multi-objective solvers and allows for a better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtained solutions that exhibit a larger hypervolume, while (iii) acquiring a greater spread in the objective space

Aston Publications Explorer

Planning as Optimization: Dynamically Discovering Optimal Configurations for Runtime Situations

Author: Fredericks Erik M.
Gerostathopoulos Ilias
Krupitzer Christian
Vogel Thomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/05/2019
Field of study

The large number of possible configurations of modern software-based systems, combined with the large number of possible environmental situations of such systems, prohibits enumerating all adaptation options at design time and necessitates planning at run time to dynamically identify an appropriate configuration for a situation. While numerous planning techniques exist, they typically assume a detailed state-based model of the system and that the situations that warrant adaptations are known. Both of these assumptions can be violated in complex, real-world systems. As a result, adaptation planning must rely on simple models that capture what can be changed (input parameters) and observed in the system and environment (output and context parameters). We therefore propose planning as optimization: the use of optimization strategies to discover optimal system configurations at runtime for each distinct situation that is also dynamically identified at runtime. We apply our approach to CrowdNav, an open-source traffic routing system with the characteristics of a real-world system. We identify situations via clustering and conduct an empirical study that compares Bayesian optimization and two types of evolutionary optimization (NSGA-II and novelty search) in CrowdNav

arXiv.org e-Print Archive

Crossref

A Practical Guide to Multi-Objective Reinforcement Learning and Planning

Author: Bargiacchi Eugenio
Dazeley Richard
Hayes Conor F.
Heintz Fredrik
Howley Enda
Irissappane Athirai A.
Källström Johan
Macfarlane Matthew
Mannion Patrick
Nowé Ann
Ramos Gabriel
Restelli Marcello
Reymond Mathieu
Roijers Diederik M.
Rădulescu Roxana
Vamplew Peter
Verstraeten Timothy
Zintgraf Luisa M.
Publication venue
Publication date: 17/03/2021
Field of study

Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Deakin Research Online

Federation ResearchOnline

Digitala Vetenskapliga Arkivet - Academic Archive On-line

A practical guide to multi-objective reinforcement learning and planning

Author: Bargiacchi Eugenio
Dazeley Richard
Hayes Conor
Heintz Frederick
Howley Enda
Irissappane Athirai
Källström Johan
Macfarlane Matthew
Mannion Patrick
Nowé Ann
Ramos Gabriel
Restelli Marcello
Reymond Mathieu
Roijers Diederik
Rădulescu Roxana
Vamplew Peter
Verstraeten Timothy
Zintgraf Luisa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022
Field of study

Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)

Federation ResearchOnline

A learning automata based multiobjective hyper-heuristic

Author: John Robert
Li Wenwen
Özcan Ender
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/12/2017
Field of study

Metaheuristics, being tailored to each particular domain by experts, have been successfully applied to many computationally hard optimisation problems. However, once implemented, their application to a new problem domain or a slight change in the problem description would often require additional expert intervention. There is a growing number of studies on reusable cross-domain search methodologies, such as, selection hyper-heuristics, which are applicable to problem instances from various domains, requiring minimal expert intervention or even none. This study introduces a new learning automata based selection hyper-heuristic controlling a set of multiobjective metaheuristics. The approach operates above three well-known multiobjective evolutionary algorithms and mixes them, exploiting the strengths of each algorithm. The performance and behaviour of two variants of the proposed selection hyper-heuristic, each utilising a different initialisation scheme are investigated across a range of unconstrained multiobjective mathematical benchmark functions from two different sets and the realworld problem of vehicle crashworthiness. The empirical results illustrate the effectiveness of our approach for cross-domain search, regardless of the initialisation scheme, on those problems when compared to each individual multiobjective algorithm. Moreover, both variants perform signicantly better than some previously proposed selection hyper-heuristics for multiobjective optimisation, thus signicantly enhancing the opportunities for improved multiobjective optimisation

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

Pessimistic Off-Policy Multi-Objective Optimization

Author: Alizadeh Shima
Bhargava Aniruddha
Gopalswamy Karthick
Jain Lalit
Kveton Branislav
Liu Ge
Publication venue
Publication date: 28/10/2023
Field of study

Multi-objective optimization is a type of decision making problems where multiple conflicting objectives are optimized. We study offline optimization of multi-objective policies from data collected by an existing policy. We propose a pessimistic estimator for the multi-objective policy values that can be easily plugged into existing formulas for hypervolume computation and optimized. The estimator is based on inverse propensity scores (IPS), and improves upon a naive IPS estimator in both theory and experiments. Our analysis is general, and applies beyond our IPS estimators and methods for optimizing them. The pessimistic estimator can be optimized by policy gradients and performs well in all of our experiments

arXiv.org e-Print Archive