Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Many real-world decision problems are characterized by multiple conflicting
objectives which must be balanced based on their relative importance. In the
dynamic weights setting, the relative importance changes over time, and
specialized algorithms that deal with such change, such as the tabular
Reinforcement Learning (RL) algorithm of Natarajan and Tadepalli (2005), are
required. However, this earlier work is not feasible for RL settings that
necessitate the use of function approximators. We generalize across weight
changes and high-dimensional inputs by proposing a multi-objective Q-network
whose outputs are conditioned on the relative importance of objectives, and we
introduce Diverse Experience Replay (DER) to counter the inherent
non-stationarity of the dynamic weights setting. In an extensive experimental
evaluation, we compare our methods to adapted algorithms from Deep
Multi-Task/Multi-Objective Reinforcement Learning and show that our proposed
network, in combination with DER, dominates these adapted algorithms across
weight-change scenarios and problem domains.
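As a rough illustration of the proposed conditioning, here is a minimal PyTorch sketch (class and function names are illustrative, not the paper's code): the current weight vector is appended to the state input, the network outputs one Q-value vector per action, and action selection scalarizes those vectors with the same weights.

```python
import torch
import torch.nn as nn

class ConditionedMOQNet(nn.Module):
    """Multi-objective Q-network conditioned on the objective weights."""
    def __init__(self, state_dim: int, n_actions: int, n_objectives: int,
                 hidden: int = 128):
        super().__init__()
        self.n_actions = n_actions
        self.n_objectives = n_objectives
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_objectives, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions * n_objectives),
        )

    def forward(self, state: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
        # Concatenate the current weight vector to the state so the Q-values
        # are conditioned on the relative importance of the objectives.
        x = torch.cat([state, weights], dim=-1)
        return self.net(x).view(-1, self.n_actions, self.n_objectives)

def greedy_action(net: ConditionedMOQNet, state: torch.Tensor,
                  weights: torch.Tensor) -> int:
    # Scalarize the per-objective Q-values with the current weights
    # and act greedily on the resulting scalar values.
    with torch.no_grad():
        q = net(state.unsqueeze(0), weights.unsqueeze(0))[0]  # (n_actions, n_objectives)
        return int((q @ weights).argmax().item())
```

Because the network is conditioned on the weights, a single set of parameters can be queried under any preference, which is what enables generalization across weight changes; DER (not sketched here) complements this by countering the non-stationarity that weight changes induce in the replay data.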
A Multi-Objective Deep Reinforcement Learning Framework
This paper introduces a new scalable multi-objective deep reinforcement
learning (MODRL) framework based on deep Q-networks. We develop a
high-performance MODRL framework that supports both single-policy and
multi-policy strategies, as well as both linear and non-linear approaches to
action selection. The experimental results on two benchmark problems
(two-objective deep sea treasure environment and three-objective Mountain Car
problem) indicate that the proposed framework is able to find Pareto-optimal
solutions effectively. The proposed framework is generic and highly
modularized, which allows the integration of different deep reinforcement
learning algorithms across complex problem domains, thereby overcoming many of
the disadvantages of standard multi-objective reinforcement learning methods in
the current literature. The proposed framework acts as a testbed platform that
accelerates the development of MODRL for solving increasingly complicated
multi-objective problems.
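As a minimal sketch of the two action-selection modes the abstract names, assuming the agent maintains one Q-value per (action, objective) pair: linear scalarization picks the action with the best weighted sum, while thresholded lexicographic ordering (TLO), a common non-linear scheme, clips each objective at a threshold and compares actions lexicographically. The function names below are illustrative, not the framework's API.

```python
import numpy as np

def linear_action(q_values: np.ndarray, weights: np.ndarray) -> int:
    """Linear scalarization: pick the action with the best weighted sum."""
    return int((q_values @ weights).argmax())

def tlo_action(q_values: np.ndarray, thresholds: np.ndarray) -> int:
    """Thresholded lexicographic ordering: clip each objective at its
    threshold, then compare actions lexicographically (objective 0 first)."""
    clipped = np.minimum(q_values, thresholds)   # (n_actions, n_objectives)
    order = np.lexsort(clipped[:, ::-1].T)       # last key = objective 0
    return int(order[-1])

# Q-values per (action, objective) for a toy 3-action, 2-objective problem.
q = np.array([[0.9, 0.1],
              [0.7, 0.8],
              [0.5, 0.9]])
print(linear_action(q, np.array([0.5, 0.5])))  # -> 1 (best weighted sum)
print(tlo_action(q, np.array([0.6, 1.0])))     # -> 1 (actions 0/1 tie at the
                                               #    0.6 clip; objective 1 breaks it)
```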
Autonomous Driving: A Multi-Objective Deep Reinforcement Learning Approach
Autonomous driving is a challenging domain that entails multiple aspects: a vehicle should drive to its destination as fast as possible while avoiding collisions, obeying traffic rules, and ensuring the comfort of its passengers. It is representative of the complex reinforcement learning tasks humans encounter in real life. The aim of this thesis is to explore the effectiveness of multi-objective reinforcement learning for such tasks, as exemplified by autonomous driving. In particular, it shows that:
1. Multi-objective reinforcement learning is effective at overcoming some of the difficulties faced by scalar-reward reinforcement learning, and a multi-objective DQN agent based on a variant of thresholded lexicographic Q-learning is successfully trained to drive on multi-lane roads and intersections, yielding and changing lanes according to traffic rules.
2. Data efficiency of (multi-objective) reinforcement learning can be significantly improved by exploiting the factored structure of a task. Specifically, factored Q functions learned on the factored state space can be used as features to the original Q function to speed up learning (see the sketch after this list).
3. Inclusion of history-dependent policies enables an intuitive exact algorithm for multi-objective reinforcement learning with thresholded lexicographic ordering.
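A minimal sketch of point 2, assuming a PyTorch setup where the state vector splits into factors (for instance, per-lane features): small Q-networks are trained on individual factors, and their outputs are fed, detached, as additional input features into the full Q-network. All class names here are illustrative, not the thesis code.

```python
import torch
import torch.nn as nn

class FactoredQ(nn.Module):
    """Q-network trained on a single state factor."""
    def __init__(self, factor_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(factor_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))

    def forward(self, factor: torch.Tensor) -> torch.Tensor:
        return self.net(factor)

class QWithFactoredFeatures(nn.Module):
    """Full Q-network that consumes factored Q-values as extra features."""
    def __init__(self, state_dim, n_actions, factored_qs, factor_slices,
                 hidden: int = 128):
        super().__init__()
        self.factored_qs = nn.ModuleList(factored_qs)
        self.factor_slices = factor_slices  # where each factor sits in the state
        feat_dim = state_dim + sum(fq.net[-1].out_features for fq in factored_qs)
        self.net = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_actions))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Detach so the factored Q-values act as fixed input features here.
        feats = [fq(state[..., sl]).detach()
                 for fq, sl in zip(self.factored_qs, self.factor_slices)]
        return self.net(torch.cat([state] + feats, dim=-1))
```

For a 10-dimensional state split into two factors, one would pass, e.g., factor_slices=[slice(0, 4), slice(4, 10)] together with two matching FactoredQ instances.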
Deep W-Networks: Solving Multi-Objective Optimisation Problems With Deep Reinforcement Learning
In this paper, we build on advances introduced by the Deep Q-Networks (DQN)
approach to extend the multi-objective tabular Reinforcement Learning (RL)
algorithm W-learning to large state spaces. The W-learning algorithm can
naturally resolve the competition between multiple single policies in
multi-objective environments. However, the tabular version does not scale well
to environments with large state spaces. To address this issue, we replace the
underlying Q-tables with DQNs and introduce W-Networks as a replacement for the
tabular weight (W) representations. We evaluate the resulting Deep W-Networks (DWN)
approach on two widely accepted multi-objective RL benchmarks: deep sea
treasure and multi-objective mountain car. We show that DWN resolves the
competition between multiple policies while outperforming a DQN baseline.
Additionally, we demonstrate that the proposed algorithm can find the Pareto
front in both tested environments.
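A minimal sketch of the W-learning competition with function approximation, under the assumption of one Q-network and one W-network per policy (names are illustrative, not the authors' code); how the W-networks are trained, from the value a policy loses when another policy's action is executed, is omitted here.

```python
import torch
import torch.nn as nn

def make_mlp(in_dim: int, out_dim: int, hidden: int = 64) -> nn.Module:
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class DWNPolicy:
    """One policy: a Q-network for its own objective plus a W-network that
    scores how important it is for this policy to act in a given state."""
    def __init__(self, state_dim: int, n_actions: int):
        self.q_net = make_mlp(state_dim, n_actions)
        self.w_net = make_mlp(state_dim, 1)

def select_action(policies, state: torch.Tensor):
    # Each policy nominates its greedy action; the policy whose W-network
    # assigns the highest weight to the current state wins the competition.
    with torch.no_grad():
        w = torch.stack([p.w_net(state).squeeze() for p in policies])
        winner = int(w.argmax().item())
        action = int(policies[winner].q_net(state).argmax().item())
    return winner, action
```

In tabular W-learning, a losing policy's W value is updated toward the return it sacrificed by obeying the winner; the W-Networks here stand in for those tabular W entries.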