85 research outputs found

A practical guide to multi-objective reinforcement learning and planning

Author: Bargiacchi Eugenio
Dazeley Richard
Hayes Conor
Heintz Fredrik
Howley Enda
Irissappane Athirai
Källström Johan
Macfarlane Matthew
Mannion Patrick
Nowé Ann
Ramos Gabriel
Restelli Marcello
Reymond Mathieu
Roijers Diederik
Rădulescu Roxana
Vamplew Peter
Verstraeten Timothy
Zintgraf Luisa
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)

Federation ResearchOnline

A Practical Guide to Multi-Objective Reinforcement Learning and Planning

Author: Bargiacchi Eugenio
Dazeley Richard
Hayes Conor F.
Heintz Fredrik
Howley Enda
Irissappane Athirai A.
Källström Johan
Macfarlane Matthew
Mannion Patrick
Nowé Ann
Ramos Gabriel
Restelli Marcello
Reymond Mathieu
Roijers Diederik M.
Rădulescu Roxana
Vamplew Peter
Verstraeten Timothy
Zintgraf Luisa M.
Publication date: 17/03/2021

Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Deakin Research Online

Federation ResearchOnline

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Fair assignment of indivisible objects under ordinal preferences

Author: Aziz Haris
Gaspers Serge
Mackenzie Simon
Walsh Toby
Publication date: 17/06/2015

We consider the discrete assignment problem in which agents express ordinal preferences over objects and these objects are allocated to the agents in a fair manner. We use the stochastic dominance relation between fractional or randomized allocations to systematically define varying notions of proportionality and envy-freeness for discrete assignments. The computational complexity of checking whether a fair assignment exists is studied for these fairness notions. We also characterize the conditions under which a fair assignment is guaranteed to exist. For a number of fairness concepts, polynomial-time algorithms are presented to check whether a fair assignment exists. Our algorithmic results also extend to the case of unequal entitlements of agents. Our NP-hardness result, which holds for several variants of envy-freeness, answers an open question posed by Bouveret, Endriss, and Lang (ECAI 2010). We also propose fairness concepts that always suggest a non-empty set of assignments with meaningful fairness properties. Among these concepts, optimal proportionality and optimal weak proportionality appear to be desirable fairness concepts. Comment: extended version of a paper presented at AAMAS 201

arXiv.org e-Print Archive

CiteSeerX

Towards fairness in Kidney Exchange Programs

Author: St-Arnaud William
Publication date: 01/08/2021

The preferred treatment for chronic kidney disease is transplantation. However, many patients can only find direct donors that are not fully compatible with them. Kidney Exchange Programs (KEPs) can help these patients by swapping the donors of multiple patient-donor pairs in order to accommodate them. Usually, the objective is to maximize the total number of transplants that can be realized as part of an exchange plan. Many optimal solutions can co-exist, and since most of them feature different subsets of patients that obtain a compatible donor, the question of who is selected becomes relevant. Often, this problem is not even addressed and the first solution returned by a solver is chosen as the exchange plan to be performed. This can lead to bias for or against some patients and thus is not considered a fair approach. Moreover, it is the responsibility of computer scientists to have control of the output of the algorithms they design.
To resolve this issue, we explore the use of multiple optimal solutions and how to pick an exchange plan among them. We propose the use of randomized policies for selecting an optimal solution, first by enumerating them. This task is achieved through both integer programming and constraint programming methods. We also introduce a new concept called individual fairness in a bid to find a fair policy over the enumerated solutions by making use of multiple metrics. We scale the method to larger instances by adding column generation as part of the enumeration with the L_1 metric. When evaluating individual fairness, we systematically review other fairness schemes such as Aristotle's principle, Rawlsian justice, Nash's principle of fairness, and Shapley values. We analyze their mathematical descriptions and their pros and cons. Finally, we motivate the need to consider solutions that are not optimal in the number of transplants. For the selection of a good policy over this larger set of solutions, we motivate the need to balance utility and our individual fairness measure. We use the Nash Social Welfare Program in order to achieve this, and we also propose a decomposition methodology to extend the machinery for an efficient enumeration of solutions.

Dépôt Institutionnel Numérique

Measuring Risk In Networks

Author: Quiggin Daniel
Publication venue: ScholarWorks @ Georgia State University
Publication date: 21/11/2019

Participation in networks inevitably involves risk. However, the study of networks has, perhaps surprisingly, not had much to say about network risk in the sense that most economists would use the term ‘risk.’ No consensus has even emerged on what such a model would constitute. Network risk appears to be present in the world, whether in the financial sector, in transportation, or with regards to interpersonal connections, and yet we have few tools for modeling it. The primary contribution of this thesis is a formal notion of network risk, and a set of tools for measuring it

ScholarWorks @ Georgia State University

Information-theoretic Reasoning in Distributed and Autonomous Systems

Author: Cliff Oliver
Publication venue: Faculty of Engineering and Information Technologies, School of Aerospace, Mechanical and Mechatronic Engineering
Publication date: 01/01/2019

The increasing prevalence of distributed and autonomous systems is transforming decision making in industries as diverse as agriculture, environmental monitoring, and healthcare. Despite significant efforts, challenges remain in robustly planning under uncertainty. In this thesis, we present a number of information-theoretic decision rules for improving the analysis and control of complex adaptive systems. We begin with the problem of quantifying the data storage (memory) and transfer (communication) within information processing systems. We develop an information-theoretic framework to study nonlinear interactions within cooperative and adversarial scenarios, solely from observations of each agent's dynamics. This framework is applied to simulations of robotic soccer games, where the measures reveal insights into team performance, including correlations of the information dynamics to the scoreline. We then study the communication between processes with latent nonlinear dynamics that are observed only through a filter. By using methods from differential topology, we show that the information-theoretic measures commonly used to infer communication in observed systems can also be used in certain partially observed systems. For robotic environmental monitoring, the quality of data depends on the placement of sensors. These locations can be improved by either better estimating the quality of future viewpoints or by a team of robots operating concurrently. By robustly handling the uncertainty of sensor model measurements, we are able to present the first end-to-end robotic system for autonomously tracking small dynamic animals, with a performance comparable to human trackers. We then solve the issue of coordinating multi-robot systems through distributed optimisation techniques. These allow us to develop non-myopic robot trajectories for these tasks and, importantly, show that these algorithms provide guarantees for convergence rates to the optimal payoff sequence

Sydney eScholarship

Applications

Publication venue: Walter de Gruyter GmbH
Publication date: 07/12/2020

Pure OAI Repository

Model Order Reduction

Publication venue: Walter de Gruyter GmbH

An increasing complexity of models used to predict real-world systems leads to the need for algorithms to replace complex models with far simpler ones, while preserving the accuracy of the predictions. This three-volume handbook covers methods as well as applications. This third volume focuses on applications in engineering, biomedical engineering, computational physics and computer science

OAPEN Library

The Vessel Schedule Recovery Problem: Disruption management in liner shipping

Author: Brouer Berit Dangaard
Dirksen Jakob
Pisinger David
Plum Christian Edinger Munk
Vaaben Bo
Publication date: 01/01/2012

Online Research Database In Technology