Learning Reward Machines in Cooperative Multi-Agent Tasks
This paper presents a novel approach to Multi-Agent Reinforcement Learning
(MARL) that combines cooperative task decomposition with the learning of reward
machines (RMs) encoding the structure of the sub-tasks. The proposed method
helps deal with the non-Markovian nature of the rewards in partially observable
environments and improves the interpretability of the learnt policies required
to complete the cooperative task. The RMs associated with each sub-task are
learnt in a decentralised manner and then used to guide the behaviour of each
agent. By doing so, the complexity of a cooperative multi-agent problem is
reduced, allowing for more effective learning. The results suggest that our
approach is a promising direction for future research in MARL, especially in
complex environments with large state spaces and multiple agents.
Comment: Neuro-symbolic AI for Agent and Multi-Agent Systems Workshop at AAMAS'2
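The central object in this work, a reward machine, can be pictured as a small finite-state machine whose transitions fire on high-level events and emit rewards. The sketch below is illustrative only: the state names, events, and the button-then-goal sub-task are hypothetical, not taken from the paper.

```python
class RewardMachine:
    """Minimal reward machine sketch: a finite-state machine whose
    transitions fire on observed high-level events and emit rewards.
    The states and events used below are hypothetical."""

    def __init__(self, transitions, initial):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial

    def step(self, event):
        """Advance on an observed event; events with no matching
        transition leave the machine in place and yield zero reward."""
        key = (self.state, event)
        if key in self.transitions:
            self.state, reward = self.transitions[key]
            return reward
        return 0.0

# Hypothetical sub-task: press a button, then reach the goal.
rm = RewardMachine(
    {("u0", "button"): ("u1", 0.0),   # button pressed: progress, no reward yet
     ("u1", "goal"):   ("u2", 1.0)},  # goal after button: sub-task complete
    initial="u0",
)
```

Because reward depends on the machine state (i.e. on event history), the agent's effective reward is non-Markovian in the raw observation, which is exactly the situation the paper targets.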
Population-Based Reinforcement Learning for Combinatorial Optimization
Applying reinforcement learning (RL) to combinatorial optimization problems
is attractive as it removes the need for expert knowledge or pre-solved
instances. However, it is unrealistic to expect an agent to solve these (often
NP-)hard problems in a single shot at inference due to their inherent
complexity. Thus, leading approaches often implement additional search
strategies, from stochastic sampling and beam-search to explicit fine-tuning.
In this paper, we argue for the benefits of learning a population of
complementary policies, which can be simultaneously rolled out at inference. To
this end, we introduce Poppy, a simple theoretically grounded training
procedure for populations. Instead of relying on a predefined or hand-crafted
notion of diversity, Poppy induces an unsupervised specialization targeted
solely at maximizing the performance of the population. We show that Poppy
produces a set of complementary policies, and obtains state-of-the-art RL
results on three popular NP-hard problems: the traveling salesman (TSP), the
capacitated vehicle routing (CVRP), and 0-1 knapsack (KP) problems. On TSP
specifically, Poppy outperforms the previous state-of-the-art, dividing the
optimality gap by 5 while reducing the inference time by more than an order of
magnitude.
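The inference-time benefit of a population can be illustrated with a toy sketch: several deliberately different policies are rolled out on the same instance and only the best solution is kept. The nearest-neighbour "policies" below, parameterised by their start city, are hypothetical stand-ins for Poppy's learned neural policies.

```python
def tour_length(tour, dist):
    """Total length of a cyclic tour under a distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def greedy_policy(dist, start):
    """A simple nearest-neighbour 'policy', differentiated only by its
    start city (a stand-in for a trained policy in the population)."""
    n = len(dist)
    tour, visited = [start], {start}
    while len(tour) < n:
        last = tour[-1]
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[last][c])
        tour.append(nxt)
        visited.add(nxt)
    return tour

def population_rollout(dist, starts):
    """Roll out every policy in the population on the same instance and
    keep the best tour, mirroring the 'max over the population' objective."""
    tours = [greedy_policy(dist, s) for s in starts]
    return min(tours, key=lambda t: tour_length(t, dist))

# A tiny symmetric TSP instance (illustrative data).
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
```

By construction, the population's best tour is never worse than any single member's, which is the guarantee Poppy's training objective exploits.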
Induction of Subgoal Automata for Reinforcement Learning
In this work we present ISA, a novel approach for learning and exploiting
subgoals in reinforcement learning (RL). Our method relies on inducing an
automaton whose transitions are subgoals expressed as propositional formulas
over a set of observable events. A state-of-the-art inductive logic programming
system is used to learn the automaton from observation traces perceived by the
RL agent. The reinforcement learning and automaton learning processes are
interleaved: a new refined automaton is learned whenever the RL agent generates
a trace not recognized by the current automaton. We evaluate ISA in several
gridworld problems and show that it performs similarly to a method for which
automata are given in advance. We also show that the learned automata can be
exploited to speed up convergence through reward shaping and transfer learning
across multiple tasks. Finally, we analyze the running time and the number of
traces that ISA needs to learn an automaton, and the impact that the number of
observable events has on the learner's performance.
Comment: Preprint accepted for publication to the 34th AAAI Conference on Artificial Intelligence (AAAI-20)
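The interleaving of RL and automaton learning described above can be sketched as a counterexample-driven loop: whenever an episode produces a trace the current automaton does not recognize, a refined automaton is re-learned from all traces seen so far. The `TraceSet` class below is a toy stand-in for the ILP-induced subgoal automaton, and the episode interface is hypothetical.

```python
class TraceSet:
    """Toy stand-in for a learned automaton: accepts exactly the traces
    it was built from (the real ISA induces a subgoal automaton via an
    inductive logic programming system)."""
    def __init__(self, traces=()):
        self.known = {tuple(t) for t in traces}

    def accepts(self, trace):
        return tuple(trace) in self.known

def interleaved_learning(run_episode, episodes):
    """Sketch of ISA's loop: RL episodes are guided by the current
    automaton; an unrecognized trace triggers re-learning."""
    automaton, traces, relearns = TraceSet(), [], 0
    for _ in range(episodes):
        trace = run_episode(automaton)      # RL episode guided by automaton
        traces.append(trace)
        if not automaton.accepts(trace):    # counterexample found
            automaton = TraceSet(traces)    # re-learn from all traces so far
            relearns += 1
    return automaton, relearns
```

With a fixed environment that always yields the same trace, the automaton is re-learned exactly once and then recognizes all subsequent episodes.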
Collective adaptation through concurrent planning: the case of sustainable urban mobility
In this paper we address the challenges that impede collective adaptation in smart mobility systems by proposing a notion of ensembles. Ensembles enable systems with collective adaptability to be built as emergent aggregations of autonomous and self-adaptive agents. Adaptation in these systems is triggered by a run-time occurrence known as an issue. The novel aspect of our approach is that it allows agents affected by an issue in a smart mobility scenario to adapt collaboratively, with minimal impact on their own preferences, through an issue resolution process based on concurrent planning algorithms.
Learning and Generalization in Atari Games
Undergraduate thesis in computer science. Tutor: Anders Jonsson.
This thesis describes the design of agents that learn to play Atari games using the Arcade Learning Environment (ALE) framework to interact with them. The application of machine learning in video games, given its high complexity, is considered a bridge towards real-world domains such as robotics. The goal in Atari games is to achieve the highest possible score. To solve this task, reinforcement learning and search techniques are used. These algorithms outperform humans in 30 of the 61 games supported by ALE. Since humans are very good at making generalizations between games, special emphasis is given to evaluating how well an agent learns from multiple games simultaneously. These experiments usually result in a higher score for specific pairs of games. Moreover, some games tend to increase their own score when played alongside other games, while others help the games they are paired with perform better.
Resolution of concurrent planning problems using classical planning
Master's thesis, Master in Intelligent Interactive Systems. Tutor: Anders Jonsson.
In this work, we present new approaches for solving multiagent planning and temporal
planning problems. These planning forms are two types of concurrent planning,
where actions occur in parallel. The methods we propose rely on a compilation to
classical planning problems that can be solved using an off-the-shelf classical planner.
Then, the solutions can be converted back into multiagent or temporal solutions.
Our compilation for multiagent planning is able to generate concurrent actions that
satisfy a set of concurrency constraints. Furthermore, it avoids the exponential
blowup associated with concurrent actions, a problem faced by many current
multiagent planners. Incorporating similar ideas in temporal planning enables
us to generate temporal plans with simultaneous events, which most state-of-the-art
temporal planners cannot do.
In experiments, we compare our approaches to state-of-the-art planners. We show
that the methods based on transformations to classical planning obtain better
results on complex problems. In contrast, we also highlight some of the
drawbacks that these methods have for both multiagent and temporal planning.
We also illustrate how these methods can be applied to real world domains like the
smart mobility domain. In this domain, a group of vehicles and passengers must
self-adapt in order to reach their target positions. The adaptation process
consists of running a concurrent planning algorithm, whose behavior is then
evaluated.
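The intuition behind avoiding the exponential blowup can be sketched as follows: instead of materializing every joint action up front, agents commit to actions one at a time while pairwise concurrency constraints are checked incrementally. This is an illustrative toy, not the thesis's actual compilation to classical planning, and the greedy per-agent commitment below does not backtrack, which a complete encoding would need to handle.

```python
from itertools import product

def joint_actions_naive(agent_actions):
    """Exponential baseline: materialize every combination of
    per-agent actions (|A|^n joint actions for n agents)."""
    return list(product(*agent_actions))

def select_joint_action(agent_actions, compatible):
    """Sketch of the compilation idea: each agent commits to an action
    in turn, and pairwise concurrency constraints are checked
    incrementally, never enumerating the full joint action space.
    Greedy and without backtracking, so it is illustrative only."""
    chosen = []
    for actions in agent_actions:
        for a in actions:
            if all(compatible(a, b) for b in chosen):
                chosen.append(a)
                break
        else:
            return None  # no action of this agent is compatible
    return chosen
```

For example, with a hypothetical constraint that two agents cannot both "lift" at once, the second agent is forced to "wait" without ever building the four-element joint action set.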
Solving multiagent planning problems with concurrent conditional effects
Paper presented at the 33rd AAAI Conference on Artificial Intelligence (AAAI 2019), the 31st Innovative Applications of Artificial Intelligence Conference (IAAI 2019), and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI 2020), held January 27 to February 1, 2019, in Palo Alto, USA.
In this work we present a novel approach to solving concurrent multiagent planning problems in which several agents act in parallel. Our approach relies on a compilation from concurrent multiagent planning to classical planning, allowing us to use an off-the-shelf classical planner to solve the original multiagent problem. The solution can be directly interpreted as a concurrent plan that satisfies a given set of concurrency constraints, while avoiding the exponential blowup associated with concurrent actions. Our planner is the first to handle action effects that are conditional on what other agents are doing. Theoretically, we show that the compilation is sound and complete. Empirically, we show that our compilation can solve challenging multiagent planning problems that require concurrent actions.
This work has been supported by the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502). Anders Jonsson is partially supported by the grants TIN2015-67959 and PCIN-2017-082 of the Spanish Ministry of Science.
Solving concurrent multiagent planning using classical planning
Paper presented at the 6th Workshop on Distributed and Multi-Agent Planning (DMAP 2018), held during the 28th International Conference on Automated Planning and Scheduling, June 24 to 29, 2018, in Delft, the Netherlands.
In this work we present a novel approach to solving concurrent multiagent planning problems in which several agents act in parallel. Our approach relies on a compilation from concurrent multiagent planning to classical planning, allowing us to use an off-the-shelf classical planner to solve the original multiagent problem. The solution can be directly interpreted as a concurrent plan that satisfies a given set of concurrency constraints, while avoiding the exponential blowup associated with concurrent actions. Theoretically, we show that the compilation is sound and complete. Empirically, we show that our compilation can solve challenging multiagent planning problems that require concurrent actions.
This work has been supported by the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502)
CARPooL: Collective Adaptation using concuRrent PLanning
Paper presented at the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018), held in Stockholm, July 10 to 15, 2018.
In this paper we present the CARPooL demonstrator, an implementation of a Collective Adaptation Engine (CAE) that addresses the challenge of collective adaptation in the smart mobility domain. CARPooL resolves adaptation issues via concurrent planning techniques. It also allows users to interact with the provided solutions by adding new issues or analyzing the actions performed by each agent.
This work has been partially supported by the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502)