The robot routing problem for collecting aggregate stochastic rewards

Author: Dimitrova R.
Gavran I.
Majumdar R.
Prabhu V.S.
Soudjani S.E.Z.
Publication venue: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Publication date: 01/01/2017

We propose a new model for formalizing reward collection problems on graphs with dynamically generated rewards which may appear and disappear based on a stochastic model. The robot routing problem is modeled as a graph whose nodes are stochastic processes generating potential rewards over discrete time. The rewards are generated according to the stochastic process, but at each step, an existing reward disappears with a given probability. The edges in the graph encode the (unit-distance) paths between the rewards' locations. On visiting a node, the robot collects the accumulated reward at the node at that time, but traveling between the nodes takes time. The optimization question asks to compute an optimal (or epsilon-optimal) path that maximizes the expected collected rewards. We consider the finite- and infinite-horizon robot routing problems. For the finite horizon, the goal is to maximize the total expected reward, while for the infinite horizon we consider limit-average objectives. We study the computational and strategy complexity of these problems, establish NP-lower bounds, and show that optimal strategies require memory in general. We also provide an algorithm for computing epsilon-optimal infinite paths for arbitrary epsilon > 0.
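
The reward dynamics in this abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: each node adds an expected `rate` of new reward per step, each existing unit of reward survives a step with probability `1 - decay`, every node starts empty at time 0, and the robot follows a path fixed in advance. The names `expected_accumulated`, `expected_path_reward`, `rate`, and `decay` are hypothetical and do not come from the paper.

```python
def expected_accumulated(rate, decay, steps):
    """Expected reward at a node `steps` time steps after it was last emptied.

    Assumed dynamics: E[R_next] = (1 - decay) * E[R] + rate.
    """
    total = 0.0
    for _ in range(steps):
        total = (1.0 - decay) * total + rate
    return total


def expected_path_reward(path, graph, rate, decay):
    """Expected total reward collected along a fixed path with unit-time edges.

    `graph` maps each node to the set of its neighbours; the node at
    index t of `path` is visited at time t.
    """
    last_visit = {}  # node -> time of the most recent visit
    total = 0.0
    for t, node in enumerate(path):
        if t > 0 and node not in graph[path[t - 1]]:
            raise ValueError(f"no edge from {path[t - 1]!r} to {node!r}")
        # Nodes never visited before have been accumulating since time 0.
        steps = t - last_visit.get(node, 0)
        total += expected_accumulated(rate[node], decay[node], steps)
        last_visit[node] = t
    return total


# Hypothetical two-node instance.
graph = {"a": {"b"}, "b": {"a"}}
rate = {"a": 1.0, "b": 1.0}
decay = {"a": 0.5, "b": 0.5}
print(expected_path_reward(["a", "b", "a"], graph, rate, decay))  # 2.5
```

Because the assumed recursion is linear, expectations can be propagated directly for a fixed path. A strategy that reacts to observed rewards would need a different evaluation.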

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit

Dagstuhl Research Online Publication Server

White Rose Research Online

MPG.PuRe

Leicester Research Archive

Heuristics for the traveling repairman problem with profits

Author: Dewilde Thijs
Dirk Cattrysse
Sofie Coene
Spieksma Frits CR
Vansteenwegen Pieter
Publication venue: Department of Computer Science, University of Liverpool
Publication date: 01/01/2010

In the traveling repairman problem with profits, a repairman (also known as the server) visits a subset of nodes in order to collect time-dependent profits. The objective consists of maximizing the total collected revenue. We restrict our study to the case of a single server with nodes located in the Euclidean plane. We investigate properties of this problem, and we derive a mathematical model assuming that the number of visited nodes is known in advance. We describe a tabu search algorithm with multiple neighborhoods, and we test its performance by running it on instances based on TSPLIB. We conclude that the tabu search algorithm finds good-quality solutions fast, even for large instances.
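
As a rough illustration of the objective described in this abstract, the sketch below evaluates the revenue of one candidate route. The abstract only says that profits are time-dependent. The linear form used here (initial profit minus arrival time, at unit speed) is an assumption, and `route_revenue`, `coords`, and `profit` are hypothetical names. It is not the tabu search itself, only the kind of evaluation such a search would call repeatedly.

```python
import math


def route_revenue(depot, route, coords, profit):
    """Total revenue of visiting `route` in order, starting from `depot`.

    Assumption: the revenue collected at a node is its initial profit
    minus the arrival time, with travel at unit speed in the plane.
    """
    t = 0.0
    position = depot
    revenue = 0.0
    for node in route:
        t += math.dist(position, coords[node])  # Euclidean travel time
        revenue += profit[node] - t
        position = coords[node]
    return revenue


# Hypothetical instance with two nodes.
coords = {1: (3.0, 4.0), 2: (3.0, 0.0)}
profit = {1: 10.0, 2: 10.0}
print(route_revenue((0.0, 0.0), [1, 2], coords, profit))  # 6.0
print(route_revenue((0.0, 0.0), [2, 1], coords, profit))  # 10.0
```

Under this assumed profit model, both the visiting order and the choice of which nodes to visit change the total, which is why the server may visit only a subset of nodes.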

Ghent University Academic Bibliography

Can Differentiable Decision Trees Learn Interpretable Reward Functions?

Author: Brown Daniel S.
Kalra Akansha
Publication date: 22/06/2023

There is an increasing interest in learning reward functions that model human intent and human preferences. However, many frameworks use black-box learning methods that, while expressive, are difficult to interpret. We propose and evaluate a novel approach for learning expressive and interpretable reward functions from preferences using Differentiable Decision Trees (DDTs) for both low- and high-dimensional state inputs. We explore and discuss the viability of learning interpretable reward functions using DDTs by evaluating our algorithm on Cartpole, Visual Gridworld environments, and Atari games. We provide evidence that the tree structure of our learned reward function is useful in determining the extent to which a reward function is aligned with human preferences. We visualize the learned reward DDTs and find that they are capable of learning interpretable reward functions but that the discrete nature of the trees hurts the performance of reinforcement learning at test time. However, we also show evidence that using soft outputs (averaged over all leaf nodes) results in competitive performance when compared with larger capacity deep neural network reward functions.
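
The contrast between discrete and soft tree outputs mentioned in this abstract can be sketched with a generic soft decision tree. This is an assumed, simplified structure (sigmoid routing at each internal node, scalar rewards at the leaves), not the authors' architecture, and `soft_reward`, `hard_reward`, and the tuple layout are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def split_score(weights, bias, x):
    """Linear score of an internal node for input x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def soft_reward(tree, x):
    """Leaf rewards averaged by the probability of reaching each leaf.

    A tree is either a number (leaf reward) or a tuple
    (weights, bias, left_subtree, right_subtree).
    """
    if not isinstance(tree, tuple):
        return float(tree)
    weights, bias, left, right = tree
    p_left = sigmoid(split_score(weights, bias, x))
    return p_left * soft_reward(left, x) + (1.0 - p_left) * soft_reward(right, x)


def hard_reward(tree, x):
    """Follow the more likely branch at every node and return that leaf."""
    while isinstance(tree, tuple):
        weights, bias, left, right = tree
        tree = left if split_score(weights, bias, x) >= 0.0 else right
    return float(tree)


# Hypothetical one-split tree over a one-dimensional state.
tree = ([1.0], 0.0, 1.0, -1.0)
print(soft_reward(tree, [2.0]), hard_reward(tree, [2.0]))
```

The soft output varies smoothly with the input, while the hard output jumps between leaf values. This is one way to read the abstract's remark about the discrete nature of the trees.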

arXiv.org e-Print Archive

Advances in Reinforcement Learning

Publication venue: IntechOpen
Publication date: 20/04/2021

Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research in several fields associated with RL, which has been growing rapidly and producing a wide variety of learning algorithms for different applications. Across 24 chapters, it covers a broad variety of topics in RL and their application in autonomous systems. Some chapters provide a general overview of RL, while others focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine, and Industrial Logistics.

Directory of Open Access Books (DOAB)

Online planning for multi-robot active perception with self-organising maps

Author: Best G
Faigl J
Fitch R
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2018

© 2017, Springer Science+Business Media, LLC, part of Springer Nature. We propose a self-organising map (SOM) algorithm as a solution to a new multi-goal path planning problem for active perception and data collection tasks. We optimise paths for a multi-robot team that aims to maximally observe a set of nodes in the environment. The selected nodes are observed by visiting associated viewpoint regions defined by a sensor model. The key problem characteristics are that the viewpoint regions are overlapping polygonal continuous regions, each node has an observation reward, and the robots are constrained by travel budgets. The SOM algorithm jointly selects and allocates nodes to the robots and finds favourable sequences of sensing locations. The algorithm has a runtime complexity that is polynomial in the number of nodes to be observed and the magnitude of the relative weighting of rewards. We show empirically that the runtime is sublinear in the number of robots. We demonstrate feasibility for the active perception task of observing a set of 3D objects. The viewpoint regions consider sensing ranges and self-occlusions, and the rewards are measured as discriminability in the ensemble of shape functions feature space. Exploration objectives for online tasks where the environment is only partially known in advance are modelled by introducing goal regions in unexplored space. Online replanning is performed efficiently by adapting previous solutions as new information becomes available. Simulations were performed using a 3D point-cloud dataset from a real robot in a large outdoor environment. Our results show that the proposed methods enable multi-robot planning for online active perception tasks with continuous sets of candidate viewpoints and long planning horizons.

OPUS - University of Technology Sydney

Strategies for Scaleable Communication and Coordination in Multi-Agent (UAV) Systems

Author: Dantsker Or D.
Ponniah Jonathan
Publication venue: MDPI AG
Publication date: 01/09/2022

A system is considered in which agents (UAVs) must cooperatively discover interest-points (e.g., burning trees, geographical features) evolving over a grid. The objective is to locate as many interest-points as possible in the shortest possible time frame. There are two main problems: a control problem, where agents must collectively determine the optimal action, and a communication problem, where agents must share their local states and infer a common global state. Both problems become intractable when the number of agents is large. This survey/concept paper curates a broad selection of work in the literature pointing to a possible solution: a unified control/communication architecture within the framework of reinforcement learning. Two components of this architecture are locally interactive structure in the state-space, and hierarchical multi-level clustering for system-wide communication. The former mitigates the complexity of the control problem and the latter adapts to fundamental throughput constraints in wireless networks. The challenges of applying reinforcement learning to multi-agent systems are discussed. The role of clustering is explored in multi-agent communication. Research directions are suggested to unify these components.

SJSU ScholarWorks

Energy-Constrained Active Exploration Under Incremental-Resolution Symbolic Perception

Author: Haesaert Sofie
Kamale Disha
Vasile Cristian-Ioan
Publication date: 13/09/2023

In this work, we consider the problem of autonomous exploration in search of targets while respecting a fixed energy budget. The robot is equipped with an incremental-resolution symbolic perception module wherein the perception of targets in the environment improves as the robot's distance from targets decreases. We assume no prior information about the total number of targets, their locations, or their possible distribution within the environment. This work proposes a novel decision-making framework for the resulting constrained sequential decision-making problem by first converting it into a reward maximization problem on a product graph computed offline. It is then solved online as a Mixed-Integer Linear Program (MILP) where the knowledge about the environment is updated at each step, combining automata-based and MILP-based techniques. We demonstrate the efficacy of our approach with the help of a case study and present empirical evaluation in terms of expected regret. Furthermore, the runtime performance shows that online planning can be efficiently performed for moderately sized grid environments.

arXiv.org e-Print Archive