
    Planning spatial networks with Monte Carlo tree search

    We tackle the problem of goal-directed graph construction: given a starting graph, finding a set of edges whose addition maximally improves a global objective function. This problem emerges in many transportation and infrastructure networks that are of critical importance to society. We identify two significant shortcomings of present reinforcement learning methods: their exclusive focus on topology to the detriment of spatial characteristics (which are known to influence the growth and density of links), as well as the rapid growth in action spaces and model training costs. Our formulation as a deterministic Markov decision process allows us to adopt the Monte Carlo tree search framework, an artificial intelligence decision-time planning method. We propose improvements over the standard upper confidence bounds for trees (UCT) algorithm for this family of problems that address their single-agent nature, the trade-off between the cost of edges and their contribution to the objective, and an action space linear in the number of nodes. Our approach yields substantial improvements over UCT for increasing the efficiency and attack resilience of synthetic networks and real-world Internet backbone and metro systems, while using a wall-clock time budget similar to other search-based algorithms. We also demonstrate that our approach scales to significantly larger networks than previous reinforcement learning methods, since it does not require training a model.
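
    As a concrete reference point for the UCT baseline that the proposed improvements build on, the sketch below implements the standard selection rule in Python; the Node structure and exploration constant are illustrative, not the authors' exact implementation.

        import math

        class Node:
            """Search tree node; each child corresponds to one candidate edge addition."""
            def __init__(self, parent=None):
                self.parent = parent
                self.children = []
                self.visits = 0
                self.total_reward = 0.0

            def ucb_score(self, c=1.41):
                # Standard UCB1: mean reward plus an exploration bonus that
                # shrinks as this child is visited more often.
                if self.visits == 0:
                    return float("inf")
                exploit = self.total_reward / self.visits
                explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
                return exploit + explore

        def select(node):
            """Descend from the root by repeatedly taking the best-scoring child."""
            while node.children:
                node = max(node.children, key=lambda ch: ch.ucb_score())
            return node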

    Solving Graph-based Public Good Games with Tree Search and Imitation Learning

    Public goods games represent insightful settings for studying incentives for individual agents to make contributions that, while costly for each of them, benefit the wider society. In this work, we adopt the perspective of a central planner with a global view of a network of self-interested agents and the goal of maximizing some desired property in the context of a best-shot public goods game. Existing algorithms for this known NP-complete problem find solutions that are sub-optimal and cannot optimize for criteria other than social welfare. In order to efficiently solve public goods games, our proposed method directly exploits the correspondence between equilibria and the Maximal Independent Set (mIS) structural property of graphs. In particular, we define a Markov Decision Process which incrementally generates an mIS, and adopt a planning method to search for equilibria, outperforming existing methods. Furthermore, we devise a graph imitation learning technique that uses demonstrations of the search to obtain a graph neural network parametrized policy which quickly generalizes to unseen game instances. Our evaluation results show that this policy is able to reach 99.5% of the performance of the planning method while being three orders of magnitude faster to evaluate on the largest graphs tested. The methods presented in this work can be applied to a large class of public goods games of potentially high societal impact and, more broadly, to other graph combinatorial optimization problems.
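
    To make the equilibrium-to-mIS correspondence concrete, here is a minimal sketch of incrementally generating a maximal independent set, where each step plays the role of one MDP action; the min-degree greedy rule is a classic heuristic used here as a placeholder for the learned policy.

        import networkx as nx

        def incremental_mis(graph):
            chosen, excluded = set(), set()
            while True:
                eligible = [v for v in graph if v not in chosen and v not in excluded]
                if not eligible:
                    break
                # Min-degree greedy stands in for the policy's node choice.
                v = min(eligible, key=lambda u: graph.degree(u))  # one MDP "action"
                chosen.add(v)
                excluded.update(graph.neighbors(v))
            return chosen  # a maximal independent set, i.e. an equilibrium profile

        g = nx.erdos_renyi_graph(20, 0.2, seed=0)
        print(sorted(incremental_mis(g)))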

    RLQ: Workload Allocation With Reinforcement Learning in Distributed Queues

    Distributed workload queues are nowadays widely used due to their significant advantages in terms of decoupling, resilience, and scaling. Task allocation to worker nodes in distributed queue systems is typically simplistic (e.g., Least Recently Used) or uses hand-crafted heuristics that require task-specific information (e.g., task resource demands or expected time of execution). When such task information is not available and worker node capabilities are not homogeneous, existing placement strategies can lead to unnecessarily long execution times and high usage costs. In this work, we formulate the task allocation problem in the Markov Decision Process framework, in which an agent assigns tasks to an available resource and receives a numerical reward signal upon task completion. Our adaptive, learning-based task allocation solution, Reinforcement Learning based Queues (RLQ), is implemented and integrated with the popular Celery task queuing system for Python. We compare RLQ against traditional solutions using both synthetic and real workload traces. On average, using synthetic workloads, RLQ reduces the execution cost by approximately 70%, the execution time by a factor of at least 3×, and the waiting time by almost 7×. Using real traces, we observe an improvement of about 20% for execution cost, around 70% for execution time, and a reduction of approximately 20× in waiting time. We also compare RLQ with a strategy inspired by E-PVM, a state-of-the-art solution used in Google's Borg cluster manager, and show that we outperform it in five out of six scenarios.
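
    The MDP framing above can be illustrated with a minimal bandit-style sketch in which an agent assigns each task to a worker and updates a value estimate from the reward observed at completion; the worker names, state encoding, and reward definition are illustrative, and the real RLQ system integrates with Celery and uses richer state features.

        import random
        from collections import defaultdict

        class AllocatorAgent:
            """Bandit-style simplification of the task-allocation decision process."""
            def __init__(self, workers, epsilon=0.1, alpha=0.1):
                self.workers = workers
                self.epsilon = epsilon          # exploration rate
                self.alpha = alpha              # learning rate
                self.q = defaultdict(float)     # value estimate per (state, worker)

            def assign(self, state):
                # Epsilon-greedy: usually pick the best-valued worker, sometimes explore.
                if random.random() < self.epsilon:
                    return random.choice(self.workers)
                return max(self.workers, key=lambda w: self.q[(state, w)])

            def on_completion(self, state, worker, reward):
                # The reward arrives when the task finishes, e.g. the negated
                # execution cost plus waiting time of the completed task.
                key = (state, worker)
                self.q[key] += self.alpha * (reward - self.q[key])

        agent = AllocatorAgent(workers=["w1", "w2", "w3"])
        w = agent.assign(state="short_queue")
        agent.on_completion("short_queue", w, reward=-1.2)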

    Learning to Optimise Networked Systems

    Many systems based on relations between connected entities find a natural representation in graphs, which has led to the development of mathematical and statistical tools for understanding their structure and the phenomena that take place over them. There is comparatively little work on optimising the outcome of processes on graphs with respect to a given objective function. Problems of this nature are combinatorial optimisation tasks, which are challenging for systems beyond a trivial size due to the rapid growth of the solution space. Traditional ways of approaching such problems use either heavily tailored, objective-specific approaches or generic metaheuristics. Machine learning and decision-making algorithms have begun to emerge as an alternative data-driven paradigm for navigating the search space, allowing for generalisation between related problems and effective scaling to larger instances than those seen during training. This thesis contributes problem formulations, solution methods, and learning representations for approximately solving several graph combinatorial optimisation problems. We first address constructing a graph for optimising a structural objective such as resilience or efficiency. We propose a deep reinforcement learning approach that uses graph neural networks as a key component for generalisation, and a complementary extension of Monte Carlo Tree Search that is suitable for spatial networks. Next, we study public goods games over networks, discussing an approach for effective imitation learning of policies mapping graph states to node selection actions. Finally, we focus on data-driven routing of flows over graphs, proposing a novel graph neural network architecture for this family of problems. Our evaluation results show significant advantages for the proposed approaches over prior methods, representing a contribution to the broader area of machine learning for combinatorial optimisation. The research discussed in this thesis may find meaningful applications in domains as diverse as structural engineering, urban planning, operations research, and computer networking.

    Goal-directed graph construction using reinforcement learning

    Graphs can be used to represent and reason about systems, and a variety of metrics have been devised to quantify their global characteristics. However, little is currently known about how to construct a graph or improve an existing one given a target objective. In this work, we formulate the construction of a graph as a decision-making process in which a central agent creates topologies by trial and error and receives rewards proportional to the value of the target objective. By means of this conceptual framework, we propose an algorithm based on reinforcement learning and graph neural networks to learn graph construction and improvement strategies. Our core case study focuses on robustness to failures and attacks, a property relevant for the infrastructure and communication networks that power modern society. Experiments on synthetic and real-world graphs show that this approach can outperform existing methods while being cheaper to evaluate. It also allows generalization to out-of-sample graphs, as well as to larger out-of-distribution graphs in some cases. The approach is applicable to the optimization of other global structural properties of graphs.
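
    A minimal sketch of the decision process described here appears below, using global efficiency as a stand-in objective and a one-step greedy choice as a placeholder for the learned policy; the reward for adding an edge is the resulting change in the objective.

        import networkx as nx

        def construct(graph, budget, objective=nx.global_efficiency):
            # Add `budget` edges one at a time; the one-step reward for an
            # edge is the change in the objective that its addition produces.
            for _ in range(budget):
                candidates = [(u, v) for u in graph for v in graph
                              if u < v and not graph.has_edge(u, v)]
                if not candidates:
                    break
                base = objective(graph)
                rewards = {}
                for u, v in candidates:
                    graph.add_edge(u, v)
                    rewards[(u, v)] = objective(graph) - base
                    graph.remove_edge(u, v)
                graph.add_edge(*max(rewards, key=rewards.get))
            return graph

        g = construct(nx.path_graph(8), budget=3)
        print(sorted(g.edges))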

    Graph Reinforcement Learning for Operator Selection in the ALNS Metaheuristic

    ALNS (Adaptive Large Neighbourhood Search) is a popular metaheuristic with renowned efficiency in solving combinatorial optimisation problems. However, despite 16 years of intensive research into ALNS, whether the embedded adaptive layer can efficiently select operators to improve the incumbent remains an open question. In this work, we formulate the choice of operators as a Markov Decision Process, and propose a practical approach based on Deep Reinforcement Learning and Graph Neural Networks. The results show that our proposed method achieves better performance than the classic ALNS adaptive layer because the choice of operator is conditioned on the current solution. We also discuss important considerations, such as the size of the operator portfolio and the impact of the choice of operator scales. Notably, our approach can also save the significant time and labour costs of handcrafting problem-specific operator portfolios.
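
    The formulation can be pictured with a minimal sketch of an ALNS loop in which the operator applied at each iteration is chosen conditioned on the current solution; the toy operator, cost function, and random placeholder policy below are illustrative, standing in for the paper's learned agent and a problem-specific operator portfolio.

        import random

        def alns(initial, cost, operators, policy, iterations=200):
            best = current = initial
            for _ in range(iterations):
                op = policy(current, operators)       # state-conditioned operator choice
                candidate = op(current)               # apply a destroy-and-repair move
                if cost(candidate) < cost(current):   # simple improve-only acceptance
                    current = candidate
                    if cost(current) < cost(best):
                        best = current
            return best

        def decrement_op(solution):
            # Toy operator: lower one random element (never below zero).
            solution = solution[:]
            i = random.randrange(len(solution))
            solution[i] = max(0, solution[i] - 1)
            return solution

        random_policy = lambda state, ops: random.choice(ops)   # placeholder policy
        print(alns([5, 3, 8], cost=sum, operators=[decrement_op], policy=random_policy))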

    Quantifying the Relationships between Everyday Objects and Emotional States through Deep Learning Based Image Analysis Using Smartphones

    There has been increasing interest in the problem of inferring the emotional states of individuals using sensor and user-generated information as diverse as GPS traces, social media data, and smartphone interaction patterns. One aspect that has received little attention is the use of visual context information extracted from the surroundings of individuals and how they relate to it. In this paper, we present an observational study of the relationships between the emotional states of individuals and the objects present in their visual environment, automatically extracted from smartphone images using deep learning techniques. We developed MyMood, a smartphone application that allows users to periodically log their emotional state together with pictures from their everyday lives, while passively gathering sensor measurements. We conducted an in-the-wild study with 22 participants and collected 3,305 mood reports with photos. Our findings show context-dependent associations between the objects surrounding individuals and self-reported emotional state intensities. The potential applications of this work are many, from the design of interior and outdoor spaces to the development of intelligent applications for positive behavioral intervention, and more generally to the support of computational psychology studies.
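
    As an illustration of the kind of visual-context extraction such a study relies on, the sketch below uses a pretrained image classifier to obtain object labels from a photo; the model choice, file path, and top-5 read-out are assumptions for the example and not the authors' exact pipeline.

        import torch
        from PIL import Image
        from torchvision import models

        weights = models.ResNet18_Weights.DEFAULT
        model = models.resnet18(weights=weights)
        model.eval()

        image = Image.open("photo.jpg")                 # hypothetical user photo
        batch = weights.transforms()(image).unsqueeze(0)
        with torch.no_grad():
            probs = model(batch).softmax(dim=1)[0]
        top = torch.topk(probs, k=5)                    # five most likely objects
        for p, i in zip(top.values, top.indices):
            print(weights.meta["categories"][i.item()], round(p.item(), 3))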

    Trust-based Consensus in Multi-Agent Reinforcement Learning Systems

    An often neglected issue in multi-agent reinforcement learning (MARL) is the potential presence of unreliable agents in the environment whose deviations from expected behavior can prevent a system from accomplishing its intended tasks. In particular, consensus is a fundamental problem underpinning cooperative distributed multi-agent systems. Consensus requires different agents, situated in a decentralized communication network, to reach an agreement out of a set of initial proposals that they put forward. Learning-based agents should adopt a protocol that allows them to reach consensus despite the presence of one or more unreliable agents in the system. This paper investigates the problem of unreliable agents in MARL, considering consensus as a case study. Echoing established results in the distributed systems literature, our experiments show that even a moderate fraction of such agents can greatly impact the ability to reach consensus in a networked environment. We propose Reinforcement Learning-based Trusted Consensus (RLTC), a decentralized trust mechanism in which agents can independently decide which neighbors to communicate with. We empirically demonstrate that our trust mechanism is able to deal with unreliable agents effectively, as evidenced by higher consensus success rates.
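
    The trust idea can be sketched as follows, with each agent averaging proposals only over neighbors it still trusts and decreasing trust in neighbors whose proposals deviate strongly from its own; the threshold, learning rate, and toy unreliable agent are illustrative rather than the authors' exact protocol.

        import random

        class Agent:
            def __init__(self, proposal, neighbors):
                self.value = proposal
                self.trust = {n: 1.0 for n in neighbors}    # start by trusting everyone

            def step(self, proposals, threshold=0.5, lr=0.2):
                # Average proposals over sufficiently trusted neighbors only.
                trusted = [n for n in self.trust if self.trust[n] >= threshold]
                if trusted:
                    self.value = sum(proposals[n] for n in trusted) / len(trusted)
                # Lower trust in neighbors whose proposals deviate from our value.
                for n, t in self.trust.items():
                    self.trust[n] = max(0.0, t - lr * abs(proposals[n] - self.value))

        # Fully connected toy network; agent 3 is unreliable and broadcasts noise.
        agents = {i: Agent(random.random(), [j for j in range(4) if j != i])
                  for i in range(4)}
        for _ in range(20):
            proposals = {i: (random.random() if i == 3 else a.value)
                         for i, a in agents.items()}
            for a in agents.values():
                a.step(proposals)
        print({i: round(a.value, 3) for i, a in agents.items()})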