47 research outputs found

    Multi Vehicle Trajectory Planning On Road Networks

    When multiple autonomous vehicles work in a shared space, such as in a surface mine or warehouse, they often travel along specified paths through a static road network. Although these vehicles’ actions and performance are coupled, their motion is often planned myopically or omits cooperation beyond avoiding collisions reactively. More desirable solutions could be achieved by coordinating and planning actions ahead of time. To make multi-vehicle systems more productive and efficient, the thesis introduces planning methods that can optimise for travel time, energy consumption, and trajectory smoothness. Vehicle motion is coordinated by using motion models that combine all trajectories and avoid collisions. Mathematical programming is then used to find optimised solutions. The proposed methods are shown to significantly reduce solution costs compared to an approach based on common driving practices. As the number of vehicles and the interactions between them increase, the number of solutions grows exponentially, making finding a solution computationally challenging. A major aim here was to find high-quality solutions within practical computation times. To achieve this, techniques were developed that exploit the structure of the problems. These include a heuristic algorithm that scales better with problem size and is combined with the mathematical programming techniques to reduce their complexity. These were found to significantly reduce computation times, trading off marginal solution quality.

    Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

    The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments, to training 3-d humanoids in difficult team coordination tasks. A signature aim of our group is to use the resources and expertise made available to us at DeepMind in deep reinforcement learning to explore multi-agent systems in complex environments and use these benchmarks to advance our understanding. Here, we summarise the recent work of our team and present a taxonomy that we feel highlights many important open challenges in multi-agent research. Comment: Published in AI Communications 202

    Many-agent Reinforcement Learning

    Multi-agent reinforcement learning (RL) addresses the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGo series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents ($N \gg 2$), which I call many-agent reinforcement learning (MARL) problems. (I use the word "MARL" to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as "Multi-Agent RL" by default.) In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills the research gap that most of the existing work either fails to cover the recent advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm, $\alpha^\alpha$-Rank, for many-agent systems. The critical advantage of $\alpha^\alpha$-Rank is that it can compute the solution concept of $\alpha$-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as Nash equilibrium, whose computation is known to be PPAD-hard even in two-player cases. $\alpha^\alpha$-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm, mean-field MARL, for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling the behavioural diversity in meta-games and developing algorithms that are guaranteed to enlarge diversity during training. The proposed metric based on determinantal point processes serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve huge impacts in the real physical world, beyond video games.

    Agoric computation: trust and cyber-physical systems

    In the past two decades, advances in miniaturisation and economies of scale have led to the emergence of billions of connected components that have provided both a spur and a blueprint for the development of smart products acting in specialised environments which are uniquely identifiable, localisable, and capable of autonomy. Adopting the computational perspective of multi-agent systems (MAS) as a technological abstraction, married with the engineering perspective of cyber-physical systems (CPS), has provided fertile ground for designing, developing and deploying software applications in smart automated contexts such as manufacturing, power grids, avionics, healthcare and logistics, capable of being decentralised, intelligent, reconfigurable, modular, flexible, robust, adaptive and responsive. Current agent technologies are, however, ill suited for information-based environments, making it difficult to formalise and implement multi-agent systems based on inherently dynamical functional concepts such as trust and reliability, which present special challenges when scaling from small to large systems of agents. To overcome such challenges, it is useful to adopt a unified approach which we term agoric computation, integrating logical, mathematical and programming concepts towards the development of agent-based solutions based on recursive, compositional principles, where smaller systems feed via directed information flows into larger hierarchical systems that define their global environment. Considering information as an integral part of the environment naturally defines a web of operations where components of a system are wired in some way and each set of inputs and outputs is allowed to carry some value. These operations are stateless abstractions and procedures that act on stateful cells that accumulate partial information, and it is possible to compose such abstractions into higher-level ones, using a publish-and-subscribe interaction model that keeps track of update messages between abstractions and values in the data. In this thesis we review the logical and mathematical basis of such abstractions and take steps towards the software implementation of agoric modelling as a framework for simulation and verification of the reliability of increasingly complex systems. We report on experimental results related to a few select applications, such as stigmergic interaction in mobile robotics, integrating raw data into agent perceptions, trust and trustworthiness in orchestrated open systems, computing the epistemic cost of trust when reasoning in networks of agents seeded with contradictory information, and trust models for distributed ledgers in the Internet of Things (IoT), and provide a roadmap for future developments of our research.

    State Feedback Sliding Mode Control of Complex Systems with Applications

    This thesis concerns the development of robust nonlinear control design for complex systems, including nonholonomic systems and large-scale systems, using sliding mode control (SMC) techniques under the assumption that all system state variables are accessible for design. The main developments in this thesis include: 1) The concept of generalised regular form and the design of a novel sliding function. The mathematical definition of generalised regular form is proposed for the first time. It is an extension of the classical regular form, which makes SMC applicable to a wider class of nonlinear systems. A novel sliding function design, which is based on the global implicit function theorem, is proposed to guarantee unique sliding mode dynamics. 2) The development of decentralised SMC for large-scale interconnected systems. For systems with uncertain interconnections which possess the superposition property, a decentralised control scheme is presented to counteract the effect of the uncertainty by using bounds on uncertainties and interconnections. The bounds used in the design are nonlinear functions instead of constant, linear or polynomial functions. The design strategy has also been expanded to a fully nonlinear case for interconnected systems in the generalised regular form. 3) Robust decentralised SMC for a class of nonlinear systems with uncertainties in input distribution. Systems with uncertainties in input distribution pose significant challenges. A novel method is proposed to deal with such uncertainties for a class of nonlinear interconnected systems. The designed decentralised SMC enhances the robustness of the controlled systems. This thesis also provides case studies of three applications for the proposed approaches. The existence of the generalised regular form is verified in the trajectory tracking control of a wheeled mobile robot (WMR) system. Both simulations and experiments on the WMR are given to demonstrate the validity and effectiveness of the generalised regular form-based SMC design. A continuous stirred tank reactor (CSTR) system and a longitudinal vehicle-following system are used to test the proposed decentralised SMC schemes. An expanded vehicle-following system with both longitudinal and lateral controllers has been developed to demonstrate the robust control design for systems with uncertainties in input distribution.