2,784 research outputs found

    A Particle Swarm Based Algorithm for Functional Distributed Constraint Optimization Problems

    Distributed Constraint Optimization Problems (DCOPs) are a widely studied constraint handling framework. The objective of a DCOP algorithm is to optimize a global objective function that can be described as the aggregation of a number of distributed constraint cost functions. In a DCOP, each of these functions is defined by a set of discrete variables. However, in many applications, such as target tracking or sleep scheduling in sensor networks, continuous-valued variables are better suited than discrete ones. Considering this, Functional DCOPs (F-DCOPs) have been proposed, which can explicitly model problems containing continuous variables. Nevertheless, the state-of-the-art F-DCOP approaches experience onerous memory or computation overhead. To address this issue, we propose a new F-DCOP algorithm, namely Particle Swarm Based F-DCOP (PFD), which is inspired by a meta-heuristic, Particle Swarm Optimization (PSO). Although PSO has been successfully applied to many continuous optimization problems, its potential has not been utilized in F-DCOPs. To be exact, PFD devises a distributed method of solution construction while significantly reducing the computation and memory requirements. Moreover, we theoretically prove that PFD is an anytime algorithm. Finally, our empirical results indicate that PFD outperforms the state-of-the-art approaches in terms of solution quality and computation overhead.
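
Illustrative sketch (not the authors' PFD algorithm): the abstract builds on Particle Swarm Optimization, so the block below shows a plain, centralised PSO minimising a sum of two continuous constraint costs. The function names, the example cost, and the parameter values (`w`, `c1`, `c2`) are assumptions chosen for illustration only. The recorded best cost never gets worse over iterations, which loosely mirrors the anytime property mentioned above.

```python
import random


def example_cost(x):
    # Hypothetical aggregate of two continuous "constraint" costs.
    return (x[0] - x[1]) ** 2 + (x[1] - 2.0) ** 2


def pso_minimize(cost, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO; returns (best point, best cost, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]  # best cost seen so far, one entry per iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history
```

A call such as `pso_minimize(example_cost, [(-5.0, 5.0), (-5.0, 5.0)])` returns the best point found, its cost, and the per-iteration history. A distributed variant would need agents to exchange particle evaluations, which this sketch does not attempt.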

    Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

    In many real-world settings, a team of agents must coordinate its behaviour while acting in a decentralised fashion. At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a mixing network that estimates joint action-values as a monotonic combination of per-agent values. We structurally enforce that the joint-action value is monotonic in the per-agent values, through the use of non-negative weights in the mixing network, which guarantees consistency between the centralised and decentralised policies. To evaluate the performance of QMIX, we propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning. We evaluate QMIX on a challenging set of SMAC scenarios and show that it significantly outperforms existing multi-agent reinforcement learning methods. Comment: Extended version of the ICML 2018 conference paper (arXiv:1803.11485)
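
Illustrative sketch (a simplification, not the authors' implementation): the abstract states that monotonicity is enforced through non-negative mixing weights. The block below shows one way that idea can look in plain Python: state-dependent weights are passed through `abs`, and the hidden layer uses a monotonically increasing activation, so the mixed value cannot decrease when any per-agent value increases. The layer sizes, the random linear maps standing in for learned hypernetworks, and all function names are assumptions.

```python
import math
import random


def elu(x):
    # Monotonically increasing activation.
    return x if x > 0 else math.exp(x) - 1.0


def make_mixer(n_agents, state_dim, hidden=4, seed=0):
    """Random (untrained) linear maps from the state to mixing weights and biases."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "n_agents": n_agents,
        "hidden": hidden,
        "hw1": mat(state_dim, n_agents * hidden),  # produces first-layer weights
        "hb1": mat(state_dim, hidden),             # produces first-layer biases
        "hw2": mat(state_dim, hidden),             # produces second-layer weights
        "hb2": mat(state_dim, 1),                  # produces the final bias
    }


def linear(state, m):
    # Row vector `state` times matrix `m`.
    return [sum(s * row[j] for s, row in zip(state, m)) for j in range(len(m[0]))]


def mix(agent_qs, state, p):
    """Combine per-agent values into one joint value, monotonic in each input."""
    n, h = p["n_agents"], p["hidden"]
    w1 = [abs(v) for v in linear(state, p["hw1"])]  # non-negative, flattened n x h
    b1 = linear(state, p["hb1"])                    # biases may take any sign
    hid = [elu(sum(agent_qs[i] * w1[i * h + j] for i in range(n)) + b1[j])
           for j in range(h)]
    w2 = [abs(v) for v in linear(state, p["hw2"])]  # non-negative
    b2 = linear(state, p["hb2"])[0]
    return sum(hid[j] * w2[j] for j in range(h)) + b2
```

Because every path from a per-agent value to the output passes only through non-negative weights and an increasing activation, each agent can pick its own greedy action without lowering the joint value, which is the consistency property the abstract refers to.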

    Topics in Electromobility and Related Applications

    In this thesis, we mainly discuss four topics on Electric Vehicles (EVs) in the context of smart grid and smart transportation systems. The first topic focuses on investigating the impacts of different EV charging strategies on the grid. In Chapter 3, we present a mathematical framework for formulating different EV charging problems and investigate a range of typical EV charging strategies with respect to different actors in the power system. Using this framework, we compare the performances of all charging strategies on a common power system simulation testbed, highlighting in each case positive and negative characteristics. The second topic is concerned with the applications of EVs with Vehicle-to-Grid (V2G) capabilities. In Chapter 4, we apply certain ideas from cooperative control techniques to two V2G applications in different scenarios. In the first scenario, we harness the power of V2G technologies to reduce current imbalance in a three-phase power network. In the second scenario, we design a fair V2G programme to optimally determine the power dispatch from EVs in a microgrid scenario. The effectiveness of the proposed algorithms is verified through a variety of simulation studies. The third topic discusses an optimal distributed energy management strategy for power generation in a microgrid scenario. In Chapter 5, we adapt the synchronised version of the Additive-Increase-Multiplicative-Decrease (AIMD) algorithms to minimise a cost utility function related to the power generation costs of distributed resources. We investigate the AIMD based strategy through simulation studies and we illustrate that the performance of the proposed method is very close to the full communication centralised case. Finally, we show that this idea can be easily extended to another application including thermal balancing requirements. The last topic focuses on a new design of the Speed Advisory System (SAS) for optimising both conventional and electric vehicle networks.
In Chapter 6, we demonstrate that, by using simple ideas, one can design an effective SAS for electric vehicles to minimise group energy consumption in a distributed and privacy-aware manner; Matlab simulations are given to illustrate the effectiveness of this approach. Further, we extend this idea to conventional vehicles in Chapter 7 and we show that by using some of the ideas introduced in Chapter 6, group emissions of conventional vehicles can also be minimised under the same SAS framework. SUMO simulations and Hardware-In-the-Loop (HIL) tests involving real vehicles are given to illustrate user acceptability and ease of deployment. Finally, note that many applications in this thesis are based on the theories of a class of nonlinear iterative feedback systems. For completeness, we present a rigorous proof of global convergence of consensus of such systems in Chapter 2.
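
Illustrative sketch (basic mechanism only, not the thesis's cost-minimising variant): the abstract refers to the synchronised Additive-Increase-Multiplicative-Decrease (AIMD) algorithm. The block below shows the textbook form, in which every agent raises its share by a fixed amount each step and all agents scale back together whenever the aggregate exceeds a shared capacity. The function name, parameters, and numbers are assumptions chosen for illustration.

```python
def synchronised_aimd(x0, capacity, alpha=1.0, beta=0.5, steps=200):
    """Run synchronised AIMD and return the final shares and the full trace."""
    x = list(x0)
    trace = []
    for _ in range(steps):
        # Additive increase: each agent probes for more resource.
        x = [xi + alpha for xi in x]
        # Capacity event: every agent is notified and backs off at the same time.
        if sum(x) > capacity:
            x = [beta * xi for xi in x]
        trace.append(x[:])
    return x, trace
```

With identical `alpha` and `beta` for all agents, each back-off shrinks the gaps between agents by the factor `beta`, so the shares equalise over time without the agents exchanging their individual states. For example, `synchronised_aimd([0.0, 5.0, 10.0], capacity=30.0)` starts from unequal shares and ends with nearly equal ones.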

    Information-Theoretic Control of Multiple Sensor Platforms

    This thesis is concerned with the development of a consistent, information-theoretic basis for understanding coordination and cooperation in decentralised multi-sensor multi-platform systems. Autonomous systems composed of multiple sensors and multiple platforms potentially have significant importance in applications such as defence, search and rescue, mining or intelligent manufacturing. However, the effective use of multiple autonomous systems requires that an understanding be developed of the mechanisms of coordination and cooperation between component systems in pursuit of a common goal. A fundamental, quantitative understanding of coordination and cooperation between decentralised autonomous systems is the main goal of this thesis. This thesis focuses on the problem of coordination and cooperation for teams of autonomous systems engaged in information gathering and data fusion tasks. While this is a subset of the general cooperative autonomous systems problem, it still encompasses a range of possible applications in picture compilation, navigation, searching and map building problems. The great advantage of restricting the domain of interest in this way is that an underlying mathematical model for coordination and cooperation can be based on the use of information-theoretic models of platform and sensor abilities. The information-theoretic approach builds on the established principles and architecture previously developed for decentralised data fusion systems. In the decentralised control problem addressed in this thesis, each platform and sensor system is considered to be a distinct decision maker with an individual information-theoretic utility measure capturing both local objectives and the inter-dependencies among the decisions made by other members of the team. Together these information-theoretic utilities constitute the team objective.
The key contributions of this thesis lie in the quantification and study of cooperative control between sensors and platforms using information as a common utility measure. In particular:
* The problem of information gathering is formulated as an optimal control problem by identifying formal measures of information with utility or pay-off.
* An information-theoretic utility model of coupling and coordination between decentralised decision makers is elucidated. This is used to describe how the information gathering strategies of a team of autonomous systems are coupled.
* Static and dynamic information structures for team members are defined. It is shown that the use of static information structures can lead to efficient, although sub-optimal, decentralised control strategies for the team.
* Significant examples in decentralised control of a team of sensors are developed. These include the multi-vehicle multi-target bearings-only tracking problem, and the area coverage or exploration problem for multiple vehicles. These examples demonstrate the range of non-trivial problems to which the theory in this thesis can be employed.

    Optimal speed trajectory and energy management control for connected and automated vehicles

    Connected and automated vehicles (CAVs) emerge as a promising solution to improve urban mobility, safety, energy efficiency, and passenger comfort with the development of communication technologies, such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I). This thesis proposes several control approaches for CAVs with electric powertrains, including hybrid electric vehicles (HEVs) and battery electric vehicles (BEVs), with the main objective of improving energy efficiency by optimising the vehicle speed trajectory and energy management system. By type of vehicle control, these methods can be categorised into three main scenarios: optimal energy management for a single CAV (single-vehicle), energy-optimal strategy for the vehicle following scenario (two-vehicle), and optimal autonomous intersection management for CAVs (multiple-vehicle). The first part of this thesis is devoted to the optimal energy management for a single automated series HEV with consideration of an engine start-stop system (SSS) under battery charge sustaining operation. A heuristic hysteresis power threshold strategy (HPTS) is proposed to optimise the fuel economy of an HEV with SSS and extra penalty fuel for engine restarts. By a systematic tuning process, the overall control performance of HPTS can be fully optimised for different vehicle parameters and driving cycles. In the second part, two energy-optimal control strategies via a model predictive control (MPC) framework are proposed for the vehicle following problem. To forecast the behaviour of the preceding vehicle, a neural network predictor is utilised and incorporated into a nonlinear MPC method, whose fuel and computational efficiency is verified through numerical comparisons with a practical adaptive cruise control strategy and an impractical optimal control method.
A robust MPC (RMPC) via linear matrix inequality (LMI) is also utilised to deal with the uncertainties existing in V2V communication and modelling errors. By conservative relaxation and approximation, the RMPC problem is formulated as a convex semi-definite program, and the simulation results demonstrate the robustness of the RMPC and the computational efficiency gained from convex optimisation. The final part focuses on the centralised and decentralised control frameworks at signal-free intersections, where the energy consumption and the crossing time of a group of CAVs are minimised. Their crossing order and velocity trajectories are optimised by convex second-order cone programs in a hierarchical scheme subject to safety constraints. It is shown that the centralised strategy with consideration of turning manoeuvres is effective and outperforms a benchmark solution invoking the widely used first-in-first-out policy. On the other hand, the decentralised method is proposed to further improve computational efficiency and enhance the system robustness via a tube-based RMPC. The numerical examples of both frameworks highlight the importance of examining the trade-off between energy consumption and travel time, as small compromises in travel time could produce significant energy savings.

    Multiagent Deep Reinforcement Learning: Challenges and Directions Towards Human-Like Approaches

    This paper surveys the field of multiagent deep reinforcement learning. The combination of deep neural networks with reinforcement learning has gained increased traction in recent years and is slowly shifting the focus from single-agent to multiagent environments. Dealing with multiple agents is inherently more complex as (a) the future rewards depend on the joint actions of multiple players and (b) the computational complexity of functions increases. We present the most common multiagent problem representations and their main challenges, and identify five research areas that address one or more of these challenges: centralised training and decentralised execution, opponent modelling, communication, efficient coordination, and reward shaping. We find that many computational studies rely on unrealistic assumptions or are not generalisable to other settings; they struggle to overcome the curse of dimensionality or nonstationarity. Approaches from psychology and sociology capture promising relevant behaviours such as communication and coordination. We suggest that, for multiagent reinforcement learning to be successful, future research addresses these challenges with an interdisciplinary approach to open up new possibilities for more human-oriented solutions in multiagent reinforcement learning. Comment: 37 pages, 6 figures

    Models and optimisation methods for interference coordination in self-organising cellular networks

    A thesis submitted for the degree of Doctor of Philosophy. We are at that moment of network evolution when we have realised that our telecommunication systems should mimic features of humankind, e.g., the ability to understand the medium and take advantage of its changes. Looking towards the future, the mobile industry envisions the use of fully automatised cells able to self-organise all their parameters and procedures. A fully self-organised network is one that is able to avoid human involvement and react to the fluctuations of network, traffic and channel through the automatic/autonomous nature of its functioning. Nowadays, the mobile community is far from this fully self-organised kind of network, but it is taking the first steps to achieve this target in the near future. This thesis hopes to contribute to the automatisation of cellular networks, providing models and tools to understand the behaviour of these networks, and algorithms and optimisation approaches to enhance their performance. This work focuses on the next generation of cellular networks, specifically on the DownLink (DL) of Orthogonal Frequency Division Multiple Access (OFDMA) based networks. Within this type of cellular system, attention is paid to interference mitigation in self-organising macrocell scenarios and femtocell deployments. Moreover, this thesis investigates the interference issues that arise when these two cell types are jointly deployed, complementing each other in what is currently known as a two-tier network. This thesis also provides new practical approaches to the inter-cell interference problem in both macrocell and femtocell OFDMA systems as well as in two-tier networks by means of the design of a novel framework and the use of mathematical optimisation. Special attention is paid to the formulation of optimisation problems and the development of well-performing solving methods (accurate and fast).