3,297 research outputs found

    Resilient Autonomous Control of Distributed Multi-agent Systems in Contested Environments

    An autonomous and resilient controller is proposed for leader-follower multi-agent systems under uncertainties and cyber-physical attacks. The leader is assumed to be non-autonomous with a nonzero control input, which allows the team behavior or mission to change in response to environmental changes. A resilient learning-based control protocol is presented to find optimal solutions to the synchronization problem in the presence of attacks and system dynamic uncertainties. An observer-based distributed H_infinity controller is first designed to prevent the effects of attacks on sensors and actuators from propagating throughout the network, as well as to attenuate the effect of these attacks on the compromised agent itself. Non-homogeneous game algebraic Riccati equations are derived to solve the H_infinity optimal synchronization problem, and off-policy reinforcement learning is used to learn their solution without requiring any knowledge of the agents' dynamics. A trust-confidence based distributed control protocol is then proposed to mitigate attacks that hijack the entire node and attacks on communication links. A confidence value is defined for each agent based solely on its local evidence. The proposed resilient reinforcement learning algorithm uses the confidence value of each agent to indicate the trustworthiness of its own information and broadcasts it to its neighbors, which use it to weight the data they receive from that agent during and after learning. If the confidence value of an agent is low, it employs a trust mechanism to identify compromised agents and remove the data it receives from them from the learning process. Simulation results are provided to show the effectiveness of the proposed approach.

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.
    Comment: 46 pages, 22 figures

    Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

    This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. Previous methods which aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and long planning horizons exist. This work addresses these gaps by proposing an iterative sampling based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than the state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.
    Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017)

    Multiagent-Based Control for Plug-and-Play Batteries in DC Microgrids with Infrastructure Compensation

    The influence of the DC infrastructure on the control of power-storage flow in micro- and smart grids has gained attention recently, particularly in dynamic vehicle-to-grid charging applications. Principal effects include the potential loss of the charge–discharge synchronization and the subsequent impact on the control stabilization, the increased degradation in batteries’ health/life, and resultant power- and energy-efficiency losses. This paper proposes and tests a candidate solution to compensate for the infrastructure effects in a DC microgrid with a varying number of heterogeneous battery storage systems in the context of a multiagent neighbor-to-neighbor control scheme. Specifically, the scheme regulates the balance of the batteries’ load-demand participation, with adaptive compensation for unknown and/or time-varying DC infrastructure influences. Simulation and hardware-in-the-loop studies in realistic conditions demonstrate the improved precision of the charge–discharge synchronization and the enhanced balance of the output voltage under 24 h excessively continuous variations in the load demand. In addition, immediate real-time compensation for the DC infrastructure influence can be attained with no need for initial estimates of key unknown parameters. The results provide both the validation and verification of the proposals under real operational conditions and expectations, including the dynamic switching of the heterogeneous batteries’ connection (plug-and-play) and the variable infrastructure influences of different dynamically switched branches. Key observed metrics include an average reduced convergence time (0.66–13.366%), enhanced output-voltage balance (2.637–3.24%), power-consumption reduction (3.569–4.93%), and power-flow-balance enhancement (2.755–6.468%), which can be achieved for the proposed scheme over a baseline for the experiments in question.