3,247 research outputs found

    Data-Driven Risk-sensitive Model Predictive Control for Safe Navigation in Multi-Robot Systems

    Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectory of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite horizon optimization problem subject to collision avoidance constraints, with the latter being formulated as distributionally robust conditional value-at-risk (CVaR) of the distance between the agent and a polyhedral obstacle geometry. Specifically, the CVaR constraints are required to hold for all distributions that are close to the empirical distribution constructed from observed samples of prediction error collected during execution. The generality of the approach allows us to robustify against prediction errors that arise under commonly imposed assumptions in both distributed and decentralized settings. We derive tractable finite-dimensional approximations of this class of constraints by leveraging convex and minmax duality results for Wasserstein distributionally robust optimization problems. The effectiveness of the proposed approach is illustrated in a multi-drone navigation setting implemented in the Gazebo platform.
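As a rough illustration of the CVaR quantity named in this abstract, the sketch below computes the plain empirical CVaR of a set of sampled losses using the Rockafellar-Uryasev minimization form. It is not the paper's distributionally robust Wasserstein formulation. The function name `empirical_cvar` and the idea of treating "safety margin minus distance" as the loss are assumptions made here for illustration only.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha (illustrative helper, not from the paper).

    Uses CVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }.
    For an empirical distribution the objective is convex and piecewise
    linear in t with breakpoints at the samples, so it is enough to
    evaluate it at each sample value.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)

    def objective(t):
        excess = sum(max(x - t, 0.0) for x in losses)
        return t + excess / ((1.0 - alpha) * n)

    return min(objective(t) for t in losses)


# Hypothetical samples of (safety margin - distance to obstacle);
# a CVaR-style constraint would require the value below to be <= 0.
samples = [-0.9, -0.7, -0.4, -0.1]
risk = empirical_cvar(samples, alpha=0.5)
constraint_satisfied = risk <= 0.0
```

With `alpha=0.5` the value is the average of the worst half of the samples, so the constraint penalizes the tail of the prediction-error distribution rather than only its mean.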

    Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

    This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. Previous methods which aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots and with long planning horizons exist. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than the state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment. Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017)

    Channel Selection for Network-assisted D2D Communication via No-Regret Bandit Learning with Calibrated Forecasting

    We consider the distributed channel selection problem in the context of device-to-device (D2D) communication as an underlay to a cellular network. Underlaid D2D users communicate directly by utilizing the cellular spectrum but their decisions are not governed by any centralized controller. Selfish D2D users that compete for access to the resources construct a distributed system, where the transmission performance depends on channel availability and quality. This information, however, is difficult to acquire. Moreover, the adverse effects of D2D users on cellular transmissions should be minimized. In order to overcome these limitations, we propose a network-assisted distributed channel selection approach in which D2D users are only allowed to use vacant cellular channels. This scenario is modeled as a multi-player multi-armed bandit game with side information, for which a distributed algorithmic solution is proposed. The solution is a combination of no-regret learning and calibrated forecasting, and can be applied to a broad class of multi-player stochastic learning problems, in addition to the formulated channel selection problem. Analytically, it is established that this approach not only yields vanishing regret (in comparison to the global optimal solution), but also guarantees that the empirical joint frequencies of the game converge to the set of correlated equilibria. Comment: 31 pages (one column), 9 figures
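To give a feel for what "no-regret learning" means in this abstract, the sketch below runs the generic Hedge (exponential weights) rule on a toy two-channel problem and reports the regret against the best fixed channel. This is a textbook full-information algorithm swapped in for illustration. It is not the paper's bandit algorithm and omits calibrated forecasting entirely. The names `hedge_update` and `run_hedge` and the loss values are assumptions.

```python
import math


def hedge_update(weights, losses, eta):
    """One exponential-weights step: scale each weight by exp(-eta * loss)."""
    updated = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated]


def run_hedge(loss_rounds, n_arms, eta):
    """Return final weights and regret against the best fixed arm.

    loss_rounds is a list of per-round loss vectors with entries in [0, 1].
    The learner's loss each round is its expected loss under the weights.
    """
    weights = [1.0 / n_arms] * n_arms
    learner_loss = 0.0
    for losses in loss_rounds:
        learner_loss += sum(w * loss for w, loss in zip(weights, losses))
        weights = hedge_update(weights, losses, eta)
    best_fixed = min(sum(r[a] for r in loss_rounds) for a in range(n_arms))
    return weights, learner_loss - best_fixed


# Toy setting: channel 0 is always free (loss 0), channel 1 always busy (loss 1).
rounds = [[0.0, 1.0] for _ in range(50)]
final_weights, regret = run_hedge(rounds, n_arms=2, eta=0.5)
```

The regret stays bounded as the number of rounds grows in this toy case, so the average regret per round shrinks toward zero, which is the "vanishing regret" property the abstract refers to.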

    Multi-Agent Chance-Constrained Stochastic Shortest Path with Application to Risk-Aware Intelligent Intersection

    In transportation networks, where traffic lights have traditionally been used for vehicle coordination, intersections act as natural bottlenecks. A formidable challenge for existing automated intersections lies in detecting and reasoning about uncertainty from the operating environment and human-driven vehicles. In this paper, we propose a risk-aware intelligent intersection system for autonomous vehicles (AVs) as well as human-driven vehicles (HVs). We cast the problem as a novel class of Multi-agent Chance-Constrained Stochastic Shortest Path (MCC-SSP) problems and devise an exact Integer Linear Programming (ILP) formulation that is scalable in the number of agents' interaction points (e.g., potential collision points at the intersection). In particular, when the number of agents within an interaction point is small, which is often the case in intersections, the ILP has a polynomial number of variables and constraints. To further improve the running time performance, we show that the collision risk computation can be performed offline. Additionally, a trajectory optimization workflow is provided to generate risk-aware trajectories for any given intersection. The proposed framework is implemented in the CARLA simulator and evaluated under a fully autonomous intersection with AVs only as well as in a hybrid setup with a signalized intersection for HVs and an intelligent scheme for AVs. As verified via simulations, the featured approach improves the intersection's efficiency by up to 200% while also conforming to the specified tunable risk threshold.

    Severity-sensitive norm-governed multi-agent planning

    This research was funded by Selex ES. The software developed during this research, including the norm analysis and planning algorithms, the simulator and harbour protection scenario used during evaluation, is freely available from doi:10.5258/SOTON/D0139. Peer reviewed.

    Review of trends and targets of complex systems for power system optimization

    Optimization systems (OSs) allow operators of electrical power systems (PS) to optimally operate PSs and also to create optimal PS development plans. The inclusion of OSs in the PS is a major current trend, and the demand for PS optimization tools and PS-OS experts is growing. The aim of this review is to define the current dynamics and trends in PS optimization research and to present several papers that clearly and comprehensively describe PS OSs with characteristics corresponding to the identified current main trends in this research area. The current dynamics and trends of the research area were defined on the basis of the results of an analysis of the database of 255 PS-OS-presenting papers published from December 2015 to July 2019. Eleven main characteristics of the current PS OSs were identified. The results of the statistical analyses give four characteristics of PS OSs which are currently the most frequently presented in research papers: OSs for minimizing the price of electricity/OSs reducing PS operation costs, OSs for optimizing the operation of renewable energy sources, OSs for regulating the power consumption during the optimization process, and OSs for regulating the energy storage systems operation during the optimization process. Finally, individual identified characteristics of the current PS OSs are briefly described. In the analysis, all PS OSs presented in the observed time period were analyzed regardless of the part of the PS for which the operation was optimized by the PS OS, the voltage level of the optimized PS part, or the optimization goal of the PS OS. Web of Science135art. no. 107