4 research outputs found

    Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning

    Planning for multi-agent systems such as task assignment for teams of limited-fuel unmanned aerial vehicles (UAVs) is challenging due to uncertainties in the assumed models and the very large size of the planning space. Researchers have developed fast cooperative planners based on simple models (e.g., linear and deterministic dynamics), yet inaccuracies in assumed models will impact the resulting performance. Learning techniques are capable of adapting the model and providing better policies asymptotically compared to cooperative planners, yet they often violate the safety conditions of the system due to their exploratory nature. Moreover, they frequently require an impractically large number of interactions to perform well. This paper introduces the intelligent Cooperative Control Architecture (iCCA) as a framework for combining cooperative planners and reinforcement learning techniques. iCCA improves the policy of the cooperative planner while reducing the risk and sample complexity of the learner. Empirical results in gridworld and task assignment for fuel-limited UAV domains with problem sizes up to 9 billion state-action pairs verify the advantage of iCCA over pure learning and planning strategies.

    Stochastic network interdiction games

    Thesis (Ph.D.)--Boston University. Network interdiction problems consist of games between an attacker and an intelligent network, where the attacker seeks to degrade network operations while the network adapts its operations to counteract the effects of the attacker. This problem has received significant attention in recent years due to its relevance to military problems and network security. When the attacker's actions achieve uncertain effects, the resulting problems become stochastic network interdiction problems. In this thesis, we develop new algorithms for the solution of different classes of stochastic network interdiction problems. We first focus on static network interdiction games where the attacker attacks the network once, which changes the network with a certain probability. The network then maximizes the flow from a given source to its destination. The attacker seeks a strategy that minimizes the expected maximum flow after the attack. For this problem, we develop a new solution algorithm, based on parsimonious integration of branch and bound techniques with increasingly accurate lower bounds. Our method obtains solutions significantly faster than previous approaches in the literature. In the second part, we study a multi-stage interdiction problem where the attacker can attack the network multiple times and observe the outcomes of its past attacks before selecting a current attack. For this dynamic interdiction game, we use a model-predictive approach based on a lower bound approximation. We develop a new set of performance bounds, which are integrated into a modified branch and bound procedure that extends the single-stage approach to multiple stages. Simulated experiments show that our new algorithm is faster than other available methods.
In the last part, we study the nested information game between an intelligent network and an attacker, where the attacker has partial information about the network state, which refers to the availability of arcs. The attacker does not know the exact state, but has a probability distribution over the possible network states. The attacker makes several attempts to attack the network and observes the flows on the network. These observations update the attacker's knowledge of the network and are used in selecting the next attack actions. The defender can either send flow on an attacked arc if it survived, or refrain from using it in order to deceive the attacker. For these problems, we develop a faster algorithm, which decomposes this game into a sequence of subgames and solves them to get the equilibrium strategy for the original game. Numerical results show that our method can handle large problems which other available methods fail to solve.
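
To make the static game described in this abstract concrete, the following is a minimal brute-force sketch, not the thesis's branch and bound method. It assumes each attacked arc is destroyed independently with the same probability `p_success`, that the attacker picks exactly `budget` arcs, and that the network is small enough to enumerate every attack and every outcome. The function names, the example network, and its capacities are all hypothetical.

```python
from collections import deque
from itertools import combinations, product


def max_flow(arcs, source, sink):
    """Edmonds-Karp maximum flow; `arcs` maps (u, v) -> capacity."""
    residual = {}
    for (u, v), cap in arcs.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)  # reverse arc for flow cancellation
    if source not in residual or sink not in residual:
        return 0
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def expected_max_flow(arcs, attacked, p_success, source, sink):
    """Expected max flow when each attacked arc fails independently."""
    expected = 0.0
    for outcome in product((True, False), repeat=len(attacked)):
        prob = 1.0
        surviving = dict(arcs)
        for arc, destroyed in zip(attacked, outcome):
            prob *= p_success if destroyed else 1.0 - p_success
            if destroyed:
                del surviving[arc]
        expected += prob * max_flow(surviving, source, sink)
    return expected


def best_attack(arcs, budget, p_success, source, sink):
    """Enumerate all attacks of size `budget`; return (value, attack)."""
    return min(
        (expected_max_flow(arcs, attack, p_success, source, sink), attack)
        for attack in combinations(sorted(arcs), budget)
    )


# Hypothetical five-arc network used only for illustration.
example_arcs = {
    ("s", "a"): 3, ("s", "b"): 3,
    ("a", "t"): 2, ("b", "t"): 4,
    ("a", "b"): 1,
}
value, attack = best_attack(example_arcs, 1, 0.5, "s", "t")
```

The enumeration grows exponentially in both the number of arcs and the attack budget, which is the kind of cost that bounding techniques such as those described in the abstract aim to avoid.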

    Optimization with Discrete Simultaneous Perturbation Stochastic Approximation Using Noisy Loss Function Measurements

    Discrete stochastic optimization considers the problem of minimizing (or maximizing) loss functions defined on discrete sets, where only noisy measurements of the loss functions are available. The discrete stochastic optimization problem is widely applicable in practice, and many algorithms have been considered to solve this kind of optimization problem. Motivated by the efficient algorithm of simultaneous perturbation stochastic approximation (SPSA) for continuous stochastic optimization problems, we introduce the middle point discrete simultaneous perturbation stochastic approximation (DSPSA) algorithm for the stochastic optimization of a loss function defined on a p-dimensional grid of points in Euclidean space. We show that the sequence generated by DSPSA converges to the optimal point under some conditions. Consistent with other stochastic approximation methods, DSPSA formally accommodates noisy measurements of the loss function. We also analyze the rate of convergence of DSPSA by deriving an upper bound on the mean squared error of the generated sequence. In order to compare the performance of DSPSA with other algorithms such as the stochastic ruler algorithm (SR) and the stochastic comparison algorithm (SC), we set up a bridge between DSPSA and the other two algorithms by comparing the probability, in a big-O sense, of not achieving the optimal solution. We show the theoretical and numerical comparison results of DSPSA, SR, and SC. In addition, we consider an application of DSPSA towards developing optimal public health strategies for containing the spread of influenza given limited societal resources.
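
The middle point idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the iterate is kept real-valued, each step evaluates the noisy loss at the two integer points on either side of the middle of the current unit hypercube, and the final answer is rounded to the grid. The gain sequence `a / (k + 1 + A) ** alpha`, its default values, and the test loss `noisy_quadratic` are assumptions chosen for the example.

```python
import math
import random


def dspsa(loss, theta0, iterations=3000, a=0.5, A=100.0, alpha=0.501, seed=0):
    """Middle point DSPSA sketch; `loss` takes a list of integers."""
    rng = random.Random(seed)
    theta = [float(t) for t in theta0]
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha  # assumed decaying gain sequence
        # Middle point of the unit hypercube that contains theta.
        middle = [math.floor(t) + 0.5 for t in theta]
        # Bernoulli +/-1 simultaneous perturbation.
        delta = [rng.choice((-1, 1)) for _ in theta]
        # middle +/- delta/2 always lands on integer grid points.
        plus = [int(m + 0.5 * d) for m, d in zip(middle, delta)]
        minus = [int(m - 0.5 * d) for m, d in zip(middle, delta)]
        diff = loss(plus) - loss(minus)  # two noisy measurements per step
        theta = [t - a_k * diff / d for t, d in zip(theta, delta)]
    return [round(t) for t in theta]


_noise = random.Random(1)


def noisy_quadratic(x):
    """Hypothetical test loss: separable quadratic plus Gaussian noise."""
    target = (3, -2)
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target)) + _noise.gauss(0, 0.5)


estimate = dspsa(noisy_quadratic, [0, 0])
```

Each iteration uses only two loss measurements regardless of the dimension p, which is the property DSPSA inherits from SPSA.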

    Robust distributed planning strategies for autonomous multi-agent teams

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2012. Cataloged from department-submitted PDF version of thesis. This electronic version was submitted and approved by the author's academic department as part of an electronic thesis pilot project. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 225-244). The increased use of autonomous robotic agents, such as unmanned aerial vehicles (UAVs) and ground rovers, for complex missions has motivated the development of autonomous task allocation and planning methods that ensure spatial and temporal coordination for teams of cooperating agents. The basic problem can be formulated as a combinatorial optimization (mixed-integer program) involving nonlinear and time-varying system dynamics. For most problems of interest, optimal solution methods are computationally intractable (NP-Hard), and centralized planning approaches, which usually require high bandwidth connections with a ground station (e.g. to transmit received sensor data, and to dispense agent plans), are resource intensive and react slowly to local changes in dynamic environments. Distributed approximate algorithms, where agents plan individually and coordinate with each other locally through consensus protocols, can alleviate many of these issues and have been successfully used to develop real-time conflict-free solutions for heterogeneous networked teams. An important issue associated with autonomous planning is that many of the algorithms rely on underlying system models and parameters which are often subject to uncertainty. This uncertainty can result from many sources including: inaccurate modeling due to simplifications, assumptions, and/or parameter errors; fundamentally nondeterministic processes (e.g. sensor readings, stochastic dynamics); and dynamic local information changes.
As discrepancies between the planner models and the actual system dynamics increase, mission performance typically degrades. The impact of these discrepancies on the overall quality of the plan is usually hard to quantify in advance due to nonlinear effects, coupling between tasks and agents, and interdependencies between system constraints. However, if uncertainty models of planning parameters are available, they can be leveraged to create robust plans that explicitly hedge against the inherent uncertainty given allowable risk thresholds. This thesis presents real-time robust distributed planning strategies that can be used to plan for multi-agent networked teams operating in stochastic and dynamic environments. One class of distributed combinatorial planning algorithms involves using auction algorithms augmented with consensus protocols to allocate tasks amongst a team of agents while resolving conflicting assignments locally between the agents. A particular algorithm in this class is the Consensus-Based Bundle Algorithm (CBBA), a distributed auction protocol that guarantees conflict-free solutions despite inconsistencies in situational awareness across the team. CBBA runs in polynomial time, demonstrating good scalability with increasing numbers of agents and tasks. This thesis builds upon the CBBA framework to address many realistic considerations associated with planning for networked teams, including time-critical mission constraints, limited communication between agents, and stochastic operating environments. A particular focus of this work is a robust extension to CBBA that handles distributed planning in stochastic environments given probabilistic parameter models and different stochastic metrics. The Robust CBBA algorithm proposed in this thesis provides a distributed real-time framework which can leverage different stochastic metrics to hedge against parameter uncertainty. 
In mission scenarios where low probability of failure is required, a chance-constrained stochastic metric can be used to provide probabilistic guarantees on achievable mission performance given allowable risk thresholds. This thesis proposes a distributed chance-constrained approximation that can be used within the Robust CBBA framework, and derives constraints on individual risk allocations to guarantee equivalence between the centralized chance-constrained optimization and the distributed approximation. Different risk allocation strategies for homogeneous and heterogeneous teams are proposed that approximate the agent and mission score distributions a priori, and results are provided showing improved performance in time-critical mission scenarios given allowable risk thresholds. by Sameera S. Ponda. Ph.D.