9 research outputs found

    Control Synthesis for Cyber-Physical Systems to Satisfy Metric Interval Temporal Logic Objectives under Timing and Actuator Attacks

    This paper studies the synthesis of controllers for cyber-physical systems (CPSs) that must carry out complex, time-sensitive tasks in the presence of an adversary. The task is specified as a formula in metric interval temporal logic (MITL). The adversary is assumed to be able to tamper with the control input to the CPS and to manipulate the timing information perceived by the CPS. To model the interaction between the CPS and the adversary, and the effect of these two classes of attacks, we define an entity called a durational stochastic game (DSG). DSGs probabilistically capture transitions between states in the environment, as well as the time taken for these transitions. With the policy of the defender represented as a finite state controller (FSC), we present a value-iteration based algorithm that computes an FSC that maximizes the probability of satisfying the MITL specification under the two classes of attacks. A numerical case study on a signalized traffic network is presented to illustrate our results.

    Processos de Decisão de Markov: um tutorial (Markov Decision Processes: A Tutorial)

    There are situations in which decisions must be made in sequence and the outcome of each decision is not clear to the decision maker. These situations can be formulated mathematically as Markov decision processes, and, given the probabilities of the outcomes resulting from the decisions, it is possible to determine a policy that maximizes the expected value of the sequence of decisions. This tutorial describes Markov decision processes (both the fully observable and the partially observable case) and briefly discusses some methods for solving them. Semi-Markov processes are not discussed.
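    As an illustration of how a policy that maximizes expected value can be determined for a fully observable Markov decision process, here is a minimal value-iteration sketch. The two-state MDP, its state and action names, its rewards, and the discount factor are all assumptions invented for this example; they do not come from the tutorial, and value iteration is only one of several solution methods.

```python
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical two-state MDP, invented purely for illustration.
TRANSITIONS: Transitions = {
    "low": {
        "wait": [(1.0, "low", 0.0)],
        "work": [(0.8, "high", 1.0), (0.2, "low", 0.0)],
    },
    "high": {
        "wait": [(1.0, "high", 2.0)],
        "work": [(0.5, "high", 3.0), (0.5, "low", 0.0)],
    },
}


def q_value(mdp: Transitions, values: Dict[str, float],
            state: str, action: str, gamma: float) -> float:
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in mdp[state][action])


def value_iteration(mdp: Transitions, gamma: float = 0.9,
                    tol: float = 1e-9) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in mdp}
    while True:
        new_values = {
            s: max(q_value(mdp, values, s, a, gamma) for a in mdp[s])
            for s in mdp
        }
        delta = max(abs(new_values[s] - values[s]) for s in mdp)
        values = new_values
        if delta < tol:
            break
    # The greedy policy picks, in each state, the action with the best Q-value.
    policy = {
        s: max(mdp[s], key=lambda a: q_value(mdp, values, s, a, gamma))
        for s in mdp
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(TRANSITIONS)
    print(policy)
    print(values)
```

    In this toy model the discount factor `gamma` plays the role of weighting later decisions in the sequence less than earlier ones; other optimality criteria are possible.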

    Compact parametric models for efficient sequential decision making in high-dimensional, uncertain domains

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 137-144). Within artificial intelligence and robotics there is considerable interest in how a single agent can autonomously make sequential decisions in large, high-dimensional, uncertain domains. This thesis presents decision-making algorithms for maximizing the expected sum of future rewards in two types of large, high-dimensional, uncertain situations: when the agent knows its current state but does not have a model of the world dynamics within a Markov decision process (MDP) framework, and in partially observable Markov decision processes (POMDPs), when the agent knows the dynamics and reward models but only receives information about its state through its potentially noisy sensors. One of the key challenges in the sequential decision-making field is the tradeoff between optimality and tractability. To handle high-dimensional (many variables), large (many potential values per variable) domains, an algorithm must have a computational complexity that scales gracefully with the number of dimensions. However, many prior approaches achieve such scalability through heuristic methods with limited or no guarantees on how close to optimal the algorithm's decisions are, and under what circumstances. Algorithms that do provide rigorous optimality bounds often do so at the expense of tractability. This thesis proposes that the use of parametric models of the world dynamics, rewards and observations can enable efficient, provably close to optimal decision making in large, high-dimensional, uncertain environments. In support of this, we present a reinforcement learning (RL) algorithm in which the use of a parametric model allows the algorithm to make close to optimal decisions on all but a number of samples that scales polynomially with the dimension, a significant improvement over most prior provably approximately optimal RL algorithms. We also show that parametric models can be used to reduce the computational complexity of forward-search POMDP planning from an exponential to a polynomial dependence on the state dimension. Under mild conditions, our new forward-search POMDP planner maintains prior optimality guarantees on the resulting decisions. We present experimental results on an RL task involving robot navigation over varying terrain and on a large global driving POMDP planning simulation. By Emma Patricia Brunskill. Ph.D.

    Efficient planning under uncertainty with macro-actions

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 163-168). Planning in large, partially observable domains is challenging, especially when good performance requires considering situations far in the future. Existing planners typically construct a policy by performing fully conditional planning, where each future action is conditioned on a set of possible observations that could be obtained at every timestep. Unfortunately, fully conditional planning can be computationally expensive, and state-of-the-art solvers are either limited in the size of problems that can be solved or can only plan out to a limited horizon. We propose that for a large class of real-world planning-under-uncertainty problems, it is necessary to perform far-lookahead decision making, but unnecessary to construct policies that condition all actions on observations obtained at the previous timestep. Instead, these problems can be solved by performing semi-conditional planning, where the constructed policy only conditions actions on observations at certain key points. Between these key points, the policy assumes that a macro-action (a temporally extended, fixed-length, open-loop sequence of primitive actions) is executed. These macro-actions are evaluated within a forward-search framework, which only considers beliefs that are reachable from the agent's current belief under different actions and observations; a belief summarizes an agent's past history of actions and observations. Together, semi-conditional planning in a forward-search manner restricts the policy space in exchange for conditional planning out to a longer horizon. Two technical challenges have to be overcome in order to perform semi-conditional planning efficiently: how the macro-actions can be automatically generated, and how to efficiently incorporate the macro-actions into the forward-search framework. We propose an algorithm that automatically constructs the macro-actions that are evaluated within a forward-search planning framework, iteratively refining the macro-actions as more computation time is made available for planning. In addition, we show that for a subset of problem domains, it is possible to analytically compute the distribution over posterior beliefs that results from a single macro-action. This ability to directly compute a distribution over posterior beliefs yields computational savings when performing macro-action forward search. Performance and computational analyses of the algorithms proposed in this thesis are presented, as well as simulation experiments that demonstrate superior performance relative to existing state-of-the-art solvers on large planning-under-uncertainty domains. We also demonstrate our planning-under-uncertainty algorithms on target-tracking applications for an actual autonomous helicopter, highlighting the practical potential for planning in real-world, long-horizon, partially observable domains. By Ruijie He. Ph.D.
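    To make the notions of a belief and an open-loop macro-action concrete, the sketch below propagates a discrete belief through a fixed action sequence without conditioning on observations, then branches on the observation received at the end (the "key point"). This is a simplified discrete illustration, not the thesis's algorithm: the two-cell world, the transition table `T`, the observation table `O`, and all function names are assumptions made up for this example.

```python
from typing import Dict, List, Tuple

Belief = Dict[str, float]

# Hypothetical transition model: T[action][state][next_state] = probability.
T = {
    "right": {
        "left_cell": {"left_cell": 0.2, "right_cell": 0.8},
        "right_cell": {"left_cell": 0.0, "right_cell": 1.0},
    },
    "stay": {
        "left_cell": {"left_cell": 1.0, "right_cell": 0.0},
        "right_cell": {"left_cell": 0.0, "right_cell": 1.0},
    },
}

# Hypothetical observation model: O[state][observation] = probability.
O = {
    "left_cell": {"see_wall": 0.1, "see_nothing": 0.9},
    "right_cell": {"see_wall": 0.7, "see_nothing": 0.3},
}


def predict(belief: Belief, action: str) -> Belief:
    """Push the belief through the transition model for one primitive action."""
    return {
        s2: sum(belief[s] * T[action][s][s2] for s in belief)
        for s2 in belief
    }


def run_macro_action(belief: Belief, macro_action: List[str]) -> Belief:
    """Open-loop execution: apply every action without using observations."""
    for action in macro_action:
        belief = predict(belief, action)
    return belief


def branch_on_observation(belief: Belief) -> Dict[str, Tuple[float, Belief]]:
    """For each observation, return its probability and the posterior belief."""
    observations = next(iter(O.values())).keys()
    branches = {}
    for obs in observations:
        unnormalized = {s: belief[s] * O[s][obs] for s in belief}
        prob = sum(unnormalized.values())
        if prob > 0.0:
            posterior = {s: v / prob for s, v in unnormalized.items()}
            branches[obs] = (prob, posterior)
    return branches


if __name__ == "__main__":
    start = {"left_cell": 1.0, "right_cell": 0.0}
    predicted = run_macro_action(start, ["right", "right"])
    for obs, (prob, posterior) in branch_on_observation(predicted).items():
        print(obs, round(prob, 3), posterior)
```

    A forward search would expand one such branch set per candidate macro-action, so the tree branches on observations once per macro-action rather than once per primitive action.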

    Synthesis of hierarchical finite-state controllers for POMDPs

    We develop a hierarchical approach to planning for partially observable Markov decision processes (POMDPs) in which a policy is represented as a hierarchical finite-state controller. To provide a foundation for this approach, we discuss some extensions of the POMDP framework that allow us to formalize the process of abstraction by which a hierarchical controller is constructed. We describe a planning algorithm that uses a programmer-defined task hierarchy to constrain the search space of finite-state controllers, and prove that this algorithm converges to a hierarchical finite-state controller that is ε-optimal in a limited but well-defined sense, related to the concept of recursive optimality.
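    For readers unfamiliar with the representation, the sketch below shows how a flat (non-hierarchical) finite-state controller encodes a policy: each node prescribes an action, and each observation selects the next node. It does not implement the hierarchical construction or the planning algorithm described in the abstract; the class name, node names, actions, and observations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FiniteStateController:
    """A policy as a graph: nodes choose actions, observations choose edges."""
    action_of: Dict[str, str]              # node -> action to execute
    next_node: Dict[Tuple[str, str], str]  # (node, observation) -> next node
    start: str

    def run(self, observations: List[str]) -> List[str]:
        """Return the actions taken while consuming a stream of observations."""
        node = self.start
        actions = []
        for obs in observations:
            # Act according to the current node, then move on the observation.
            actions.append(self.action_of[node])
            node = self.next_node[(node, obs)]
        return actions


# Hypothetical two-node controller, invented for illustration.
controller = FiniteStateController(
    action_of={"explore": "move", "report": "signal"},
    next_node={
        ("explore", "nothing"): "explore",
        ("explore", "target"): "report",
        ("report", "nothing"): "explore",
        ("report", "target"): "report",
    },
    start="explore",
)

if __name__ == "__main__":
    print(controller.run(["nothing", "target", "target", "nothing"]))
```

    Because the controller's memory is just its current node, it needs no explicit belief tracking at execution time.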