2,791 research outputs found

    Centralized and distributed algorithms for on-line synthesis of maximal control policies under partial observation

    Full text link
    This paper deals with the on-line control of partially observed discrete event systems (DES). The goal is to restrict the behavior of the system within a prefix-closed legal language while accounting for the presence of uncontrollable and unobservable events. In the spirit of recent work on the on-line control of partially observed DES (Heymann and Lin 1994) and on variable lookahead control of fully observed DES (Ben Hadj-Alouane et al. 1994c), we propose an approach where, following each observable event, a control action is computed on-line using an algorithm of linear worst-case complexity. This algorithm, called VLP-PO , has the following additional properties: (i) the resulting behavior is guaranteed to be a maximal controllable and observable sublanguage of the legal language; (ii) different maximals may be generated by varying the priorities assigned to the controllable events, a parameter of VLP-PO ; (iii) a maximal containing the supremal controllable and normal sublanguage of the legal language can be generated by a proper selection of controllable event priorities; and (iv) no off-line calculations are necessary. We also present a parallel/distributed version of the VLP-PO algorithm called DI-VLP-PO . This version uses several communicating agents that simultaneously run (on-line) identical versions of the algorithm but on possibly different parts of the system model and the legal language, according to the structural properties of the system and the specifications. While achieving the same behavior as VLO-PO, DI-VLP-PO runs at a total complexity (for computation and communication) that is significantly lower than its sequential counterpart.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/45126/1/10626_2005_Article_BF01797138.pd

    Distributed reinforcement learning for self-reconfiguring modular robots

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Includes bibliographical references (p. 101-106).In this thesis, we study distributed reinforcement learning in the context of automating the design of decentralized control for groups of cooperating, coupled robots. Specifically, we develop a framework and algorithms for automatically generating distributed controllers for self-reconfiguring modular robots using reinforcement learning. The promise of self-reconfiguring modular robots is that of robustness, adaptability and versatility. Yet most state-of-the-art distributed controllers are laboriously handcrafted and task-specific, due to the inherent complexities of distributed, local-only control. In this thesis, we propose and develop a framework for using reinforcement learning for automatic generation of such controllers. The approach is profitable because reinforcement learning methods search for good behaviors during the lifetime of the learning agent, and are therefore applicable to online adaptation as well as automatic controller design. However, we must overcome the challenges due to the fundamental partial observability inherent in a distributed system such as a self reconfiguring modular robot. We use a family of policy search methods that we adapt to our distributed problem. The outcome of a local search is always influenced by the search space dimensionality, its starting point, and the amount and quality of available exploration through experience.(cont) We undertake a systematic study of the effects that certain robot and task parameters, such as the number of modules, presence of exploration constraints, availability of nearest-neighbor communications, and partial behavioral knowledge from previous experience, have on the speed and reliability of learning through policy search in self-reconfiguring modular robots. In the process, we develop novel algorithmic variations and compact search space representations for learning in our domain, which we test experimentally on a number of tasks. This thesis is an empirical study of reinforcement learning in a simulated lattice based self-reconfiguring modular robot domain. However, our results contribute to the broader understanding of automatic generation of group control and design of distributed reinforcement learning algorithms.by Paulina Varshavskaya.Ph.D

    GEM: a Distributed Goal Evaluation Algorithm for Trust Management

    Full text link
    Trust management is an approach to access control in distributed systems where access decisions are based on policy statements issued by multiple principals and stored in a distributed manner. In trust management, the policy statements of a principal can refer to other principals' statements; thus, the process of evaluating an access request (i.e., a goal) consists of finding a "chain" of policy statements that allows the access to the requested resource. Most existing goal evaluation algorithms for trust management either rely on a centralized evaluation strategy, which consists of collecting all the relevant policy statements in a single location (and therefore they do not guarantee the confidentiality of intensional policies), or do not detect the termination of the computation (i.e., when all the answers of a goal are computed). In this paper we present GEM, a distributed goal evaluation algorithm for trust management systems that relies on function-free logic programming for the specification of policy statements. GEM detects termination in a completely distributed way without disclosing intensional policies, thereby preserving their confidentiality. We demonstrate that the algorithm terminates and is sound and complete with respect to the standard semantics for logic programs.Comment: To appear in Theory and Practice of Logic Programming (TPLP

    Formal methods for motion planning and control in dynamic and partially known environments

    Full text link
    This thesis is motivated by time and safety critical applications involving the use of autonomous vehicles to accomplish complex tasks in dynamic and partially known environments. We use temporal logic to formally express such complex tasks. Temporal logic specifications generalize the classical notions of stability and reachability widely studied within the control and hybrid systems communities. Given a model describing the motion of a robotic system in an environment and a formal task specification, the aim is to automatically synthesize a control policy that guarantees the satisfaction of the specification. This thesis presents novel control synthesis algorithms to tackle the problem of motion planning from temporal logic specifications in uncertain environments. For each one of the planning and control synthesis problems addressed in this dissertation, the proposed algorithms are implemented, evaluated, and validated thought experiments and/or simulations. The first part of this thesis focuses on a mobile robot whose success is measured by the completion of temporal logic tasks within a given period of time. In addition to such time constraints, the planning algorithm must also deal with the uncertainty that arises from the changes in the robot's workspace during task execution. In particular, we consider a robot deployed in a partitioned environment subjected to structural changes such as doors that can open and close. The motion of the robot is modeled as a continuous time Markov decision process and the robot's mission is expressed as a Continuous Stochastic Logic (CSL) formula. A complete framework to find a control strategy that satisfies a specification given as a CSL formula is introduced. The second part of this thesis addresses the synthesis of controllers that guarantee the satisfaction of a task specification expressed as a syntactically co-safe Linear Temporal Logic (scLTL) formula. In this case, uncertainty is characterized by the partial knowledge of the robot's environment. Two scenarios are considered. First, a distributed team of robots required to satisfy the specification over a set of service requests occurring at the vertices of a known graph representing the environment is examined. Second, a single agent motion planning problem from the specification over a set of properties known to be satised at the vertices of the known graph environment is studied. In both cases, we exploit the existence of o-the-shelf model checking and runtime verification tools, the efficiency of graph search algorithms, and the efficacy of exploration techniques to solve the motion planning problem constrained by the absence of complete information about the environment. The final part of this thesis extends uncertainty beyond the absence of a complete knowledge of the environment described above by considering a robot equipped with a noisy sensing system. In particular, the robot is tasked with satisfying a scLTL specification over a set of regions of interest known to be present in the environment. In such a case, although the robot is able to measure the properties characterizing such regions of interest, precisely determining the identity of these regions is not feasible. A mixed observability Markov decision process is used to represent the robot's actuation and sensing models. The control synthesis problem from scLTL formulas is then formulated as a maximum probability reachability problem on this model. The integration of dynamic programming, formal methods, and frontier-based exploration tools allow us to derive an algorithm to solve such a reachability problem

    Formal methods for resilient control

    Get PDF
    Many systems operate in uncertain, possibly adversarial environments, and their successful operation is contingent upon satisfying specific requirements, optimal performance, and ability to recover from unexpected situations. Examples are prevalent in many engineering disciplines such as transportation, robotics, energy, and biological systems. This thesis studies designing correct, resilient, and optimal controllers for discrete-time complex systems from elaborate, possibly vague, specifications. The first part of the contributions of this thesis is a framework for optimal control of non-deterministic hybrid systems from specifications described by signal temporal logic (STL), which can express a broad spectrum of interesting properties. The method is optimization-based and has several advantages over the existing techniques. When satisfying the specification is impossible, the degree of violation - characterized by STL quantitative semantics - is minimized. The computational limitations are discussed. The focus of second part is on specific types of systems and specifications for which controllers are synthesized efficiently. A class of monotone systems is introduced for which formal synthesis is scalable and almost complete. It is shown that hybrid macroscopic traffic models fall into this class. Novel techniques in modular verification and synthesis are employed for distributed optimal control, and their usefulness is shown for large-scale traffic management. Apart from monotone systems, a method is introduced for robust constrained control of networked linear systems with communication constraints. Case studies on longitudinal control of vehicular platoons are presented. The third part is about learning-based control with formal guarantees. Two approaches are studied. First, a formal perspective on adaptive control is provided in which the model is represented by a parametric transition system, and the specification is captured by an automaton. A correct-by-construction framework is developed such that the controller infers the actual parameters and plans accordingly for all possible future transitions and inferences. The second approach is based on hybrid model identification using input-output data. By assuming some limited knowledge of the range of system behaviors, theoretical performance guarantees are provided on implementing the controller designed for the identified model on the original unknown system

    Coordinated Robot Navigation via Hierarchical Clustering

    Get PDF
    We introduce the use of hierarchical clustering for relaxed, deterministic coordination and control of multiple robots. Traditionally an unsupervised learning method, hierarchical clustering offers a formalism for identifying and representing spatially cohesive and segregated robot groups at different resolutions by relating the continuous space of configurations to the combinatorial space of trees. We formalize and exploit this relation, developing computationally effective reactive algorithms for navigating through the combinatorial space in concert with geometric realizations for a particular choice of hierarchical clustering method. These constructions yield computationally effective vector field planners for both hierarchically invariant as well as transitional navigation in the configuration space. We apply these methods to the centralized coordination and control of nn perfectly sensed and actuated Euclidean spheres in a dd-dimensional ambient space (for arbitrary nn and dd). Given a desired configuration supporting a desired hierarchy, we construct a hybrid controller which is quadratic in nn and algebraic in dd and prove that its execution brings all but a measure zero set of initial configurations to the desired goal with the guarantee of no collisions along the way.Comment: 29 pages, 13 figures, 8 tables, extended version of a paper in preparation for submission to a journa
    • …
    corecore