18 research outputs found

    Control and Coordination in Hierarchical Systems

    This book presents the applied theory of control and coordination in hierarchical systems, that is, systems in which decision making has been divided in a certain way. It concentrates on various aspects of optimal control in large scale systems and covers a range of topics, from multilevel methods for optimization by interactive feedback procedures to methods for sequential, hierarchical control in large dynamic systems.

    Large scale dynamic systems

    Classes of large scale dynamic systems were discussed in the context of modern control theory. The specific examples discussed came from the technical fields of aeronautics, water resources, and electric power.

    Techniques for the allocation of resources under uncertainty

    Resource allocation is a ubiquitous problem that arises whenever limited resources have to be distributed among multiple autonomous entities (e.g., people, companies, robots). The standard approaches for determining the optimal resource allocation are computationally prohibitive. The goal of this thesis is to propose computationally efficient algorithms for allocating consumable and non-consumable resources among autonomous agents whose preferences for these resources are induced by a stochastic process. To this end, we have developed new models of planning problems, based on the framework of Markov Decision Processes (MDPs), in which the action sets are explicitly parameterized by the available resources. Given these models, we have designed algorithms based on dynamic programming and real-time heuristic search to generate resource allocations for agents acting in stochastic environments. In particular, we have used the acyclic property of task creation to decompose the resource allocation problem. We have also proposed an approximate decomposition strategy, in which the agents consider positive and negative interactions as well as simultaneous actions among the agents managing the resources. However, the main contribution of this thesis is the adoption of stochastic real-time heuristic search for resource allocation. To this end, we have developed an approach based on Q-decomposition (distributed Q-values) with tight bounds, which drastically reduces the planning time needed to formulate the optimal policy. These tight bounds make it possible to prune the action space for the agents.
We show analytically and empirically that the proposed approaches lead to drastic (in many cases, exponential) improvements in computational efficiency over standard planning methods. Finally, we have tested real-time heuristic search in the SADM simulator, a resource allocation simulator for a frigate.

    Models and algorithms for multi-agent search problems

    The problem of searching for objects of interest occurs in important applications ranging from rescue, security, and transportation to medicine. With the increasing use of autonomous vehicles as search platforms, there is a need for fast algorithms that can generate search plans for multiple agents in response to new information. In this dissertation, we develop new techniques for automated generation of search plans for different classes of search problems. First, we study the problem of searching for a stationary object in a discrete search space with multiple agents, where each agent can access only a subset of the search space. In these problems, agents can fail to detect an object when inspecting a location. We show that when the probabilities of detection depend only on the locations, this problem can be reformulated as a minimum cost network optimization problem, and we develop a fast specialized algorithm for its solution. We prove that our algorithm finds the optimal solution in finite time and has worst-case computational performance that is faster than general minimum cost flow algorithms. We then generalize it to the case where the probabilities of detection depend on the agents and the locations, and propose a greedy algorithm that is 1/2-approximate. Second, we study the problem of searching for a moving object in a discrete search space with multiple agents, where each agent can access only a subset of the search space at any time and agents can fail to detect objects when searching a location at a given time. We provide necessary conditions for an optimal search plan, extending prior results in search theory. For the case where the probabilities of detection depend on the locations and the time periods, we develop a forward-backward iterative algorithm based on coordinate descent techniques to obtain solutions.
To avoid local optima, we derive a convex relaxation of the dynamic search problem and show that it can be solved optimally using coordinate descent techniques. The solutions of the relaxed problem are used to provide random starting conditions for the iterative algorithm. We also address the problem where the probabilities of detection depend on the agents as well as the locations and the time periods, and show that a greedy-style algorithm is 1/2-approximate. Third, we study problems in which the multiple objects of interest being searched for are physically scattered among locations on a graph, and the agents are subject to motion constraints captured by the graph edges as well as budget constraints. We model such problems as an orienteering problem, when searching with a single agent, or a team orienteering problem, when searching with multiple agents. We develop novel, efficient real-time algorithms for both problems. Fourth, we investigate classes of continuous-region multi-agent adaptive search problems as stochastic control problems with imperfect information. We allow the agent measurement errors to be either correlated or independent across agents. The structure of these problems, with objectives related to information entropy, allows for a complete characterization of the optimal strategies and the optimal cost. We derive a lower bound on the performance of the minimum mean-square error estimator, and provide upper bounds on the estimation error for special cases. For agents with independent errors, we show that the optimal sensing strategies can be obtained in terms of the solution of decoupled scalar convex optimization problems, followed by a joint region selection procedure. We further consider the search for multiple objects and provide an explicit construction for adaptively determining the sensing actions.

    Gradient Methods for Large-Scale and Distributed Linear Quadratic Control

    This thesis considers methods for the synthesis of linear quadratic controllers for large-scale, interconnected systems. Conventional methods that solve the linear quadratic control problem are only applicable to systems of moderate size, due to the rapid increase in both computational time and memory requirements as the system size increases. The methods presented in this thesis show a much slower increase in these requirements when faced with system matrices with a sparse structure. Hence, they are useful for control design for systems of large order, since such systems usually have sparse system matrices. An equally important feature of the methods is that the controllers are restricted to have a distributed nature, meaning that they respect a potential interconnection structure of the system. The controllers considered in the thesis have the same structure as the centralized LQG solution, that is, they consist of a state predictor and feedback from the estimated states. Strategies for determining the feedback matrix and the predictor matrix separately are suggested. The strategies use gradient directions of the cost function to iteratively approach a locally optimal solution in either problem. A scheme for determining bounds on the degree of suboptimality of the partial solution in every iteration is presented. It is also shown that these bounds can be combined to give a bound on the degree of suboptimality of the full output feedback controller. Another method, which treats the synthesis of the feedback matrix and the predictor matrix simultaneously, is also presented. The functionality of the developed methods is illustrated by an application in which the methods are used to compute controllers for a large deformable mirror, found in a telescope to compensate for atmospheric disturbances. The model of the mirror is obtained by discretizing a partial differential equation.
This gives a linear, sparse representation of the mirror with a very large state space, which is suitable for the methods presented in the thesis. The performance of the controllers is evaluated using performance measures from the adaptive optics community.
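
As a rough illustration of the idea of following gradient directions of the cost function toward a feedback gain, the sketch below runs plain gradient descent on a scalar linear quadratic problem. It is not the thesis's method: the system values `A`, `B`, `Q`, `R`, the function names, the step size, and the finite-difference gradient are all assumptions chosen for a minimal example.

```python
# Hypothetical scalar system x[t+1] = A*x[t] + B*u[t] with feedback u[t] = -k*x[t].
# All numeric values below are assumptions for illustration only.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0


def lq_cost(k, x0=1.0):
    """Infinite-horizon cost sum(Q*x^2 + R*u^2) for a stabilizing gain k."""
    closed_loop = A - B * k
    if abs(closed_loop) >= 1.0:
        return float("inf")  # the cost diverges when the closed loop is unstable
    # x[t] = closed_loop**t * x0, so the cost is a geometric series.
    return (Q + R * k * k) * x0 * x0 / (1.0 - closed_loop ** 2)


def gradient_descent_gain(k=0.5, step=0.05, iterations=500, h=1e-6):
    """Plain gradient descent on the gain, using a central-difference gradient."""
    for _ in range(iterations):
        grad = (lq_cost(k + h) - lq_cost(k - h)) / (2.0 * h)
        k -= step * grad
    return k


k_opt = gradient_descent_gain()
```

For these assumed values the gain settles near 0.618, which matches the gain obtained from the scalar Riccati equation. In the large-scale setting described in the abstract, the gain would be a sparse matrix rather than a scalar.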

    Partially Observable Multi-agent RL with (Quasi-)Efficiency: The Blessing of Information Sharing

    We study provable multi-agent reinforcement learning (MARL) in the general framework of partially observable stochastic games (POSGs). To circumvent the known hardness results and the use of computationally intractable oracles, we advocate leveraging the potential information-sharing among agents, a common practice in empirical MARL and a standard model for multi-agent control systems with communications. We first establish several computational complexity results to justify the necessity of information-sharing, as well as of the observability assumption that has enabled quasi-efficient single-agent RL with partial observations, for computational efficiency in solving POSGs. We then propose to further approximate the shared common information to construct an approximate model of the POSG, in which planning an approximate equilibrium (in terms of solving the original POSG) can be quasi-efficient, i.e., quasi-polynomial-time, under the aforementioned assumptions. Furthermore, we develop a partially observable MARL algorithm that is both statistically and computationally quasi-efficient. We hope our study may open up the possibilities of leveraging and even designing different information structures for developing both sample- and computation-efficient partially observable MARL. Comment: International Conference on Machine Learning (ICML) 202

    Multi-Agent Distributed Optimization and Estimation over Lossy Networks

    Nowadays, optimization is a pervasive tool, employed in many different fields. Thanks to its flexibility, it can be used to solve many diverse problems, some of which do not seem to require an optimization framework. As such, research on this topic is always active and abundant. Another very interesting and current field of investigation involves multi-agent systems, that is, systems composed of many (possibly different) agents. The research on cyber-physical systems, believed to be one of the challenges of the 21st century, is very extensive and covers very complex systems such as smart cities and smart power grids, as well as much simpler ones, such as wireless sensor networks or camera networks. In a multi-agent context, the optimization framework is used extensively. As a consequence, optimization in multi-agent systems is an attractive topic to investigate. This thesis focuses on distributed optimization within a multi-agent scenario, i.e., optimization performed by a set of peers, among which there is no leader. Accordingly, when these agents have to perform a task, formulated as an optimization problem, they have to collaborate to solve it, all using the same kind of update rule. Collaboration clearly requires the exchange of messages among the agents, and the focus of the thesis is on the critical issues related to the communication step. In particular, this step is not assumed to be reliable, meaning that the packets exchanged between two agents can sometimes be lost. Also, the sought-for solution must not rely on an acknowledgment protocol, that is, when an agent has to send a packet, it just sends it and goes on with its computation, without waiting for confirmation that the receiver has actually received the packet.
Almost all works in the existing literature deal with packet losses by employing an acknowledgment (ACK) system; this thesis aims to avoid the use of an ACK system, since it can slow down the communication step. However, avoiding ACKs makes the development of optimization algorithms, and especially their convergence proofs, more involved. Apart from robustness to packet losses, the algorithms developed in this dissertation are also asynchronous, that is, the agents do not need to be synchronized to perform the update and communication steps. Three types of optimization problems are analyzed in the thesis. The first one is the patrolling problem for camera networks. The algorithm developed to solve this problem has restricted applicability, since it is very task-dependent. The other two problems are more general, because both concern the minimization of a sum of cost functions, one for each agent in the system. In the first case, the local cost functions have a particular form: they are locally coupled, in the sense that the cost function of an agent depends on the variables of the agent itself and on those of its direct neighbors. The sought-for algorithm has to satisfy two properties (apart from asynchronicity and robustness to packet losses): it must require only a single communication exchange per iteration (which also reduces the need for synchronicity), and communication among agents must take place only between direct neighbors. In the second case, the local functions all depend on the same variables. The analysis first focuses on the special case of local quadratic cost functions and their strong relationship with the consensus problem. Besides the development of a robust and asynchronous algorithm for the average consensus problem, a comparison among algorithms for minimizing the sum of quadratic cost functions is carried out.
Finally, the distributed minimization of the sum of more general local cost functions is tackled, leading to the development of a robust version of the Newton-Raphson consensus. The theoretical tools employed in the thesis to prove convergence of the algorithms rely mainly on Lyapunov theory and separation of time scales theory.
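
For readers unfamiliar with the average consensus problem mentioned in this abstract, the sketch below shows the textbook synchronous iteration over reliable links. It is not the robust, asynchronous algorithm developed in the thesis: the ring topology, the step size, the initial values, and the function name are assumptions made for a minimal example.

```python
def average_consensus(values, neighbors, step=0.3, iterations=200):
    """Textbook synchronous average consensus over reliable links.

    Each agent repeatedly moves toward the values held by its direct
    neighbors; no leader is involved and only neighbors communicate.
    """
    x = list(values)
    for _ in range(iterations):
        # Every agent updates simultaneously from the previous round's values.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical network: five agents on a ring, each with two neighbors.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
initial = [1.0, 4.0, 2.0, 8.0, 5.0]
final = average_consensus(initial, ring)
```

Because every exchange here is symmetric, the sum of the agents' values is preserved at each round, so all agents settle on the average of the initial values (4.0 in this example). If one of a pair of packets is lost, that symmetry breaks, which hints at why lossy links without acknowledgments need a different algorithm.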

    Automated decision making and problem solving. Volume 2: Conference presentations

    Related topics in artificial intelligence, operations research, and control theory are explored. Existing techniques are assessed and trends of development are determined

    Discrete-Time Model Predictive Control
