3,715 research outputs found
Aggregation and Control of Populations of Thermostatically Controlled Loads by Formal Abstractions
This work discusses a two-step procedure, based on formal abstractions, to
generate a finite-space stochastic dynamical model as an aggregation of the
continuous temperature dynamics of a homogeneous population of Thermostatically
Controlled Loads (TCL). The temperature of a single TCL is described by a
stochastic difference equation and the TCL status (ON, OFF) by a deterministic
switching mechanism. The procedure is formal as it allows the exact
quantification of the error introduced by the abstraction -- as such it builds
and improves on a known, earlier approximation technique in the literature.
Further, the contribution discusses the extension to the case of a
heterogeneous population of TCL by means of two approaches resulting in the
notion of approximate abstractions. It moreover investigates the problem of
global (population-level) regulation and load balancing for the case of TCL
that are dependent on a control input. The procedure is tested on a case study
and benchmarked against the mentioned alternative approach in the literature.Comment: 40 pages, 21 figures; the paper generalizes the result of conference
publication: S. Esmaeil Zadeh Soudjani and A. Abate, "Aggregation of
Thermostatically Controlled Loads by Formal Abstractions," Proceedings of the
European Control Conference 2013, pp. 4232-4237. version 2: added references
for section
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition
This paper presents the MAXQ approach to hierarchical reinforcement learning
based on decomposing the target Markov decision process (MDP) into a hierarchy
of smaller MDPs and decomposing the value function of the target MDP into an
additive combination of the value functions of the smaller MDPs. The paper
defines the MAXQ hierarchy, proves formal results on its representational
power, and establishes five conditions for the safe use of state abstractions.
The paper presents an online model-free learning algorithm, MAXQ-Q, and proves
that it converges wih probability 1 to a kind of locally-optimal policy known
as a recursively optimal policy, even in the presence of the five kinds of
state abstraction. The paper evaluates the MAXQ representation and MAXQ-Q
through a series of experiments in three domains and shows experimentally that
MAXQ-Q (with state abstractions) converges to a recursively optimal policy much
faster than flat Q learning. The fact that MAXQ learns a representation of the
value function has an important benefit: it makes it possible to compute and
execute an improved, non-hierarchical policy via a procedure similar to the
policy improvement step of policy iteration. The paper demonstrates the
effectiveness of this non-hierarchical execution experimentally. Finally, the
paper concludes with a comparison to related work and a discussion of the
design tradeoffs in hierarchical reinforcement learning.Comment: 63 pages, 15 figure
Transient Reward Approximation for Continuous-Time Markov Chains
We are interested in the analysis of very large continuous-time Markov chains
(CTMCs) with many distinct rates. Such models arise naturally in the context of
reliability analysis, e.g., of computer network performability analysis, of
power grids, of computer virus vulnerability, and in the study of crowd
dynamics. We use abstraction techniques together with novel algorithms for the
computation of bounds on the expected final and accumulated rewards in
continuous-time Markov decision processes (CTMDPs). These ingredients are
combined in a partly symbolic and partly explicit (symblicit) analysis
approach. In particular, we circumvent the use of multi-terminal decision
diagrams, because the latter do not work well if facing a large number of
different rates. We demonstrate the practical applicability and efficiency of
the approach on two case studies.Comment: Accepted for publication in IEEE Transactions on Reliabilit
Experimental Biological Protocols with Formal Semantics
Both experimental and computational biology is becoming increasingly
automated. Laboratory experiments are now performed automatically on
high-throughput machinery, while computational models are synthesized or
inferred automatically from data. However, integration between automated tasks
in the process of biological discovery is still lacking, largely due to
incompatible or missing formal representations. While theories are expressed
formally as computational models, existing languages for encoding and
automating experimental protocols often lack formal semantics. This makes it
challenging to extract novel understanding by identifying when theory and
experimental evidence disagree due to errors in the models or the protocols
used to validate them. To address this, we formalize the syntax of a core
protocol language, which provides a unified description for the models of
biochemical systems being experimented on, together with the discrete events
representing the liquid-handling steps of biological protocols. We present both
a deterministic and a stochastic semantics to this language, both defined in
terms of hybrid processes. In particular, the stochastic semantics captures
uncertainties in equipment tolerances, making it a suitable tool for both
experimental and computational biologists. We illustrate how the proposed
protocol language can be used for automated verification and synthesis of
laboratory experiments on case studies from the fields of chemistry and
molecular programming
Sparsity-Sensitive Finite Abstraction
Abstraction of a continuous-space model into a finite state and input
dynamical model is a key step in formal controller synthesis tools. To date,
these software tools have been limited to systems of modest size (typically
6 dimensions) because the abstraction procedure suffers from an
exponential runtime with respect to the sum of state and input dimensions. We
present a simple modification to the abstraction algorithm that dramatically
reduces the computation time for systems exhibiting a sparse interconnection
structure. This modified procedure recovers the same abstraction as the one
computed by a brute force algorithm that disregards the sparsity. Examples
highlight speed-ups from existing benchmarks in the literature, synthesis of a
safety supervisory controller for a 12-dimensional and abstraction of a
51-dimensional vehicular traffic network
Improving Strategies via SMT Solving
We consider the problem of computing numerical invariants of programs by
abstract interpretation. Our method eschews two traditional sources of
imprecision: (i) the use of widening operators for enforcing convergence within
a finite number of iterations (ii) the use of merge operations (often, convex
hulls) at the merge points of the control flow graph. It instead computes the
least inductive invariant expressible in the domain at a restricted set of
program points, and analyzes the rest of the code en bloc. We emphasize that we
compute this inductive invariant precisely. For that we extend the strategy
improvement algorithm of [Gawlitza and Seidl, 2007]. If we applied their method
directly, we would have to solve an exponentially sized system of abstract
semantic equations, resulting in memory exhaustion. Instead, we keep the system
implicit and discover strategy improvements using SAT modulo real linear
arithmetic (SMT). For evaluating strategies we use linear programming. Our
algorithm has low polynomial space complexity and performs for contrived
examples in the worst case exponentially many strategy improvement steps; this
is unsurprising, since we show that the associated abstract reachability
problem is Pi-p-2-complete
Robust Model Predictive Control for Signal Temporal Logic Synthesis
Most automated systems operate in uncertain or adversarial conditions, and have to be capable of reliably reacting to changes in the environment. The focus of this paper is on automatically synthesizing reactive controllers for cyber-physical systems subject to signal temporal logic (STL) specifications. We build on recent work that encodes STL specifications as mixed integer linear constraints on the variables of a discrete-time model of the system and environment dynamics. To obtain a reactive controller, we present solutions to the worst-case model predictive control (MPC) problem using a suite of mixed integer linear programming techniques. We demonstrate the comparative effectiveness of several existing worst-case MPC techniques, when applied to the problem of control subject to temporal logic specifications; our empirical results emphasize the need to develop specialized solutions for this domain
- …