Semantic verification of dynamic programming
We prove that the generic framework for specifying and solving
finite-horizon, monadic sequential decision problems proposed in (Botta et
al., 2017) is semantically correct. By semantically correct we mean that, for a
problem specification $P$ and for any initial state $x$ compatible with $P$,
the verified optimal policies obtained with the framework maximize the
$P$-measure of the $P$-sums of the $P$-rewards along all the possible
trajectories rooted in $x$. In short, we prove that, given $P$, the verified
computations encoded in the framework are the correct computations to do. The
main theorem is formulated as an equivalence between two value functions: the
first lies at the core of dynamic programming as originally formulated in
(Bellman, 1957) and formalized by Botta et al. in Idris (Brady, 2017), and the
second is a specification. The equivalence requires the two value functions to
be extensionally equal. Further, we identify and discuss three requirements
that measures of uncertainty have to fulfill for the main theorem to hold.
These turn out to be rather natural conditions that the expected-value measure
of stochastic uncertainty fulfills. The formal proof of the main theorem
crucially relies on a principle of preservation of extensional equality for
functors. We formulate and prove the semantic correctness of dynamic
programming as an extension of the Botta et al. Idris framework. However, the
theory can easily be implemented in Coq or Agda.
Comment: Manuscript ID: JFP-2020-003
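The equivalence of the two value functions can be illustrated on a toy problem. The following is a minimal Python sketch, not the authors' Idris formalization: the states, controls, transition function `next_states` and `reward` are hypothetical, and the measure of uncertainty is assumed to be the expected value. `val_bellman` follows the Bellman-style recursion, while `val_spec` measures the sums of rewards along all trajectories rooted in the initial state.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy problem (not from the paper): three states, two controls.
STATES = (0, 1, 2)
CONTROLS = ("stay", "move")


def next_states(x, y):
    """Assumed transition function: a list of (probability, next state) pairs."""
    if y == "stay":
        return [(Fraction(1), x)]
    return [(Fraction(1, 2), x), (Fraction(1, 2), (x + 1) % 3)]


def reward(x, y, x_next):
    """Assumed reward: value of the state reached, minus a cost for moving."""
    cost = Fraction(1, 4) if y == "move" else Fraction(0)
    return Fraction(x_next) - cost


def expected_value(dist):
    """Measure used in this sketch: expected value of (probability, value) pairs."""
    return sum((p * v for p, v in dist), Fraction(0))


def val_bellman(policies, x):
    """Value of a policy sequence, computed by recursion on the sequence."""
    if not policies:
        return Fraction(0)
    y = policies[0][x]
    return expected_value(
        [(p, reward(x, y, x1) + val_bellman(policies[1:], x1))
         for p, x1 in next_states(x, y)]
    )


def trajectory_sums(policies, x):
    """(probability, sum of rewards) for every trajectory rooted in x."""
    if not policies:
        return [(Fraction(1), Fraction(0))]
    y = policies[0][x]
    return [
        (p * p_tail, reward(x, y, x1) + total)
        for p, x1 in next_states(x, y)
        for p_tail, total in trajectory_sums(policies[1:], x1)
    ]


def val_spec(policies, x):
    """Specification: measure of the reward sums over all trajectories."""
    return expected_value(trajectory_sums(policies, x))


def values_agree(horizon):
    """Check extensional equality of the two value functions on the toy problem."""
    all_policies = [dict(zip(STATES, ys))
                    for ys in product(CONTROLS, repeat=len(STATES))]
    return all(
        val_bellman(list(seq), x) == val_spec(list(seq), x)
        for seq in product(all_policies, repeat=horizon)
        for x in STATES
    )
```

Calling `values_agree(2)` compares the two functions on every policy sequence of length 2 and every initial state. This is an exhaustive test on one finite instance, not a proof; the paper's result is a machine-checked theorem for the generic framework.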
Sequential decision problems, dependent types and generic solutions
We present a computer-checked generic implementation for solving finite horizon sequential decision problems. This is a wide class of problems, including intertemporal optimizations, knapsack, optimal bracketing, scheduling, etc. The implementation can handle time-step dependent control and state spaces, and monadic representations of uncertainty (such as stochastic, non-deterministic, fuzzy, or combinations thereof). This level of genericity is achievable in a programming language with dependent types (we have used both Idris and Agda). Dependent types are also the means that allow us to obtain a formalization and computer-checked proof of the central component of our implementation: Bellman’s principle of optimality and the associated backwards induction algorithm. The formalization clarifies certain aspects of backwards induction and, by making explicit notions such as viability and reachability, can serve as a starting point for a theory of controllability of monadic dynamical systems, commonly encountered in, e.g., climate impact research.
Extensional equality preservation and verified generic programming
In verified generic programming, one cannot exploit the structure of concrete
data types but has to rely on well chosen sets of specifications or abstract
data types (ADTs). Functors and monads are at the core of many applications of
functional programming. This raises the question of what useful ADTs for
verified functors and monads could look like. The functorial map of many
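important monads preserves extensional equality. For instance, if $f, g : A \rightarrow B$ are extensionally equal, that is, $\forall x \in A,\ f\ x = g\ x$, then $map\ f : List\ A \rightarrow List\ B$ and $map\ g$ are also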
extensionally equal. This suggests that preservation of extensional equality
could be a useful principle in verified generic programming. We explore this
possibility with a minimalist approach: we deal with (the lack of) extensional
equality in Martin-Löf's intensional type theories without extending the
theories or using full-fledged setoids. Perhaps surprisingly, this minimal
approach turns out to be extremely useful. It allows one to derive simple
generic proofs of monadic laws but also verified, generic results in dynamical
systems and control theory. In turn, these results avoid tedious code
duplication and ad-hoc proofs. Thus, our work is a contribution towards
pragmatic, verified generic programming.
Comment: Manuscript ID: JFP-2020-003
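The list example from the abstract can be made concrete. The sketch below is Python rather than the type-theoretic setting of the paper, and it can only test the property on finitely many inputs instead of proving it. The functions `f` and `g`, the finite `DOMAIN` and all helper names are hypothetical choices for illustration.

```python
from itertools import product

# Hypothetical finite domain on which the two functions are compared.
DOMAIN = range(-3, 4)


def f(x):
    """First definition: square of a sum."""
    return (x + 1) ** 2


def g(x):
    """Second definition: different code, same values as f."""
    return x * x + 2 * x + 1


def ext_equal(h, k, inputs):
    """Extensional equality restricted to the given inputs."""
    return all(h(x) == k(x) for x in inputs)


def list_map(h):
    """Functorial map for lists: lift h to a function on lists."""
    return lambda xs: [h(x) for x in xs]


def lists_up_to(n):
    """All lists over DOMAIN with at most n elements."""
    for length in range(n + 1):
        for xs in product(DOMAIN, repeat=length):
            yield list(xs)


# f and g agree pointwise, and so do their lifted versions on all sampled lists.
pointwise = ext_equal(f, g, DOMAIN)
lifted = ext_equal(list_map(f), list_map(g), lists_up_to(3))
```

Here `f` and `g` are different programs with the same values, which is the situation in which intensional equality is unavailable but extensional equality holds. Both `pointwise` and `lifted` evaluate to `True` on this finite sample.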
On the correctness of monadic backward induction
In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman's backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as a solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper, we define a notion of correctness for monadic SDPs and identify three conditions that allow us to prove a correctness result for monadic backward induction that is comparable to textbook correctness proofs for ordinary backward induction. The conditions that we impose are fairly general and can be cast in category-theoretical terms using the notion of Eilenberg-Moore algebra. They hold in familiar settings like those of deterministic or stochastic SDPs, but we also give examples in which they fail. Our results show that backward induction can safely be employed for a broader class of SDPs than usually treated in textbooks. However, they also rule out certain instances that were considered admissible in the context of Botta et al.'s generic framework. Our development is formalised in Idris as an extension of the Botta et al. framework and the sources are available as supplementary material.
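For the familiar stochastic case, the correctness claim can be checked by brute force on a small instance. The following Python sketch is an illustration under stated assumptions, not the paper's Idris development: the toy problem (`STATES`, `CONTROLS`, `next_states`, `reward`) is hypothetical, uncertainty is a finite probability distribution, and the measure is the expected value. Backward induction builds the policy sequence from the last step to the first, and the result is compared with the best value found by enumerating every policy sequence.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy SDP (not from the paper): three states, two controls.
STATES = (0, 1, 2)
CONTROLS = ("stay", "move")


def next_states(x, y):
    """Assumed stochastic transition: list of (probability, next state)."""
    if y == "stay":
        return [(Fraction(1), x)]
    return [(Fraction(1, 2), x), (Fraction(1, 2), (x + 1) % 3)]


def reward(x, y, x_next):
    """Assumed reward: value of the state reached, minus a cost for moving."""
    cost = Fraction(1, 4) if y == "move" else Fraction(0)
    return Fraction(x_next) - cost


def measure(dist):
    """Measure assumed here: expected value of (probability, value) pairs."""
    return sum((p * v for p, v in dist), Fraction(0))


def q(x, y, tail):
    """Value of taking control y in state x, then following the policies in tail."""
    return measure(
        [(p, reward(x, y, x1) + val(tail, x1)) for p, x1 in next_states(x, y)]
    )


def val(policies, x):
    """Value of a policy sequence in state x."""
    if not policies:
        return Fraction(0)
    return q(x, policies[0][x], policies[1:])


def backward_induction(horizon):
    """Build a policy sequence step by step, starting from the last decision."""
    policies = []
    for _ in range(horizon):
        best = {x: max(CONTROLS, key=lambda y: q(x, y, policies)) for x in STATES}
        policies = [best] + policies
    return policies


def brute_force_value(horizon, x):
    """Best value in state x over all policy sequences of the given length."""
    all_policies = [dict(zip(STATES, ys))
                    for ys in product(CONTROLS, repeat=len(STATES))]
    return max(val(list(seq), x) for seq in product(all_policies, repeat=horizon))


def backward_induction_is_optimal(horizon):
    """Compare backward induction with exhaustive search, for every state."""
    policies = backward_induction(horizon)
    return all(val(policies, x) == brute_force_value(horizon, x) for x in STATES)
```

With the expected-value measure, `backward_induction_is_optimal(3)` holds on this instance. Swapping `measure` for another aggregation function is a way to experiment with the kind of conditions the abstract refers to, although this sketch does not reproduce the paper's counterexamples.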
Contributions to a computational theory of policy advice and avoidability
We present the starting elements of a mathematical theory of policy advice and avoidability. More specifically, we formalize a cluster of notions related to policy advice, such as policy, viability, and reachability, and propose a novel approach for assisting decision making, based on the concept of avoidability. We formalize avoidability as a relation between current and future states, investigate under which conditions this relation is decidable and propose a generic procedure for assessing avoidability. The formalization is constructive and makes extensive use of the correspondence between dependent types and logical propositions; decidable judgments are obtained through computations. Thus, we aim for a computational theory, and emphasize the role that computer science can play in global system science.