Configuration Path Control
Reinforcement learning methods often produce brittle policies -- policies
that perform well during training, but generalize poorly beyond their direct
training experience, thus becoming unstable under small disturbances. To
address this issue, we propose a method for stabilizing a control policy in the
space of configuration paths. It is applied post-training and relies purely on
the data produced during training, as well as on an instantaneous
control-matrix estimation. The approach is evaluated empirically on a planar
bipedal walker subjected to a variety of perturbations. The control policies
obtained via reinforcement learning are compared against their stabilized
counterparts. Across different experiments, we find a two- to four-fold increase
in stability, measured in terms of the perturbation amplitudes. We also
provide a zero-dynamics interpretation of our approach. Comment: 12 pages, 3 figures, accepted for publicatio
Symmetrizing quantum dynamics beyond gossip-type algorithms
Recently, consensus-type problems have been formulated in the quantum domain.
Obtaining average quantum consensus consists in the dynamical symmetrization of
a multipartite quantum system while preserving the expectation of a given
global observable. In this paper, two improved ways of obtaining consensus via
dissipative engineering are introduced, which rely on quasi-local preparation
of mixtures of symmetric pure states and show better performance in terms of
purity dynamics than existing algorithms. In addition, the first
method can be used in combination with simple control resources in order to
engineer pure Dicke states, while the second method guarantees a stronger type
of consensus, namely single-measurement consensus. This implies that outcomes
of local measurements on different subsystems are perfectly correlated when
consensus is achieved. Both dynamics can be randomized and are suitable for
feedback implementation. Comment: 11 pages, 3 figure
Model Checking Finite-Horizon Markov Chains with Probabilistic Inference
We revisit the symbolic verification of Markov chains with respect to finite
horizon reachability properties. The prevalent approach iteratively computes
step-bounded state reachability probabilities. By contrast, recent advances in
probabilistic inference suggest symbolically representing all horizon-length
paths through the Markov chain. We ask whether this perspective advances the
state-of-the-art in probabilistic model checking. First, we formally describe
both approaches in order to highlight their key differences. Then, using these
insights we develop Rubicon, a tool that transpiles Prism models to the
probabilistic inference tool Dice. Finally, we demonstrate better scalability
compared to probabilistic model checkers on selected benchmarks. Altogether,
our results suggest that probabilistic inference is a valuable addition to the
probabilistic model checking portfolio -- with Rubicon as a first step towards
integrating both perspectives. Comment: Technical Report. Accepted at CAV 202
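As a rough illustration of the iterative approach the abstract calls prevalent, the sketch below computes step-bounded reachability probabilities by repeated matrix-vector products on an explicit transition matrix. It is not taken from Rubicon, Prism, or Dice; the function name `bounded_reachability` and the three-state chain are hypothetical and chosen only for this example.

```python
def bounded_reachability(P, targets, horizon):
    """Probability, per state, of reaching `targets` within `horizon` steps.

    P is a row-stochastic transition matrix given as a list of lists;
    `targets` is a set of state indices. (Hypothetical helper, for illustration.)
    """
    n = len(P)
    # With zero steps left, only the target states have already reached the target.
    x = [1.0 if s in targets else 0.0 for s in range(n)]
    for _ in range(horizon):
        # One more step: target states stay at 1, other states average over successors.
        x = [
            1.0 if s in targets else sum(P[s][t] * x[t] for t in range(n))
            for s in range(n)
        ]
    return x


# Hypothetical 3-state chain: state 0 is the start, state 2 is the target.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
probs = bounded_reachability(P, {2}, 3)
print(probs[0])  # probability of reaching state 2 from state 0 within 3 steps
```

Each iteration extends the step bound by one, so the cost grows with the horizon and the number of states. The path-based view described in the abstract instead represents all horizon-length paths symbolically.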
Constrained Stabilization of Discrete-Time Systems
Based on the growth rate of the set of states reachable with unit-energy inputs, we show that a discrete-time controllable linear system is globally controllable to the origin with constrained inputs if and only if all its eigenvalues lie in the closed unit disk. These results imply that the constrained Infinite-Horizon Model Predictive Control algorithm is globally stabilizing for a sufficiently large number of control moves if and only if the controlled system is controllable and all its eigenvalues lie in the closed unit disk.
In the second part of the paper, we propose an implementable Model Predictive Control algorithm and show that with this scheme a discrete-time linear system with n poles on the unit disk (with any multiplicity) can be globally stabilized if the number of control moves is larger than n. For pure integrator systems, this condition is also necessary. Moreover, we show that global asymptotic stability is preserved for any asymptotically constant disturbance entering at the plant input.
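The condition stated in the abstract (controllability, with all eigenvalues in the closed unit disk) can be checked numerically for a given pair (A, B). The following is a minimal sketch assuming NumPy is available; the helper name `satisfies_condition`, the tolerance, and the example matrices are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def satisfies_condition(A, B, tol=1e-6):
    """Check controllability of (A, B) and that every eigenvalue of A has modulus <= 1.

    Hypothetical helper; `tol` absorbs floating-point error on the unit circle.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n = A.shape[0]
    # Kalman controllability matrix [B, AB, ..., A^(n-1) B].
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    controllable = np.linalg.matrix_rank(np.hstack(blocks)) == n
    in_closed_disk = np.all(np.abs(np.linalg.eigvals(A)) <= 1.0 + tol)
    return bool(controllable) and bool(in_closed_disk)


# Discrete-time double integrator: both eigenvalues equal 1, and the pair is controllable.
double_integrator = satisfies_condition([[1.0, 1.0], [0.0, 1.0]], [[0.0], [1.0]])

# Controllable, but one eigenvalue (1.1) lies outside the closed unit disk.
unstable_mode = satisfies_condition([[1.1, 0.0], [0.0, 0.5]], [[1.0], [1.0]])

print(double_integrator, unstable_mode)
```

This only tests the algebraic condition; it does not implement the Model Predictive Control scheme itself or choose the number of control moves.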