Risk-Constrained Control of Mean-Field Linear Quadratic Systems
The risk-neutral LQR controller is optimal for stochastic linear dynamical
systems. However, the classical optimal controller performs inefficiently in
the presence of low-probability yet statistically significant (risky) events.
The present research focuses on infinite-horizon risk-constrained linear
quadratic regulators in a mean-field setting. We address the risk constraint by
bounding the cumulative one-stage variance of the state penalty of all players.
It is shown that the optimal controller is affine in the state of each player
with an additive term that controls the risk constraint. In addition, we
propose a solution independent of the number of players. Finally, simulations
are presented to verify the theoretical findings.
Comment: Accepted at the 62nd IEEE Conference on Decision and Control.
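As a rough illustration of the quantity being constrained (not the paper's construction), the cumulative one-stage variance of the state penalty can be estimated by Monte Carlo for a single player, using a hypothetical scalar system and an arbitrary affine policy:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar system x_{t+1} = a*x_t + b*u_t + w_t, w_t ~ N(0, s^2);
# all parameters are illustrative, not taken from the paper.
a, b, s, q, T, n_mc = 0.9, 1.0, 0.5, 1.0, 50, 4000

def penalties(K, ell):
    """Roll out the affine policy u_t = -K*x_t + ell and return the
    per-stage state penalties q*x_t^2 along one trajectory."""
    x, pen = 1.0, []
    for _ in range(T):
        u = -K * x + ell
        x = a * x + b * u + s * rng.standard_normal()
        pen.append(q * x * x)
    return np.array(pen)

# Monte Carlo estimate of the cumulative one-stage variance of the state
# penalty, sum_t Var(q*x_t^2), which the risk constraint bounds.
pens = np.stack([penalties(K=0.5, ell=0.0) for _ in range(n_mc)])
cum_var = float(pens.var(axis=0).sum())
print(f"estimated cumulative penalty variance: {cum_var:.2f}")
```

Changing `K` or the additive term `ell` changes this estimate, which is the quantity an affine risk-constrained policy must keep under the prescribed bound.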
Decentralized Stochastic Linear-Quadratic Optimal Control with Risk Constraint and Partial Observation
This paper addresses a risk-constrained decentralized stochastic
linear-quadratic optimal control problem with one remote controller and one
local controller, where the risk constraint is posed on the cumulative state
weighted variance in order to reduce the oscillation of the system trajectory. In
this model, the local controller can only partially observe the system state and
sends its state estimate to the remote controller through an unreliable channel,
whereas the channel from the remote controller to the local controller is perfect. For
the considered constrained optimization problem, we first penalize the risk
constraint in the cost function through the Lagrange multiplier method, and the
resulting augmented cost function includes a quadratic mean-field term of the
state. Then, for any fixed multiplier, explicit solutions to the
finite-horizon and infinite-horizon mean-field decentralized linear-quadratic
problems are derived, together with a necessary and sufficient condition for
the mean-square stability of the optimal system. An approach to finding the optimal
Lagrange multiplier is then presented based on the bisection method. Finally, two
numerical examples are given to show the efficiency of the obtained results.
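The multiplier search described above can be sketched generically: assuming the attained risk is nonincreasing in the multiplier (a standard property of Lagrangian relaxations), bisection finds the smallest multiplier whose induced policy meets the budget. The `risk_of` stand-in below is a toy function, not the paper's model:

```python
def bisect_multiplier(risk_of, budget, lo=0.0, hi=100.0, tol=1e-6):
    """Find the smallest multiplier lam with risk_of(lam) <= budget,
    assuming risk_of is nonincreasing in lam."""
    if risk_of(lo) <= budget:
        return lo  # constraint already inactive
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if risk_of(mid) <= budget:
            hi = mid
        else:
            lo = mid
    return hi

# Toy stand-in: attained risk decreases as the multiplier grows.
lam = bisect_multiplier(lambda l: 10.0 / (1.0 + l), budget=2.0)
print(round(lam, 3))  # ≈ 4.0, since 10/(1+lam) = 2 at lam = 4
```

In the paper's setting, evaluating `risk_of` at a candidate multiplier corresponds to solving the unconstrained mean-field problem for that fixed multiplier and measuring the resulting cumulative state-weighted variance.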
Global Convergence of Policy Gradient Primal-dual Methods for Risk-constrained LQRs
While the techniques in optimal control theory are often model-based, the
policy optimization (PO) approach can directly optimize the performance metric
of interest without explicit dynamical models, and is an essential approach for
reinforcement learning problems. However, it usually leads to a non-convex
optimization problem, for which there is little theoretical
understanding of performance. In this paper, we focus on the
risk-constrained Linear Quadratic Regulator (LQR) problem with noisy input via
the PO approach, which results in a challenging non-convex problem. To this
end, we first build on our earlier result that the optimal policy has an affine
structure to show that the associated Lagrangian function is locally gradient
dominated with respect to the policy, based on which we establish strong
duality. Then, we design policy gradient primal-dual methods with global
convergence guarantees to find an optimal policy-multiplier pair in both
model-based and sample-based settings. Finally, we use samples of system
trajectories in simulations to validate our policy gradient primal-dual
methods.
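A minimal sketch of the primal-dual idea, on a toy scalar problem rather than the LQR Lagrangian itself: gradient descent on the policy parameter, projected gradient ascent on the multiplier:

```python
def pg_primal_dual(grad_f, grad_g, g, theta0, lam0=0.0,
                   eta_p=0.05, eta_d=0.05, iters=5000):
    """Primal-dual gradient method on L(theta, lam) = f(theta) + lam*g(theta):
    descend in theta, ascend in lam with projection onto lam >= 0."""
    theta, lam = theta0, lam0
    for _ in range(iters):
        theta = theta - eta_p * (grad_f(theta) + lam * grad_g(theta))
        lam = max(0.0, lam + eta_d * g(theta))
    return theta, lam

# Toy constrained problem: min theta^2 subject to 1 - theta <= 0.
theta, lam = pg_primal_dual(
    grad_f=lambda t: 2.0 * t,   # f(theta) = theta^2
    grad_g=lambda t: -1.0,      # g(theta) = 1 - theta
    g=lambda t: 1.0 - t,
    theta0=0.0,
)
print(theta, lam)  # KKT optimum at theta = 1, lam = 2
```

In the model-based setting of the paper, `grad_f` and `grad_g` would be exact policy gradients of the LQR cost and risk; in the sample-based setting they would be replaced by trajectory-based estimates.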
Risk-Aware Stability of Discrete-Time Systems
We develop a generalized stability framework for stochastic discrete-time
systems, where the generality pertains to the ways in which the distribution of
the state energy can be characterized. We use tools from finance and operations
research called risk functionals (i.e., risk measures) to facilitate diverse
distributional characterizations. In contrast, classical stochastic stability
notions characterize the state energy on average or in probability, which can
obscure the variability of stochastic system behavior. After drawing
connections between various risk-aware stability concepts for nonlinear
systems, we specialize to linear systems and derive sufficient conditions for
the satisfaction of some risk-aware stability properties. These results pertain
to real-valued coherent risk functionals and a mean-conditional-variance
functional. The results reveal novel noise-to-state stability properties, which
assess disturbances in ways that reflect the chosen measure of risk. We
illustrate the theory through examples about robustness, parameter choices, and
state-feedback controllers.
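As a toy illustration of applying a risk functional to the state energy (using a plain mean-variance functional as a stand-in for the paper's mean-conditional-variance functional, with illustrative system parameters):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stable scalar system x_{t+1} = a*x_t + w_t; the terminal
# state energy x_T^2 is the random quantity of interest. Parameters are
# illustrative, not taken from the paper.
a, T, n_mc = 0.8, 30, 20000
x = np.ones(n_mc)
for _ in range(T):
    x = a * x + 0.3 * rng.standard_normal(n_mc)
energy = x ** 2

# A mean-variance-style functional rho(Z) = E[Z] + c*Var(Z) exposes the
# variability that a plain expectation would hide.
c = 1.0
rho = float(energy.mean() + c * energy.var())
print(f"E[energy] = {energy.mean():.3f}, rho = {rho:.3f}")
```

Requiring `rho` (rather than only the mean) to stay bounded along the horizon is the flavor of risk-aware stability the abstract describes: two systems with equal average state energy can differ sharply in `rho`.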
On Optimizing the Conditional Value-at-Risk of a Maximum Cost for Risk-Averse Safety Analysis
The popularity of Conditional Value-at-Risk (CVaR), a risk functional from
finance, has been growing in the control systems community due to its intuitive
interpretation and axiomatic foundation. We consider a non-standard optimal
control problem in which the goal is to minimize the CVaR of a maximum random
cost subject to a Borel-space Markov decision process. The objective takes the
form CVaR_α( max_{t=0,...,N} c_t(x_t) ), where α is a
risk-aversion parameter representing a fraction of worst cases, c_t is a
stage or terminal cost, and N is the length of a finite
discrete-time horizon. The objective represents the maximum departure from a
desired operating region averaged over the fraction α of worst
cases. This problem provides a safety criterion for a stochastic system that is
informed by both the probability and severity of the potential consequences of
the system's trajectory. In contrast, existing safety analysis frameworks apply
stage-wise risk constraints (i.e., ρ(c_t(x_t)) must be small for all t, where
ρ is a risk functional) or assess the probability of constraint violation
without quantifying its possible severity. To the best of our knowledge, the
problem of interest has not been solved. To solve the problem, we propose and
study a family of stochastic dynamic programs on an augmented state space. We
prove that the optimal CVaR of a maximum cost enjoys an equivalent
representation in terms of the solutions to this family of dynamic programs
under appropriate assumptions. We show the existence of an optimal policy that
depends on the dynamics of an augmented state under a measurable selection
condition. Moreover, we demonstrate how our safety analysis framework is useful
for assessing the severity of combined sewer overflows under precipitation
uncertainty.
Comment: A shorter version is under review for IEEE Transactions on Automatic
Control, submitted December 202
Optimization in Dynamical Systems: Theory and Application
In this dissertation, we study optimization methods in interconnected systems and investigate their applications in robotics, energy harvesting, and mean-field linear quadratic multi-agent systems. We first focus on parallel robots, which have numerous applications in motion simulation systems and high-precision instruments. Specifically, we investigate the forward kinematics (FK) of parallel robots and formulate it as an error minimization problem. Following this formulation, we develop an optimization algorithm to solve the FK problem and provide a theoretical analysis of the convergence of the proposed algorithm. Then, we investigate energy optimization (maximization) in a specific class of micro-energy harvesters (MEHs). These energy harvesters are known to extract the largest amount of power from the kinetic energy of the human body, making them an appropriate choice for wearable technology in healthcare applications. Employing machine learning tools and using existing models of the MEH's kinematics, we propose three methods for energy maximization. Next, we study optimal control in a mean-field linear quadratic system. Mean-field systems have critical applications in approximating the behavior of very-large-scale systems. Specifically, we establish results on the convergence of policy gradient (PG) methods to the optimal solution in a mean-field linear quadratic game. We finally consider the risk-constrained control of agents in a mean-field linear quadratic setting. Simulations validate the theoretical findings and their effectiveness.
Constrained Learning And Inference
Data and learning have become core components of the information processing and autonomous systems on which we increasingly rely to select job applicants, analyze medical data, and drive cars. As these systems become ubiquitous, so does the need to curtail their behavior. Left untethered, they are susceptible to tampering (adversarial examples) and prone to prejudiced and unsafe actions. Currently, the response of these systems is tailored by leveraging domain expert knowledge either to construct models that embed the desired properties or to tune the training objective so as to promote them. While effective, these solutions are often targeted to specific behaviors, contexts, and sometimes even problem instances, and are typically not transferable across models and applications. What is more, the growing scale and complexity of modern information processing and autonomous systems render this manual behavior tuning infeasible. Already today, explainability, interpretability, and transparency combined with human judgment are no longer enough to design systems that perform according to specifications.
The present thesis addresses these issues by leveraging constrained statistical optimization. More specifically, it develops the theoretical underpinnings of constrained learning and constrained inference to provide tools that enable solving statistical problems under requirements. Starting with the task of learning under requirements, it develops a generalization theory of constrained learning akin to the existing unconstrained one. By formalizing the concept of probably approximately correct constrained (PACC) learning, it shows that constrained learning is as hard as unconstrained learning and establishes the constrained counterpart of empirical risk minimization (ERM) as a PACC learner. To overcome challenges involved in solving such non-convex constrained optimization problems, it derives a dual learning rule that enables constrained learning tasks to be tackled through unconstrained learning problems only. It therefore concludes that if we can deal with classical, unconstrained learning tasks, then we can deal with learning tasks with requirements.
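The dual learning rule can be sketched on a hypothetical fair-regression toy: minimize the loss on group A subject to a loss budget on group B, where each inner step is an ordinary unconstrained (weighted) least-squares fit and the multiplier is updated by dual ascent. All data, groups, and the budget are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two groups with conflicting best-fit directions (hypothetical data).
Xa, Xb = rng.normal(size=(200, 2)), rng.normal(size=(200, 2))
ya = Xa @ np.array([1.0, 0.0]) + 0.1 * rng.normal(size=200)
yb = Xb @ np.array([0.0, 1.0]) + 0.1 * rng.normal(size=200)
budget = 0.5  # required: loss on group B <= budget

def loss(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

lam, eta = 0.0, 0.1
for _ in range(200):
    # Inner step: *unconstrained* weighted fit minimizing L_A + lam * L_B.
    A = Xa.T @ Xa / len(ya) + lam * Xb.T @ Xb / len(yb)
    b = Xa.T @ ya / len(ya) + lam * Xb.T @ yb / len(yb)
    w = np.linalg.solve(A, b)
    # Dual ascent on the constraint violation, projected onto lam >= 0.
    lam = max(0.0, lam + eta * (loss(Xb, yb, w) - budget))

print(f"lam = {lam:.2f}, group-B loss = {loss(Xb, yb, w):.3f} (budget {budget})")
```

The point of the rule is visible in the loop: the constrained task is solved purely through a sequence of unconstrained fits, with the multiplier mediating the requirement.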
The second part of this thesis addresses the issue of constrained inference, in particular, the issue of performing inference using sparse nonlinear functional models, combinatorial constraints with quadratic objectives, and risk constraints. Such models arise in nonlinear line spectrum estimation, functional data analysis, sensor selection, actuator scheduling, experimental design, and risk-aware estimation. Although inference problems assume that models and distributions are known, each of these constraints poses serious challenges that hinder their use in practice. Sparse nonlinear functional models lead to infinite-dimensional, non-convex optimization programs that cannot be discretized without leading to combinatorial, often NP-hard, problems. Rather than using surrogates and relaxations, this work relies on duality to show that, despite their apparent complexity, these models can be fit efficiently, i.e., in polynomial time. While quadratic objectives are typically tractable (often even in closed form), they lead to non-submodular optimization problems when subject to cardinality or matroid constraints. While submodular functions are sometimes used as surrogates, this work instead shows that quadratic functions are close to submodular and can also be optimized near-optimally. The last chapter of this thesis is dedicated to problems involving risk constraints, in particular, estimation with bounded predictive mean squared error variance. Despite being non-convex, such problems are equivalent to a quadratically constrained quadratic program from which a closed-form estimator can be extracted.
These results are used throughout this thesis to tackle problems in signal processing, machine learning, and control, such as fair learning, robust learning, nonlinear line spectrum estimation, actuator scheduling, experimental design, and risk-aware estimation. Yet, they are applicable well beyond these illustrations, for example to safe reinforcement learning, sensor selection, multiresolution kernel estimation, and wireless resource allocation, to name a few.