Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
The classic objective in a reinforcement learning (RL) problem is to find a
policy that minimizes, in expectation, a long-run objective such as the
infinite-horizon discounted or long-run average cost. In many practical
applications, optimizing the expected value alone is not sufficient, and it may
be necessary to include a risk measure in the optimization process, either as
the objective or as a constraint. Various risk measures have been proposed in
the literature, e.g., mean-variance tradeoff, exponential utility, the
percentile performance, value at risk, conditional value at risk, prospect
theory and its later enhancement, cumulative prospect theory. In this article,
we focus on the combination of risk criteria and reinforcement learning in a
constrained optimization framework, i.e., a setting where the goal is to find a
policy that optimizes the usual objective of infinite-horizon
discounted/average cost, while ensuring that an explicit risk constraint is
satisfied. We introduce the risk-constrained RL framework, cover popular risk
measures based on variance, conditional value-at-risk and cumulative prospect
theory, and present a template for a risk-sensitive RL algorithm. We survey
some of our recent work on this topic, covering problems encompassing
discounted cost, average cost, and stochastic shortest path settings, together
with the aforementioned risk measures in a constrained framework. This
non-exhaustive survey is aimed at giving a flavor of the challenges involved in
solving a risk-sensitive RL problem, and outlining some potential future
research directions.
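
As a rough illustration of one risk measure named in the abstract above, the sketch below estimates value at risk (VaR) and conditional value at risk (CVaR) from sampled costs, then forms a Lagrangian that penalizes violation of a CVaR bound. The function names, the sample-based estimators, and the Lagrangian relaxation are assumptions made for illustration; the abstract does not specify how the constraint is handled.

```python
import math


def value_at_risk(costs, alpha):
    """Empirical VaR: smallest sample c with at least a fraction alpha of samples <= c."""
    ordered = sorted(costs)
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def conditional_value_at_risk(costs, alpha):
    """Empirical CVaR via the Rockafellar-Uryasev form: VaR + E[(cost - VaR)^+] / (1 - alpha)."""
    var = value_at_risk(costs, alpha)
    mean_excess = sum(max(c - var, 0.0) for c in costs) / len(costs)
    return var + mean_excess / (1.0 - alpha)


def lagrangian(costs, alpha, risk_bound, multiplier):
    """Hypothetical relaxation: expected cost + multiplier * (CVaR - risk_bound)."""
    mean_cost = sum(costs) / len(costs)
    return mean_cost + multiplier * (conditional_value_at_risk(costs, alpha) - risk_bound)


if __name__ == "__main__":
    sampled_costs = [float(c) for c in range(1, 11)]  # made-up cost samples
    print(conditional_value_at_risk(sampled_costs, 0.5))
    print(lagrangian(sampled_costs, 0.5, risk_bound=7.0, multiplier=2.0))
```

In a risk-constrained setting of this kind, a policy update would typically lower the Lagrangian while the multiplier is raised whenever the estimated CVaR exceeds the bound; that two-sided scheme is a common choice, not a claim about the surveyed algorithms.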
Second order adjoints for solving PDE-constrained optimization problems
Inverse problems are of utmost importance in many fields of science and engineering. In the
variational approach, inverse problems are formulated as PDE-constrained optimization problems,
where the optimal estimate of the uncertain parameters is the minimizer of a certain cost
functional subject to the constraints posed by the model equations. The numerical solution
of such optimization problems requires the computation of derivatives of the model output
with respect to model parameters. The first order derivatives of a cost functional (defined
on the model output) with respect to a large number of model parameters can be calculated
efficiently through first order adjoint sensitivity analysis. Second order adjoint models
give second derivative information in the form of matrix-vector products between the Hessian
of the cost functional and user defined vectors. Traditionally, the construction of second
order derivatives for large scale models has been considered too costly. Consequently, data
assimilation applications employ optimization algorithms that use only first order derivative
information, like nonlinear conjugate gradients and quasi-Newton methods.
In this paper we discuss the mathematical foundations of second order adjoint sensitivity
analysis and show that it provides an efficient approach to obtain Hessian-vector products. We
study the benefits of using second order information in the numerical optimization process
for data assimilation applications. The numerical studies are performed in a twin experiment
setting with a two-dimensional shallow water model. Different scenarios are considered with
different discretization approaches, observation sets, and noise levels. Optimization algorithms
that employ second order derivatives are tested against widely used methods that require
only first order derivatives. Conclusions are drawn regarding the potential benefits and the
limitations of using high-order information in large scale data assimilation problems.
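
To make the idea of a Hessian-vector product concrete, the sketch below uses a toy linear model with cost J(x) = 0.5 * ||A x - b||^2, which is an assumption and not the shallow water model of the abstract above. For this cost the product H v = A^T (A v) needs only one forward and one transposed matrix-vector product, so the Hessian is never formed. A finite difference of gradients is included as a cross-check. All function names are hypothetical.

```python
def matvec(A, x):
    """Forward product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Transposed (adjoint) product A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def gradient(A, b, x):
    """Gradient of J(x) = 0.5 * ||A x - b||^2, namely A^T (A x - b)."""
    residual = [r - bi for r, bi in zip(matvec(A, x), b)]
    return rmatvec(A, residual)


def hessian_vector_product(A, v):
    """Exact H v = A^T (A v) without building H = A^T A."""
    return rmatvec(A, matvec(A, v))


def finite_difference_hvp(A, b, x, v, eps=1e-6):
    """Approximate H v as (grad J(x + eps v) - grad J(x)) / eps."""
    shifted = [xi + eps * vi for xi, vi in zip(x, v)]
    g_shifted, g_base = gradient(A, b, shifted), gradient(A, b, x)
    return [(gs - gb) / eps for gs, gb in zip(g_shifted, g_base)]


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]
    b = [1.0, 0.0, -1.0]
    print(hessian_vector_product(A, [1, -1]))
    print(finite_difference_hvp(A, b, [0.5, 0.5], [1, -1]))
```

For a nonlinear model the forward product would be replaced by a tangent linear run and the transposed product by an adjoint run, which is the structure the toy example mimics.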
Radio Astronomical Image Formation using Constrained Least Squares and Krylov Subspaces
Image formation for radio astronomy can be defined as estimating the spatial
power distribution of celestial sources over the sky, given an array of
antennas. One of the challenges with image formation is that the problem
becomes ill-posed as the number of pixels becomes large. The introduction of
constraints that incorporate a-priori knowledge is crucial. In this paper we
show that in addition to non-negativity, the magnitude of each pixel in an
image is also bounded from above. Indeed, the classical "dirty image" is an
upper bound, but a much tighter upper bound can be formed from the data using
array processing techniques. This formulates image formation as a least squares
optimization problem with inequality constraints. We propose to solve this
constrained least squares problem using active set techniques and describe the steps
needed to implement them. It is shown that the least squares part
of the problem can be efficiently implemented with Krylov subspace based
techniques, where the structure of the problem allows massive parallelism and
reduced storage needs. The performance of the algorithm is evaluated using
simulations.
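
The least squares problem with non-negativity and upper-bound constraints described in the abstract above can be illustrated on a small dense example. The sketch below uses projected gradient descent, which is a simpler stand-in chosen for brevity and is neither the active set method nor the Krylov subspace solver the abstract proposes. The function name, step-size rule, and test data are assumptions.

```python
def solve_bounded_least_squares(A, b, upper, iterations=500):
    """Minimize 0.5 * ||A x - b||^2 subject to 0 <= x_j <= upper_j by projected gradient."""
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]
        grad = [sum(A[i][j] * residual[i] for i in range(rows)) for j in range(cols)]
        # Gradient step followed by projection onto the box [0, upper].
        x = [min(max(xj - step * gj, 0.0), uj) for xj, gj, uj in zip(x, grad, upper)]
    return x


if __name__ == "__main__":
    # Made-up data: the unconstrained solution (2, -1) violates both bounds.
    print(solve_bounded_least_squares([[1.0, 0.0], [0.0, 1.0]], [2.0, -1.0], [1.0, 1.0]))
```

In the imaging setting, the entries of `upper` would play the role of the per-pixel upper bound, and the two matrix-vector products in the loop are the operations a large-scale solver would parallelize.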