On finite-dimensional risk-sensitive estimation
In this paper, we address the finite-dimensionality issues regarding discrete-time risk-sensitive estimation for stochastic
nonlinear systems. We show that for a bilinear system with
an unknown parameter, finite-dimensional risk-sensitive estimates can be
obtained. We also derive a necessary condition for the existence of
finite-dimensional risk-sensitive estimates for nonlinear systems with no
process noise.
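Risk-sensitive estimation of this kind is typically built on an exponential-of-sum cost rather than a plain expectation. A minimal numerical sketch of that criterion (the function name and the toy losses are illustrative, not taken from the paper):

```python
import numpy as np

def risk_sensitive_cost(losses, theta):
    """Exponential criterion (1/theta) * log E[exp(theta * L)].

    For theta > 0, large losses are penalized far more heavily than under
    the plain expectation; as theta -> 0 the criterion recovers the mean.
    """
    losses = np.asarray(losses, dtype=float)
    return np.log(np.mean(np.exp(theta * losses))) / theta

losses = [0.0, 1.0, 10.0]
print(np.mean(losses))                   # risk-neutral cost: ~3.67
print(risk_sensitive_cost(losses, 1.0))  # ~8.9, dominated by the worst loss
```

The exponential weighting is what makes the estimator "risk-sensitive": it trades average accuracy for protection against rare large errors.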
On the Separation of Estimation and Control in Risk-Sensitive Investment Problems under Incomplete Observation
A typical approach to tackle stochastic control problems with partial
observation is to separate the control and estimation tasks. However, it is
well known that this separation generally fails to deliver an actual optimal
solution for risk-sensitive control problems. This paper investigates the
separability of a general class of risk-sensitive investment management
problems when a finite-dimensional filter exists. We show that the
corresponding separated problem, where instead of the unobserved quantities,
one considers their conditional filter distribution given the observations, is
strictly equivalent to the original control problem. We widen the applicability
of the so-called Modified Zakai Equation (MZE) for the study of the separated
problem and prove that the MZE simplifies to a PDE in our approach.
Furthermore, we derive criteria for separability. We do not solve the separated
control problem but note that the existence of a finite-dimensional filter
leads to a finite state space for the separated problem. Hence, the difficulty
is equivalent to solving a complete observation risk-sensitive problem. Our
results have implications for existing risk-sensitive investment management
models with partial observations in that they establish their separability.
For future research on new applications, their main implication is to provide
conditions that ensure separability.
Quantum risk-sensitive estimation and robustness
This paper studies a quantum risk-sensitive estimation problem and
investigates robustness properties of the filter. This directly extends
analogous classical results to the quantum case. All investigations are based
on a discrete approximation model of the quantum system under consideration.
This allows us to study the problem in a simple mathematical setting. We close
the paper with some examples that demonstrate the robustness of the
risk-sensitive estimator. Comment: 24 pages
Partially Observed Non-linear Risk-sensitive Optimal Stopping Control for Non-linear Discrete-time Systems
In this paper we introduce and solve the partially observed optimal stopping non-linear risk-sensitive stochastic control problem for discrete-time non-linear systems. The presented results are closely related to previous results for the finite-horizon partially observed risk-sensitive stochastic control problem. An information-state approach is used, and a new (three-way) separation principle is established that leads to a forward dynamic programming equation and a backward dynamic programming inequality equation (both infinite-dimensional). A verification theorem is given that establishes the optimal control and optimal stopping time. The risk-neutral optimal stopping stochastic control problem is also discussed.
Optimal Data Acquisition for Statistical Estimation
We consider a data analyst's problem of purchasing data from strategic agents
to compute an unbiased estimate of a statistic of interest. Agents incur
private costs to reveal their data and the costs can be arbitrarily correlated
with their data. Once revealed, data are verifiable. This paper focuses on
linear unbiased estimators. We design an individually rational and incentive
compatible mechanism that optimizes the worst-case mean-squared error of the
estimation, where the worst-case is over the unknown correlation between costs
and data, subject to a budget constraint in expectation. We characterize the
optimal mechanism in closed form. We further extend our results to
acquiring data for estimating a parameter in regression analysis, where private
costs can correlate with the values of the dependent variable but not with the
values of the independent variables.
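A standard way to see why a linear estimator can remain unbiased no matter how purchase probabilities (driven by private costs) correlate with the data is inverse-probability weighting. A minimal sketch with made-up data, probabilities, and numbers, none of which come from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
data = rng.normal(2.0, 1.0, size=n)   # agents' private data; true mean 2.0
# Purchase probabilities that deliberately correlate with the data values.
p = np.clip(0.25 + 0.15 * np.abs(data), None, 1.0)
bought = rng.random(n) < p            # which agents actually sell their data

# Weighting each purchased point by 1/p keeps the linear estimator unbiased
# regardless of how p correlates with the data, since for every agent
# E[1{bought} * x / p] = x.
estimate = np.sum(np.where(bought, data / p, 0.0)) / n
print(estimate)  # close to the true mean 2.0
```

The paper's contribution is then about choosing the probabilities and payments to control worst-case variance under a budget; the weighting above only illustrates the unbiasedness ingredient.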
Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
The classic objective in a reinforcement learning (RL) problem is to find a
policy that minimizes, in expectation, a long-run objective such as the
infinite-horizon discounted or long-run average cost. In many practical
applications, optimizing the expected value alone is not sufficient, and it may
be necessary to include a risk measure in the optimization process, either as
the objective or as a constraint. Various risk measures have been proposed in
the literature, e.g., mean-variance tradeoff, exponential utility, the
percentile performance, value at risk, conditional value at risk, prospect
theory and its later enhancement, cumulative prospect theory. In this article,
we focus on the combination of risk criteria and reinforcement learning in a
constrained optimization framework, i.e., a setting where the goal is to find
a policy that optimizes the usual objective of infinite-horizon
discounted/average cost, while ensuring that an explicit risk constraint is
satisfied. We introduce the risk-constrained RL framework, cover popular risk
measures based on variance, conditional value-at-risk and cumulative prospect
theory, and present a template for a risk-sensitive RL algorithm. We survey
some of our recent work on this topic, covering problems encompassing
discounted cost, average cost, and stochastic shortest path settings, together
with the aforementioned risk measures in a constrained framework. This
non-exhaustive survey is aimed at giving a flavor of the challenges involved in
solving a risk-sensitive RL problem, and outlining some potential future
research directions.
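The constrained formulation described above is commonly handled by a Lagrangian relaxation: fold the risk constraint into the objective with a multiplier and update the multiplier by dual ascent. A minimal sketch of that template, where all names and the toy numbers are illustrative stand-ins rather than any specific algorithm from the survey:

```python
def lagrangian(expected_cost, risk_value, risk_budget, lam):
    """Relaxed objective: expected cost plus a penalty for risk above budget."""
    return expected_cost + lam * (risk_value - risk_budget)

def dual_ascent_step(lam, risk_value, risk_budget, step=0.01):
    """Raise the multiplier while the constraint risk <= budget is violated;
    project back to lam >= 0 once it is satisfied."""
    return max(0.0, lam + step * (risk_value - risk_budget))

# One iteration of the template. In an actual risk-constrained RL algorithm,
# policy evaluation would supply expected_cost and risk_value (e.g. a CVaR
# estimate of the return distribution); here they are fixed stand-ins.
lam = 0.5
obj = lagrangian(expected_cost=1.0, risk_value=0.8, risk_budget=0.5, lam=lam)
lam = dual_ascent_step(lam, risk_value=0.8, risk_budget=0.5)
```

Alternating policy updates on the relaxed objective with multiplier updates of this form is the common scaffold behind the variance-, CVaR-, and CPT-constrained algorithms the survey covers.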
Optimizing the CVaR via Sampling
Conditional Value at Risk (CVaR) is a prominent risk measure that is being
used extensively in various domains. We develop a new formula for the gradient
of the CVaR in the form of a conditional expectation. Based on this formula, we
propose a novel sampling-based estimator for the CVaR gradient, in the spirit
of the likelihood-ratio method. We analyze the bias of the estimator, and prove
the convergence of a corresponding stochastic gradient descent algorithm to a
local CVaR optimum. Our method makes it possible to consider CVaR optimization
in new domains. As an example, we consider a reinforcement learning
application and learn a risk-sensitive controller for the game of Tetris.
Comment: To appear in AAAI 201
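The two ingredients in this abstract, an empirical CVaR and a likelihood-ratio-style gradient estimate, can be illustrated numerically. The following is a toy sketch under a Gaussian loss model; the function names and the exact weighting are illustrative and should not be read as the paper's estimator:

```python
import numpy as np

def empirical_cvar(losses, alpha):
    """Mean of the worst (1 - alpha) fraction of losses (empirical CVaR)."""
    var = np.quantile(losses, alpha)   # empirical Value at Risk
    return losses[losses >= var].mean()

def lr_cvar_gradient(theta, alpha, n=100_000, rng=None):
    """Toy likelihood-ratio-style CVaR gradient for losses X ~ N(theta, 1).

    Averages the score d/dtheta log p_theta(x) = x - theta, weighted by the
    excess loss over VaR, across tail samples. For this location model the
    true derivative d CVaR / d theta equals 1 exactly.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.normal(theta, 1.0, size=n)
    var = np.quantile(x, alpha)
    tail = x[x >= var]
    return np.mean((tail - theta) * (tail - var))

rng = np.random.default_rng(0)
print(empirical_cvar(rng.normal(size=100_000), 0.95))  # ~2.06 for N(0, 1)
print(lr_cvar_gradient(0.0, 0.95))                     # ~1.0
```

Plugging such a gradient estimate into stochastic gradient descent on the sampling distribution's parameters is the overall shape of the approach the abstract describes.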