Active Classification for POMDPs: a Kalman-like State Estimator
The problem of state tracking with active observation control is considered
for a system modeled by a discrete-time, finite-state Markov chain observed
through conditionally Gaussian measurement vectors. The measurement model
statistics are shaped by the underlying state and an exogenous control input,
which influence the observations' quality. Exploiting an innovations approach,
an approximate minimum mean-squared error (MMSE) filter is derived to estimate
the Markov chain system state. To optimize the control strategy, the associated
mean-squared error is used as an optimization criterion in a partially
observable Markov decision process formulation. A stochastic dynamic
programming algorithm is proposed to solve for the optimal solution. To enhance
the quality of system state estimates, approximate MMSE smoothing estimators
are also derived. Finally, the performance of the proposed framework is
illustrated on the problem of physical activity detection in wireless body
sensing networks. The power of the proposed framework lies in its ability
to accommodate a broad spectrum of active classification applications,
including sensor management for object classification and tracking, estimation
of sparse signals, and radar scheduling. Comment: 38 pages, 6 figures
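The core recursion described above, a finite-state Markov chain filtered through Gaussian measurements, can be sketched as a discrete Bayes (HMM) filter whose posterior mean gives an MMSE-style estimate. The transition matrix, observation means, and noise levels below are illustrative assumptions, not values from the paper, and the paper's control-dependent measurement model is omitted.

```python
import math

def gauss_pdf(y, mu, sigma):
    """Scalar Gaussian density, used as the observation likelihood."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def hmm_filter_step(pi, y, P, mu, sigma):
    """One predict/update step of the discrete Bayes filter.

    pi    -- current belief over the finite state space
    y     -- new scalar measurement
    P     -- Markov transition matrix, P[i][j] = Pr(next=j | current=i)
    mu,sigma -- per-state Gaussian observation parameters (assumed)
    """
    n = len(pi)
    # Predict: propagate the belief through the Markov transition matrix.
    pred = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    # Update: reweight by the Gaussian likelihood of the observation.
    post = [pred[j] * gauss_pdf(y, mu[j], sigma[j]) for j in range(n)]
    z = sum(post)
    return [p / z for p in post]

def mmse_estimate(pi, values):
    """MMSE estimate of a state functional: the posterior mean."""
    return sum(p * v for p, v in zip(pi, values))
```

For example, with two states emitting around 0.0 and 2.0, an observation of 2.0 shifts the belief toward the second state, and `mmse_estimate` turns the belief into a point estimate.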
Partially Observed Non-linear Risk-sensitive Optimal Stopping Control for Non-linear Discrete-time Systems
In this paper we introduce and solve the partially observed optimal stopping non-linear risk-sensitive stochastic control problem for discrete-time non-linear systems. The results presented are closely related to previous results for the finite-horizon partially observed risk-sensitive stochastic control problem. An information-state approach is used, and a new (three-way) separation principle is established that leads to a forward dynamic programming equation and a backward dynamic programming inequality (both infinite dimensional). A verification theorem is given that establishes the optimal control and the optimal stopping time. The risk-neutral optimal stopping stochastic control problem is also discussed.
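The backward dynamic programming inequality for optimal stopping can be illustrated in a deliberately simplified setting: a fully observed, risk-neutral, finite-state chain rather than the paper's risk-sensitive, infinite-dimensional information-state recursion. At each stage the value is the better of stopping now or paying a running cost and continuing.

```python
def optimal_stopping(P, stop_reward, run_cost, T):
    """Backward DP for finite-horizon optimal stopping on a Markov chain.

    V_T(x) = stop_reward(x)
    V_t(x) = max(stop_reward(x), -run_cost(x) + E[V_{t+1} | X_t = x])

    All model data (P, rewards, costs) are illustrative assumptions.
    Returns the value tables and a stop/continue policy per stage.
    """
    n = len(P)
    V = [list(stop_reward)]  # value at the horizon
    policy = []
    for _ in range(T):
        nxt = V[0]
        cont = [-run_cost[i] + sum(P[i][j] * nxt[j] for j in range(n))
                for i in range(n)]
        Vt = [max(stop_reward[i], cont[i]) for i in range(n)]
        policy.insert(0, ['stop' if stop_reward[i] >= cont[i] else 'continue'
                          for i in range(n)])
        V.insert(0, Vt)
    return V, policy
```

In a two-state example where state 1 pays a stopping reward of 5, the policy correctly continues from state 0 (to reach state 1) and stops in state 1.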
Dynamic Credit Investment in Partially Observed Markets
We consider the problem of maximizing expected utility for a power investor
who can allocate his wealth among a stock, a defaultable security, and a money
market account. The dynamics of these security prices are governed by geometric
Brownian motions modulated by a hidden continuous time finite state Markov
chain. We reduce the partially observed stochastic control problem to a
complete observation risk sensitive control problem via the filtered regime
switching probabilities. We separate the latter into pre-default and
post-default dynamic optimization subproblems, and obtain two coupled
Hamilton-Jacobi-Bellman (HJB) partial differential equations. We prove
existence and uniqueness of a globally bounded classical solution to each HJB
equation, and give the corresponding verification theorem. We provide a
numerical analysis showing that the investor increases his holdings in stock as
the filter probability of being in high growth regimes increases, and decreases
his credit risk exposure when the filter probability of being in high default
risk regimes gets larger.
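The qualitative finding above, that stock holdings grow with the filtered probability of a high-growth regime, can be sketched with a myopic Merton-style allocation under power (CRRA) utility. This drops the hedging correction carried by the paper's HJB solution, and all drift, volatility, and risk-aversion numbers are illustrative assumptions.

```python
def merton_fraction(p_regime, drifts, sigma, gamma):
    """Myopic CRRA allocation given filtered regime probabilities.

    p_regime -- filter probabilities over the hidden regimes (assumed known)
    drifts   -- per-regime excess drift of the stock (illustrative)
    sigma    -- stock volatility, gamma -- relative risk aversion

    mu_hat is the filter-weighted excess drift; the myopic optimal
    fraction of wealth in the stock is mu_hat / (gamma * sigma**2).
    """
    mu_hat = sum(p * m for p, m in zip(p_regime, drifts))
    return mu_hat / (gamma * sigma ** 2)
```

Raising the probability weight on the high-growth regime raises `mu_hat` and hence the stock holding, matching the monotonicity reported in the abstract.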
State-Observation Sampling and the Econometrics of Learning Models
In nonlinear state-space models, sequential learning about the hidden state
can proceed by particle filtering when the density of the observation
conditional on the state is available analytically (e.g. Gordon et al., 1993).
This condition need not hold in complex environments, such as the
incomplete-information equilibrium models considered in financial economics. In
this paper, we make two contributions to the learning literature. First, we
introduce a new filtering method, the state-observation sampling (SOS) filter,
for general state-space models with intractable observation densities. Second,
we develop an indirect inference-based estimator for a large class of
incomplete-information economies. We demonstrate the good performance of these
techniques on an asset pricing model with investor learning applied to over 80
years of daily equity returns.
Robot introspection through learned hidden Markov models
In this paper we describe a machine learning approach for acquiring a model of a robot behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high-quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain the learning process, the experimental setup and the structure of the resulting learned behavioural models. We then evaluate the extent to which explanations proposed by the learned models accord with a human observer's interpretation of the behaviour exhibited by the robot in its execution of the task.
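The "most probable trajectory through a model, given the observations" is the standard Viterbi decoding of an HMM; a minimal sketch follows. The abstract's HMM parameters are learned from robot sensor data, whereas the probabilities in the example are illustrative assumptions.

```python
def viterbi(obs, init_p, trans_p, emit_p):
    """Most probable hidden-state path given a discrete observation sequence.

    init_p  -- initial state distribution
    trans_p -- trans_p[r][s] = Pr(next=s | current=r)
    emit_p  -- emit_p[s][o]  = Pr(observe o | state s)
    All parameters here are illustrative, standing in for the HMM the
    paper learns from recorded robot data.
    """
    n = len(init_p)
    V = [init_p[s] * emit_p[s][obs[0]] for s in range(n)]
    back = []
    for o in obs[1:]:
        ptr, new_V = [], []
        for s in range(n):
            # Best predecessor for state s at this step.
            best = max(range(n), key=lambda r: V[r] * trans_p[r][s])
            ptr.append(best)
            new_V.append(V[best] * trans_p[best][s] * emit_p[s][o])
        V = new_V
        back.append(ptr)
    # Backtrack from the best final state.
    last = max(range(n), key=lambda s: V[s])
    path = [last]
    for ptr in reversed(back):
        last = ptr[last]
        path.insert(0, last)
    return path
```

In the evaluation described above, such a decoded path is what gets compared against a human observer's interpretation of the robot's behaviour.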