13,704 research outputs found

    Active Classification for POMDPs: a Kalman-like State Estimator

    The problem of state tracking with active observation control is considered for a system modeled by a discrete-time, finite-state Markov chain observed through conditionally Gaussian measurement vectors. The measurement model statistics are shaped by the underlying state and an exogenous control input, which together influence the quality of the observations. Exploiting an innovations approach, an approximate minimum mean-squared error (MMSE) filter is derived to estimate the Markov chain system state. To optimize the control strategy, the associated mean-squared error is used as the optimization criterion in a partially observable Markov decision process formulation. A stochastic dynamic programming algorithm is proposed to compute the optimal solution. To enhance the quality of system state estimates, approximate MMSE smoothing estimators are also derived. Finally, the performance of the proposed framework is illustrated on the problem of physical activity detection in wireless body sensing networks. The power of the proposed framework lies in its ability to accommodate a broad spectrum of active classification applications, including sensor management for object classification and tracking, estimation of sparse signals, and radar scheduling. Comment: 38 pages, 6 figures.
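As a rough illustration of the setting described in this abstract, the sketch below runs one predict/update step of an exact Bayes filter for a finite-state Markov chain with scalar Gaussian measurements. It is not the approximate Kalman-like MMSE filter derived in the paper. The function names, the two-state chain, and the idea that the control input simply selects a measurement variance are all assumptions made for this example.

```python
import math


def gaussian_pdf(y, mean, var):
    """Density of a scalar Gaussian with the given mean and variance at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def filter_step(belief, y, trans, means, variances):
    """One step of an exact Bayes filter for a finite-state Markov chain.

    belief    -- current posterior over states (list summing to 1)
    y         -- new scalar measurement
    trans     -- trans[i][j] = P(next state j | current state i)
    means     -- per-state measurement means
    variances -- per-state measurement variances (assumed here to be
                 chosen by the control input, e.g. which sensor is used)
    """
    n = len(belief)
    # Predict: push the belief through the Markov chain.
    predicted = [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]
    # Update: reweight each state by the Gaussian likelihood of y.
    unnorm = [predicted[j] * gaussian_pdf(y, means[j], variances[j]) for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-state example: the same measurement under two controls.
trans = [[0.9, 0.1], [0.2, 0.8]]
means = [0.0, 1.0]
prior = [0.5, 0.5]
cheap = filter_step(prior, 1.0, trans, means, [1.0, 1.0])        # noisy sensor
accurate = filter_step(prior, 1.0, trans, means, [0.05, 0.05])   # precise sensor
```

In this toy setup the lower-variance control yields a more concentrated posterior for the same measurement, which is the kind of trade-off an observation-control policy would weigh.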

    Partially Observed Non-linear Risk-sensitive Optimal Stopping Control for Non-linear Discrete-time Systems

    In this paper we introduce and solve the partially observed optimal stopping non-linear risk-sensitive stochastic control problem for discrete-time non-linear systems. The presented results are closely related to previous results for the finite-horizon partially observed risk-sensitive stochastic control problem. An information state approach is used, and a new (three-way) separation principle is established that leads to a forward dynamic programming equation and a backward dynamic programming inequality (both infinite dimensional). A verification theorem is given that establishes the optimal control and optimal stopping time. The risk-neutral optimal stopping stochastic control problem is also discussed.

    Dynamic Credit Investment in Partially Observed Markets

    We consider the problem of maximizing expected utility for a power investor who can allocate his wealth among a stock, a defaultable security, and a money market account. The dynamics of these security prices are governed by geometric Brownian motions modulated by a hidden continuous-time, finite-state Markov chain. We reduce the partially observed stochastic control problem to a complete-observation risk-sensitive control problem via the filtered regime-switching probabilities. We separate the latter into pre-default and post-default dynamic optimization subproblems, and obtain two coupled Hamilton-Jacobi-Bellman (HJB) partial differential equations. We prove existence and uniqueness of a globally bounded classical solution to each HJB equation, and give the corresponding verification theorem. We provide a numerical analysis showing that the investor increases his holdings in stock as the filter probability of being in high-growth regimes increases, and decreases his credit risk exposure when the filter probability of being in high-default-risk regimes gets larger.

    State-Observation Sampling and the Econometrics of Learning Models

    In nonlinear state-space models, sequential learning about the hidden state can proceed by particle filtering when the density of the observation conditional on the state is available analytically (e.g. Gordon et al., 1993). This condition need not hold in complex environments, such as the incomplete-information equilibrium models considered in financial economics. In this paper, we make two contributions to the learning literature. First, we introduce a new filtering method, the state-observation sampling (SOS) filter, for general state-space models with intractable observation densities. Second, we develop an indirect inference-based estimator for a large class of incomplete-information economies. We demonstrate the good performance of these techniques on an asset pricing model with investor learning applied to over 80 years of daily equity returns.

    Robot introspection through learned hidden Markov models

    In this paper we describe a machine learning approach for acquiring a model of a robot's behaviour from raw sensor data. We are interested in automating the acquisition of behavioural models to provide a robot with an introspective capability. We assume that the behaviour of a robot in achieving a task can be modelled as a finite stochastic state transition system. Beginning with data recorded by a robot in the execution of a task, we use unsupervised learning techniques to estimate a hidden Markov model (HMM) that can be used both for predicting and explaining the behaviour of the robot in subsequent executions of the task. We demonstrate that it is feasible to automate the entire process of learning a high-quality HMM from the data recorded by the robot during execution of its task. The learned HMM can be used both for monitoring and controlling the behaviour of the robot. The ultimate purpose of our work is to learn models for the full set of tasks associated with a given problem domain, and to integrate these models with a generative task planner. We want to show that these models can be used successfully in controlling the execution of a plan. However, this paper does not develop the planning and control aspects of our work, focussing instead on the learning methodology and the evaluation of a learned model. The essential property of the models we seek to construct is that the most probable trajectory through a model, given the observations made by the robot, accurately diagnoses, or explains, the behaviour that the robot actually performed when making these observations. In the work reported here we consider a navigation task. We explain the learning process, the experimental setup and the structure of the resulting learned behavioural models. We then evaluate the extent to which explanations proposed by the learned models accord with a human observer's interpretation of the behaviour exhibited by the robot in its execution of the task.
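The "most probable trajectory through a model, given the observations" mentioned in this abstract is commonly computed with the Viterbi algorithm. The sketch below is a minimal log-space version for a discrete HMM. The abstract does not say which decoder the authors use, so this is an illustration under that assumption, and the two-state model with its probabilities is entirely hypothetical.

```python
import math


def viterbi(obs, init, trans, emit):
    """Return the most probable hidden-state path for a discrete HMM.

    obs   -- list of observation indices
    init  -- init[s] = P(first state is s)
    trans -- trans[r][s] = P(next state s | current state r)
    emit  -- emit[s][o] = P(observation o | state s)
    """
    n_states = len(init)

    def log(p):
        # Work in log-space to avoid underflow; log(0) is -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # score[s] = best log-probability of any path ending in state s.
    score = [log(init[s]) + log(emit[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][o]))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical model: state 0 = "moving", state 1 = "turning";
# observation 0 = "path clear", observation 1 = "path blocked".
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
path = viterbi([0, 0, 1, 1], init, trans, emit)
```

For this toy model the decoded path follows the observations, switching from state 0 to state 1 when the readings change.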