Active Classification for POMDPs: a Kalman-like State Estimator
The problem of state tracking with active observation control is considered
for a system modeled by a discrete-time, finite-state Markov chain observed
through conditionally Gaussian measurement vectors. The measurement model
statistics are shaped by the underlying state and an exogenous control input,
which influence the observations' quality. Exploiting an innovations approach,
an approximate minimum mean-squared error (MMSE) filter is derived to estimate
the Markov chain system state. To optimize the control strategy, the associated
mean-squared error is used as an optimization criterion in a partially
observable Markov decision process formulation. A stochastic dynamic
programming algorithm is proposed to solve for the optimal solution. To enhance
the quality of system state estimates, approximate MMSE smoothing estimators
are also derived. Finally, the performance of the proposed framework is
illustrated on the problem of physical activity detection in wireless body
sensing networks. The power of the proposed framework lies in its ability
to accommodate a broad spectrum of active classification applications,
including sensor management for object classification and tracking,
estimation of sparse signals, and radar scheduling.
Comment: 38 pages, 6 figures
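The filtering recursion the abstract describes can be sketched numerically for the simplest case: a discrete-state Markov chain observed through state-dependent scalar Gaussian measurements, with no observation control. All names, parameters, and numbers below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def hmm_gaussian_filter(y, P, means, sigmas, pi0):
    """Forward filter for a finite-state Markov chain with Gaussian observations.

    y      : (T,) scalar observations
    P      : (S, S) transition matrix, P[i, j] = Pr(x_{t+1} = j | x_t = i)
    means  : (S,) observation mean for each state
    sigmas : (S,) observation std for each state
    pi0    : (S,) initial state distribution
    Returns the final posterior over states and the MMSE estimate of the
    state's observation level at each step.
    """
    post = np.array(pi0, dtype=float)
    mmse = []
    for obs in y:
        # predict: propagate the posterior through the chain
        post = post @ P
        # update: reweight by the Gaussian likelihood of the observation
        lik = np.exp(-0.5 * ((obs - means) / sigmas) ** 2) / sigmas
        post = post * lik
        post /= post.sum()
        # MMSE estimate (posterior expectation of the state's mean level)
        mmse.append(post @ means)
    return post, np.array(mmse)
```

With observations drawn near one state's mean, the posterior concentrates on that state; the paper's contribution is closing the loop by choosing the control input that shapes `sigmas` to minimize the resulting mean-squared error.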
Labeled Directed Acyclic Graphs: a generalization of context-specific independence in directed graphical models
We introduce a novel class of labeled directed acyclic graph (LDAG) models
for finite sets of discrete variables. LDAGs generalize earlier proposals for
allowing local structures in the conditional probability distribution of a
node, such that unrestricted label sets determine which edges can be deleted
from the underlying directed acyclic graph (DAG) for a given context. Several
properties of these models are derived, including a generalization of the
concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is
enabled by introducing an LDAG-based factorization of the Dirichlet prior for
the model parameters, such that the marginal likelihood can be calculated
analytically. In addition, we develop a novel prior distribution for the model
structures that can appropriately penalize a model for its labeling complexity.
A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill
climbing approach is used to illustrate the useful properties of LDAG models
on both real and synthetic data sets.
Comment: 26 pages, 17 figures
Shape-constrained Estimation of Value Functions
We present a fully nonparametric method to estimate the value function, via
simulation, in the context of expected infinite-horizon discounted rewards for
Markov chains. Estimating such value functions plays an important role in
approximate dynamic programming and applied probability in general. We
incorporate "soft information" into the estimation algorithm, such as knowledge
of convexity, monotonicity, or Lipschitz constants. In the presence of such
information, a nonparametric estimator for the value function can be computed
that is provably consistent as the simulated time horizon tends to infinity. As
an application, we implement our method on price tolling agreement contracts in
energy markets.
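One way to impose the "soft information" described above is to project raw simulated value estimates onto the constrained shape class. A minimal sketch for the monotone case, using the pool-adjacent-violators algorithm (the function name and toy input are illustrative; the paper's estimator also covers convexity and Lipschitz constraints):

```python
def pava(y):
    """Least-squares nondecreasing fit via pool-adjacent-violators.

    y : noisy value estimates at states ordered so the true value
        function is known (soft information) to be nondecreasing.
    """
    # each block holds [pooled mean, count]; whenever a new point violates
    # the nondecreasing constraint, merge blocks into their pooled mean
    blocks = []
    for v in y:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, n2 = blocks.pop()
            m1, n1 = blocks.pop()
            blocks.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    out = []
    for m, n in blocks:
        out.extend([m] * n)
    return out
```

For example, `pava([1, 3, 2, 4])` pools the violating pair into its average, returning a nondecreasing fit; as the simulated time horizon grows, such constrained projections inherit the consistency of the underlying Monte Carlo estimates.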
Invariant Causal Prediction for Nonlinear Models
An important problem in many domains is to predict how a system will respond
to interventions. This task is inherently linked to estimating the system's
underlying causal structure. To this end, Invariant Causal Prediction (ICP)
(Peters et al., 2016) has been proposed, which learns a causal model by
exploiting the invariance of causal relations using data from different
environments. When
considering linear models, the implementation of ICP is relatively
straightforward. However, the nonlinear case is more challenging due to the
difficulty of performing nonparametric tests for conditional independence. In
this work, we present and evaluate an array of methods for nonlinear and
nonparametric versions of ICP for learning the causal parents of given target
variables. We find that an approach which first fits a nonlinear model with
data pooled over all environments and then tests for differences between the
residual distributions across environments is quite robust across a large
variety of simulation settings. We call this procedure "invariant residual
distribution test". In general, we observe that the performance of all
approaches is critically dependent on the true (unknown) causal structure, and
it becomes challenging to achieve high power if the parental set includes more
than two variables. As a real-world example, we consider fertility rate
modelling which is central to world population projections. We explore
predicting the effect of hypothetical interventions using the accepted models
from nonlinear ICP. The results reaffirm the previously observed central causal
role of child mortality rates.
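The invariant residual distribution test can be sketched in a few lines: fit one model on data pooled over environments, then compare the residual distributions between environments. The sketch below makes simplifying assumptions not in the paper: a polynomial fit stands in for the nonparametric regression, and a raw two-sample Kolmogorov-Smirnov statistic (no p-value) stands in for the formal test:

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (max ECDF gap)."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

def residual_invariance(x, y, env, degree=3):
    """Pooled nonlinear fit, then max pairwise residual KS over environments.

    A small value suggests the residual distribution is invariant across
    environments, i.e. the fitted predictor set may contain the causal parents.
    """
    coef = np.polyfit(x, y, degree)       # pooled (polynomial) regression
    resid = y - np.polyval(coef, x)
    envs = np.unique(env)
    return max(
        ks_statistic(resid[env == e1], resid[env == e2])
        for i, e1 in enumerate(envs)
        for e2 in envs[i + 1:]
    )
```

On synthetic data where the mechanism for the target is the same in every environment, the statistic stays small; shifting the mechanism in one environment inflates it, which is the signal the test exploits.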