A New Distribution-Free Concept for Representing, Comparing, and Propagating Uncertainty in Dynamical Systems with Kernel Probabilistic Programming
This work presents the concept of kernel mean embedding and kernel
probabilistic programming in the context of stochastic systems. We propose
formulations to represent, compare, and propagate uncertainties for fairly
general stochastic dynamics in a distribution-free manner. The new tools enjoy
sound theory rooted in functional analysis and wide applicability as
demonstrated in distinct numerical examples. The implication of this new
concept is a new mode of thinking about the statistical nature of uncertainty
in dynamical systems.
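As a rough illustration of what "comparing" uncertainties through kernel mean embeddings can look like, the sketch below computes a biased empirical estimate of the squared maximum mean discrepancy (MMD), i.e. the squared RKHS distance between the empirical mean embeddings of two samples. It is not taken from the paper: the Gaussian kernel, the bandwidth of 1.0, the one-dimensional samples, and the names gaussian_kernel and mmd_squared are all assumptions made for this example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on scalars; the bandwidth is an assumed value."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    This equals the squared RKHS distance between the empirical kernel
    mean embeddings of xs and ys.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical data: two samples from the same distribution, one shifted.
rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]

mmd_same = mmd_squared(same_a, same_b)
mmd_diff = mmd_squared(same_a, shifted)
print("same distribution:   ", mmd_same)
print("shifted distribution:", mmd_diff)
```

No distributional form is assumed for either sample; only kernel evaluations on the samples are used, which is the sense in which such a comparison is distribution-free.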
Predictive-State Decoders: Encoding the Future into Recurrent Networks
Recurrent neural networks (RNNs) are a vital modeling technique that relies on
internal states learned indirectly by optimization of a supervised,
unsupervised, or reinforcement training loss. RNNs are used to model dynamic
processes that are characterized by underlying latent states whose form is
often unknown, precluding its analytic representation inside an RNN. In the
Predictive-State Representation (PSR) literature, latent state processes are
modeled by an internal state representation that directly models the
distribution of future observations, and most recent work in this area has
relied on explicitly representing and targeting sufficient statistics of this
probability distribution. We seek to combine the advantages of RNNs and PSRs by
augmenting existing state-of-the-art recurrent neural networks with
Predictive-State Decoders (PSDs), which add supervision to the network's
internal state representation to target predicting future observations.
Predictive-State Decoders are simple to implement and easily incorporated into
existing training pipelines via additional loss regularization. We demonstrate
the effectiveness of PSDs with experimental results in three different domains:
probabilistic filtering, Imitation Learning, and Reinforcement Learning. In
each, our method improves statistical performance of state-of-the-art recurrent
baselines and does so with fewer iterations and less data.
Comment: NIPS 201
Learning Causal State Representations of Partially Observable Environments
Intelligent agents can cope with sensory-rich environments by learning task-agnostic state abstractions. In this paper, we propose mechanisms to approximate causal states, which optimally compress the joint history of actions and observations in partially-observable Markov decision processes. Our proposed algorithm extracts causal state representations from RNNs that are trained to predict subsequent observations given the history. We demonstrate that these learned task-agnostic state abstractions can be used to efficiently learn policies for reinforcement learning problems with rich observation spaces. We evaluate agents using multiple partially observable navigation tasks with both discrete (GridWorld) and continuous (VizDoom, ALE) observation processes that cannot be solved by traditional memory-limited methods. Our experiments demonstrate systematic improvement of the DQN and tabular models using approximate causal state representations with respect to recurrent-DQN baselines trained with raw inputs.
Kernel Instrumental Variable Regression
Instrumental variable (IV) regression is a strategy for learning causal
relationships in observational data. If measurements of input X and output Y
are confounded, the causal relationship can nonetheless be identified if an
instrumental variable Z is available that influences X directly, but is
conditionally independent of Y given X and the unmeasured confounder. The
classic two-stage least squares algorithm (2SLS) simplifies the estimation
problem by modeling all relationships as linear functions. We propose kernel
instrumental variable regression (KIV), a nonparametric generalization of 2SLS,
modeling relations among X, Y, and Z as nonlinear functions in reproducing
kernel Hilbert spaces (RKHSs). We prove the consistency of KIV under mild
assumptions, and derive conditions under which convergence occurs at the
minimax optimal rate for unconfounded, single-stage RKHS regression. In doing
so, we obtain an efficient ratio between training sample sizes used in the
algorithm's first and second stages. In experiments, KIV outperforms
state-of-the-art alternatives for nonparametric IV regression.
Comment: 41 pages, 11 figures. Advances in Neural Information Processing Systems. 201
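For orientation, the classic linear 2SLS baseline that the abstract describes can be sketched in a few lines. This is not KIV itself and not the authors' code: it covers only the scalar, single-instrument linear case, and the simulated data, the true effect of 2.0, and the helper names ols_slope_intercept and two_stage_least_squares are assumptions made for this example.

```python
import random


def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of ys on a single regressor xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_stage_least_squares(z, x, y):
    """Linear 2SLS with one instrument z, one input x, and one output y."""
    # Stage 1: regress the input X on the instrument Z.
    a, b = ols_slope_intercept(z, x)
    x_hat = [a * zi + b for zi in z]
    # Stage 2: regress the output Y on the stage-1 fitted values.
    slope, _ = ols_slope_intercept(x_hat, y)
    return slope


# Hypothetical simulation: u is an unmeasured confounder of x and y,
# while z influences y only through x.
rng = random.Random(0)
n = 5000
true_effect = 2.0
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [zi + ui + rng.gauss(0.0, 0.1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 3.0 * ui + rng.gauss(0.0, 0.1) for xi, ui in zip(x, u)]

naive_slope, _ = ols_slope_intercept(x, y)   # biased by the confounder
iv_slope = two_stage_least_squares(z, x, y)  # should be close to true_effect
print("naive OLS slope:", naive_slope)
print("2SLS slope:     ", iv_slope)
```

In this simulation the naive regression of Y on X absorbs the confounder's contribution, while the two-stage estimate uses only the variation in X that is explained by Z. KIV, as described in the abstract, replaces both linear stages with nonlinear regressions in RKHSs.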