Exponential Family Predictive Representations of State.
- Publication date
- Publisher
Abstract
Many agent-environment interactions can be framed as dynamical systems
in which agents take actions and receive observations. These
dynamical systems are diverse, representing such things as a biped
walking, a stock price changing over time, the trajectory of a
missile, or the shifting fish population in a lake.
Often, interacting successfully with the environment requires the use
of a model, which allows the agent to predict something about the
future by summarizing the past. Two of the basic problems in modeling
partially observable dynamical systems are selecting a representation
of state and selecting a mechanism for maintaining that state. This
thesis explores both problems from a learning perspective: we are
interested in learning a predictive model directly from the data that
arises as an agent interacts with its environment.
This thesis develops models for dynamical systems which represent
state as a set of statistics about the short-term future, as opposed
to treating state as a latent, unobservable quantity. In other words,
the agent summarizes the past into predictions about the short-term
future, which allow the agent to make further predictions about the
infinite future. Because all parameters in the model are defined
using only observable quantities, the learning algorithms for such
models are often straightforward and have attractive theoretical
properties. We examine in depth the case where state is represented
as the parameters of an exponential family distribution over a
short-term window of future observations. We unify a number of
different existing models under this umbrella, and predict and analyze
new models derived from the generalization.
One goal of this research is to push models with predictively defined
state towards real-world applications. We contribute models and
companion learning algorithms for domains with partial observability,
continuous observations, structured observations, high-dimensional
observations, and/or continuous actions. Our models successfully
capture standard POMDPs and benchmark nonlinear timeseries problems
with performance comparable to state-of-the-art models. They also
allow us to perform well on novel domains which are larger than those
captured by other models with predictively defined state, including
traffic prediction problems and domains analogous to autonomous mobile
robots with camera sensors.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/58522/1/wingated_1.pd