Missing and Noisy Data in Nonlinear Time-Series
- Publication date: 1995
- Publisher
Abstract
Comment added in October 2003: This paper is now of mostly historical importance. At the time of publication (1995) it was one of the first machine learning papers to stress the importance of stochastic sampling in time-series prediction and time-series model learning. In this paper we suggested using Gibbs sampling (Section 4); nowadays particle filters are commonly used instead. Secondly, this is one of the first papers in machine learning to derive the gradient equations for control optimization in reinforcement learning policy-space search methods (Section 6.3). The only previous publication on policy-space search methods known to us is: Williams, Ronald J. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8:229-256. Since our paper was addressed to a neural network community, we focused on a neural network representation with Gaussian noise. In Section 6.3, under the subtitle Stochastic Control, we derive the gradients for offline policy-space search methods; here, a_l keeps a trace of the gradient and e_l accumulates gradient times cost information. Under the subtitle On-line Adaptation we derive the gradients for online policy-space search methods and make the connection to value functions. Unfortunately, we never found the time to follow up on this paper, in part because the RL experts to whom we presented it at the time of publication did not exhibit much interest.

We discuss the issue of missing and noisy data in nonlinear time-series prediction. We derive fundamental equations both for prediction and for training. Our discussion shows that if measurements are noisy or missing, treating the time series as a static input/output mapping problem (the usual time-delay n..
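As a rough illustration of the idea behind such policy-space search, and not a reconstruction of the paper's own derivation, a likelihood-ratio (REINFORCE-style) gradient update for a Gaussian-noise policy can be sketched as below. The linear policy, the quadratic toy cost, the learning rate, and the running-average baseline are all assumptions made for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

w = 0.0        # policy parameter: action mean is w * s
sigma = 0.5    # fixed Gaussian exploration noise (an assumption)
lr = 0.01      # learning rate (an assumption)
baseline = 0.0 # running-average cost baseline, purely to reduce variance

for episode in range(5000):
    s = rng.uniform(0.5, 1.5)                   # observed state
    a = w * s + sigma * rng.standard_normal()   # stochastic action a ~ N(w*s, sigma^2)
    cost = (a - 2.0 * s) ** 2                   # toy cost; optimal policy has w = 2

    # Likelihood-ratio gradient of log pi(a | s) with respect to w:
    # d/dw log N(a; w*s, sigma^2) = (a - w*s) * s / sigma^2
    grad_logp = (a - w * s) * s / sigma**2

    # Descend the expected cost using the score-function estimator,
    # with the baseline subtracted to lower the variance of the update.
    w -= lr * (cost - baseline) * grad_logp
    baseline += 0.01 * (cost - baseline)
```

Averaged over the action noise, the update moves w toward the minimizer of the expected cost (here w = 2); the Gaussian exploration noise is what makes the score-function gradient well-defined, echoing the Gaussian-noise network representation mentioned above.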