Abstract. The term “nexting” has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense. The ability to “next” constitutes a basic kind of awareness and knowledge of one’s environment. In this paper we present results with a robot that learns to next in real time, predicting thousands of features of the world’s state, including all sensory inputs, at timescales from 0.1 to 8 seconds. This was achieved by treating each state feature as a reward-like target and applying temporal-difference methods to learn a corresponding value function with a discount rate corresponding to the timescale. We show that two thousand predictions, each dependent on six thousand state features, can be learned and updated online at better than 10 Hz on a laptop computer, using the standard TD(λ) algorithm with linear function approximation. We show that this approach is efficient enough to be practical, with most of the learning complete within 30 minutes. W
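The learning rule named in the abstract can be illustrated with a minimal sketch of TD(λ) with linear function approximation, applied to a single prediction whose reward-like target is one signal value per time step. This is not the authors' implementation. The names `discount_for_timescale` and `LinearTDLambda`, the accumulating-trace form, the mapping γ = 1 − step/timescale, the 0.1 s step length, and the values of `lam` and `alpha` are all assumptions made for illustration.

```python
def discount_for_timescale(timescale_s, step_s=0.1):
    # Assumed convention (not stated in the abstract): a prediction timescale
    # of T seconds with time steps of step_s seconds uses gamma = 1 - step_s / T.
    return 1.0 - step_s / timescale_s


class LinearTDLambda:
    """One prediction learned by TD(lambda) with accumulating eligibility traces."""

    def __init__(self, n_features, gamma, lam=0.9, alpha=0.1):
        self.w = [0.0] * n_features  # weight vector of the linear value function
        self.z = [0.0] * n_features  # eligibility trace, one entry per feature
        self.gamma = gamma           # discount rate, set by the timescale
        self.lam = lam               # trace-decay parameter (hypothetical value)
        self.alpha = alpha           # step size (hypothetical value)

    def predict(self, x):
        # Linear prediction: inner product of weights and features.
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, target, x_next):
        # TD error: the observed signal plays the role of the reward.
        delta = target + self.gamma * self.predict(x_next) - self.predict(x)
        for i, xi in enumerate(x):
            self.z[i] = self.gamma * self.lam * self.z[i] + xi
            self.w[i] += self.alpha * delta * self.z[i]
        return delta


# Toy check: a constant signal of 1.0 observed through a single bias feature.
gamma = discount_for_timescale(0.5)  # 0.5 s timescale at the assumed 0.1 s step
learner = LinearTDLambda(n_features=1, gamma=gamma)
for _ in range(2000):
    learner.update([1.0], 1.0, [1.0])
prediction = learner.predict([1.0])
```

For a constant signal of 1.0, the discounted sum being predicted is 1/(1 − γ), so with γ = 0.8 the learned `prediction` should approach 5. Scaling this sketch to thousands of predictions would mean running one such learner per target and timescale over a shared feature vector.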