    Differential Trace in Learning of Value Function with a Neural Network

    Abstract: Reinforcement learning has a serious problem of slow learning. To address it, eligibility traces have been widely used. However, an eligibility trace discards old information and takes in present information at a constant rate, regardless of whether that information is important, so long-term learning and short-term learning are incompatible. This paper proposes a novel approach called the "Differential trace", in which the trace is updated not at a constant rate but according to the change in each neuron's output in a neural network. In other words, the time axis is adjusted subjectively in each neuron. The characteristics of the Differential trace were observed in the learning of state values in a simple task in which a one-dimensional continuous environment is divided into 100 states. Overall, the learning performance was better than with an eligibility trace at either of the two decay rates tested.
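The contrast drawn in the abstract can be illustrated with a small sketch. The first function is the familiar constant-rate decaying trace. The second is only one possible reading of "updated according to the change of each neuron's output": the abstract gives no update rule, so the blending formula, the clipping to 1, and the names `eligibility_trace` and `differential_style_trace` are assumptions made here for illustration, not the paper's method.

```python
def eligibility_trace(inputs, decay):
    """Conventional trace: decays by a fixed factor every step, then adds the input."""
    trace, history = 0.0, []
    for x in inputs:
        trace = decay * trace + x
        history.append(trace)
    return history


def differential_style_trace(inputs, outputs):
    """Hypothetical output-driven trace (an assumption, not the paper's formula).

    The trace moves toward the current input only in proportion to how much
    the neuron's output changed since the previous step; with no change in
    output, the trace is left untouched.
    """
    trace, history = 0.0, []
    prev_out = outputs[0]
    for x, out in zip(inputs, outputs):
        rate = min(1.0, abs(out - prev_out))  # assumed: rate is the clipped output change
        trace = (1.0 - rate) * trace + rate * x
        prev_out = out
        history.append(trace)
    return history


if __name__ == "__main__":
    # A single input pulse fades at a constant rate under the conventional trace.
    print(eligibility_trace([1.0, 0.0, 0.0, 0.0, 0.0], decay=0.5))
    # The output changes once (0.0 to 0.5), so the sketch updates the trace once.
    print(differential_style_trace([1.0] * 5, [0.0, 0.5, 0.5, 0.5, 0.5]))
```

In this sketch the conventional trace forgets at the same pace whether or not anything happens, while the output-driven variant holds its value for as long as the neuron's output stays flat, which is the sense in which the time axis is adjusted per neuron.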