The current paper proposes a novel predictive coding type neural network
model, the predictive multiple spatio-temporal scales recurrent neural network
(P-MSTRNN). The P-MSTRNN learns to predict visually perceived human whole-body
cyclic movement patterns by exploiting multiscale spatio-temporal constraints
imposed on network dynamics by using differently sized receptive fields as well
as different time constant values for each layer. After learning, the network
becomes able to proactively imitate target movement patterns by inferring or
recognizing corresponding intentions by means of the regression of prediction
error. Results show that the network can develop a functional hierarchy by
developing a different type of dynamic structure at each layer. The paper
examines how model performance during pattern generation as well as predictive
imitation varies depending on the stage of learning. The number of limit cycle
attractors corresponding to target movement patterns increases as learning
proceeds. And, transient dynamics developing early in the learning process
successfully perform pattern generation and predictive imitation tasks. The
paper concludes that exploitation of transient dynamics facilitates successful
task performance during early learning periods.Comment: Accepted in Neural Computation (MIT press