The permutation symmetry of the hidden units in multilayer perceptrons gives
rise to saddle structures and plateaus in the learning dynamics of
gradient-based methods. The correlation of the weight vectors of hidden units in a teacher
network is thought to affect this saddle structure, resulting in a prolonged
learning time, but this mechanism is still unclear. In this paper, we analyze
this mechanism for soft committee machines trained by on-line learning, using
statistical mechanics. Conventional gradient descent needs more time to break
the symmetry as the correlation of the teacher weight vectors rises. On the
other hand, no plateaus occur with natural gradient descent, regardless of the
correlation, in the limit of a small learning rate. An analysis of the dynamics
around the saddle point supports these results.

Comment: 7 pages, 6 figures
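
The teacher-student setting described above can be illustrated with a minimal sketch. It is not the authors' code. The following are assumptions made for illustration only: the activation g(u) = erf(u / sqrt(2)), the 1/sqrt(N) scaling of the pre-activations, two hidden units in both networks, the construction of the correlated teacher vectors, and all function names and default parameters. Only conventional on-line gradient descent is sketched; natural gradient descent is not implemented here.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def g(u):
    # Hidden-unit activation (assumed here): erf(u / sqrt(2)).
    return math.erf(u / math.sqrt(2.0))


def g_prime(u):
    # Derivative of g with respect to u.
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * u * u)


def make_teacher(n, c, rng):
    """Two teacher vectors with squared norm n and normalized overlap c."""
    e1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    e2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Gram-Schmidt step: remove the e1 component from e2.
    proj = dot(e1, e2) / dot(e1, e1)
    e2 = [b - proj * a for a, b in zip(e1, e2)]
    # Rescale both vectors to squared norm n.
    s1 = math.sqrt(n / dot(e1, e1))
    s2 = math.sqrt(n / dot(e2, e2))
    e1 = [s1 * a for a in e1]
    e2 = [s2 * b for b in e2]
    # Mix the orthogonal pair so that b1 . b2 / n equals c.
    b2 = [c * a + math.sqrt(1.0 - c * c) * b for a, b in zip(e1, e2)]
    return [e1, b2]


def output(weights, x):
    # Soft committee machine: unweighted sum of the hidden-unit outputs.
    n = len(x)
    return sum(g(dot(w, x) / math.sqrt(n)) for w in weights)


def sgd_step(students, teachers, x, eta):
    """One on-line gradient step on the squared error for a single input x."""
    n = len(x)
    err = output(students, x) - output(teachers, x)
    new_students = []
    for w in students:
        u = dot(w, x) / math.sqrt(n)
        scale = eta * err * g_prime(u) / math.sqrt(n)
        new_students.append([wi - scale * xi for wi, xi in zip(w, x)])
    return new_students, 0.5 * err * err


def run(n=50, c=0.5, eta=0.5, steps=2000, seed=0):
    rng = random.Random(seed)
    teachers = make_teacher(n, c, rng)
    # Small random initial student weights, close to the symmetric state.
    students = [[rng.gauss(0.0, 0.1) for _ in range(n)] for _ in range(2)]
    errors = []
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # fresh example each step
        students, e = sgd_step(students, teachers, x, eta)
        errors.append(e)
    return students, teachers, errors
```

Calling run with different values of c and plotting a running average of the returned per-example errors is one way to look at how the teacher correlation affects the time spent near the symmetric state. The sizes and learning rate used here are arbitrary and are not taken from the paper.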