Ensemble learning of K nonlinear perceptrons, whose outputs are determined by sign functions, is discussed within the framework of online learning and statistical mechanics. One goal of statistical learning theory is to obtain the generalization error theoretically. This paper shows that the ensemble generalization error can be calculated using two order parameters: the similarity between the teacher and a student, and the similarity among students.
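To make this concrete: in the standard teacher--student notation of this literature (the symbols here are illustrative and need not match the paper's own), with teacher weight vector $\mathbf{B}$ and student weight vectors $\mathbf{J}_k$, $k=1,\dots,K$, the two order parameters are the overlaps
\[
R_k = \frac{\mathbf{B}\cdot\mathbf{J}_k}{\|\mathbf{B}\|\,\|\mathbf{J}_k\|},
\qquad
q_{kk'} = \frac{\mathbf{J}_k\cdot\mathbf{J}_{k'}}{\|\mathbf{J}_k\|\,\|\mathbf{J}_{k'}\|}.
\]
For a single sign-output student with isotropic inputs, the well-known result is $\epsilon_g = \frac{1}{\pi}\arccos R$; for the ensemble, the generalization error is the probability that the combined output of the K students disagrees with the teacher, a Gaussian average that depends only on the $R_k$ and $q_{kk'}$.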
The differential equations that describe the dynamical behaviors of these order parameters are derived for general learning rules.
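Schematically (a sketch of the standard online-learning limit, not necessarily the paper's exact equations): with input dimension $N\to\infty$ and continuous time $t=m/N$ for the $m$-th example, the order parameters become self-averaging and obey coupled deterministic equations of the form
\[
\frac{dR_k}{dt} = \big\langle \phi_R(u_k, v) \big\rangle,
\qquad
\frac{dq_{kk'}}{dt} = \big\langle \phi_q(u_k, u_{k'}, v) \big\rangle,
\]
where $v$ and $u_k$ are the internal potentials of the teacher and the students on a new input, the averages run over these correlated Gaussian variables (with correlations fixed by $R$ and $q$ themselves), and $\phi_R$, $\phi_q$ are determined by the learning rule.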
The concrete forms of these differential equations are derived analytically for three well-known rules: Hebbian learning, perceptron learning, and AdaTron learning.
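For reference, these three rules are commonly written in the generic online form $\mathbf{J}^{m+1} = \mathbf{J}^m + \eta f\,\mathbf{x}^m$ (illustrative notation: $v$ the teacher potential, $u$ the student potential, $\eta$ the learning rate, $\Theta$ the step function; the paper's definitions may differ in detail), with weight functions
\[
f_{\mathrm{Hebbian}} = \mathrm{sgn}(v),
\qquad
f_{\mathrm{perceptron}} = \Theta(-uv)\,\mathrm{sgn}(v),
\qquad
f_{\mathrm{AdaTron}} = -u\,\Theta(-uv).
\]
Hebbian learning thus updates on every example, whereas perceptron and AdaTron learning update only on misclassified examples, a structural difference relevant to how much variety among students each rule can maintain.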
The ensemble generalization errors of these three rules are calculated from the solutions of these differential equations. As a result, the three rules show different characteristics in
their affinity for ensemble learning, that is, ``maintaining variety among students.'' Results show that AdaTron learning is superior to the other two
rules with respect to that affinity.