We study a simple learning model based on the Hebb rule to cope with
"delayed", unspecific reinforcement. In spite of the unspecific nature of the
information-feedback, convergence to asymptotically perfect generalization is
observed, with a rate depending, however, in a non-universal way on learning
parameters. Asymptotic convergence can be as fast as that of Hebbian learning,
but may be slower. Moreover, for a certain range of parameter settings, it
depends on initial conditions whether the system can reach the regime of
asymptotically perfect generalization, or rather approaches a stationary state
of poor generalization.

Comment: 13 pages LaTeX, 4 figures, note on biologically motivated stochastic
variant of the algorithm added