We study conditions under which, given a dictionary $F=\{f_1,\ldots,f_M\}$
and an i.i.d. sample $(X_i,Y_i)_{i=1}^N$, the empirical minimizer in
$\operatorname{span}(F)$ relative to the squared loss satisfies, with
high probability,
$R(\tilde{f}^{ERM})\leq\inf_{f\in\operatorname{span}(F)}R(f)+r_N(M)$, where $R(\cdot)$ is the squared risk and $r_N(M)$ is
of the order of $M/N$. Among other results, we prove that a uniform small-ball
estimate for functions in $\operatorname{span}(F)$ is enough to achieve that
goal when the noise is independent of the design.

Comment: Published at http://dx.doi.org/10.3150/15-BEJ701 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
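
As an informal illustration of the setting in the abstract (not the paper's method or proof), the sketch below computes the least-squares empirical minimizer over the span of a small dictionary and estimates its excess risk by Monte Carlo. Everything specific here is an assumption made for the example: the cosine dictionary, the uniform design on [0, 1], the Gaussian noise independent of the design, a target lying in the span, and the function names `dictionary`, `erm_coefficients`, `excess_risk`, and `simulate`.

```python
import numpy as np

def dictionary(x, M):
    """Evaluate a hypothetical dictionary f_1..f_M at points x (cosine features)."""
    j = np.arange(M)
    # Column k holds cos(pi * k * x); the first column is the constant function 1.
    return np.cos(np.pi * np.outer(x, j))

def erm_coefficients(X, Y, M):
    """Least-squares ERM over span(F): minimize the empirical squared loss."""
    Phi = dictionary(X, M)
    coef, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return coef

def excess_risk(coef, coef_star, M, n_test=200_000, seed=1):
    """Monte Carlo estimate of R(f_hat) - inf R(f), assuming the target is in span(F).

    With mean-zero noise independent of the design, the excess risk equals
    E[(f_hat(X) - f_star(X))^2], which is what is averaged here.
    """
    rng = np.random.default_rng(seed)
    Phi = dictionary(rng.uniform(0.0, 1.0, n_test), M)
    return float(np.mean((Phi @ (coef - coef_star)) ** 2))

def simulate(N, M, sigma=1.0, seed=0):
    """Draw one sample of size N, fit the ERM, and return its estimated excess risk."""
    rng = np.random.default_rng(seed)
    coef_star = rng.normal(size=M)                 # assumed target coefficients
    X = rng.uniform(0.0, 1.0, N)                   # assumed design distribution
    Y = dictionary(X, M) @ coef_star + sigma * rng.normal(size=N)
    coef = erm_coefficients(X, Y, M)
    return excess_risk(coef, coef_star, M)

if __name__ == "__main__":
    M = 5
    for N in (200, 2000, 20000):
        print(N, M / N, simulate(N, M))
```

Running the script prints, for each sample size, the reference rate M/N next to the estimated excess risk, so the two can be compared by eye. This well-behaved bounded dictionary is far more benign than the general classes covered by a small-ball condition, so the simulation only illustrates the target rate, not the generality of the result.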