We consider the least-squares regression problem and provide a detailed
asymptotic analysis of the performance of averaged constant-step-size
stochastic gradient descent (a.k.a. least-mean-squares). In the strongly-convex
case, we provide an asymptotic expansion up to explicit exponentially decaying
terms. Our analysis leads to new insights into stochastic approximation
algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the
generalization error may be divided into a variance term that decays as
O(1/n), independently of the step-size γ, and a bias term that decays as
O(1/(γ²n²)); (c) when allowing non-uniform sampling, the choice of a
good sampling density depends on whether the variance or bias terms dominate.
In particular, when the variance term dominates, optimal sampling densities do
not lead to much gain, while when the bias term dominates, we can choose larger
step-sizes that lead to significant improvements.
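For reference, below is a minimal sketch of the algorithm the abstract refers to: averaged constant-step-size stochastic gradient descent (least-mean-squares) for least-squares regression. The function name, synthetic data, and the particular step-size value are illustrative assumptions, not the paper's code or its recommended settings.

```python
import numpy as np

def averaged_lms(X, y, gamma, theta0=None):
    """Constant-step-size LMS with Polyak-Ruppert averaging.

    At step i, the stochastic gradient of the squared loss on (x_i, y_i)
    is (x_i . theta - y_i) x_i; the returned predictor is the average of
    all iterates (the averaged SGD / LMS estimator).
    """
    n, d = X.shape
    theta = np.zeros(d) if theta0 is None else theta0.copy()
    theta_bar = np.zeros(d)
    for i in range(n):
        x_i, y_i = X[i], y[i]
        # Constant-step-size stochastic gradient step on observation i.
        theta -= gamma * (x_i @ theta - y_i) * x_i
        # Running average of the iterates.
        theta_bar += (theta - theta_bar) / (i + 1)
    return theta_bar

# Illustrative usage on synthetic Gaussian data (hypothetical setup).
rng = np.random.default_rng(0)
n, d = 10_000, 5
X = rng.standard_normal((n, d))
theta_star = rng.standard_normal(d)
y = X @ theta_star + 0.1 * rng.standard_normal(n)
theta_hat = averaged_lms(X, y, gamma=0.05)
print(np.linalg.norm(theta_hat - theta_star))
```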