We consider the problem of choosing between several models in least-squares
regression with heteroscedastic data. We prove that any penalization procedure
is suboptimal when the penalty is a function of the dimension of the model, at
least for some typical heteroscedastic model selection problems. In particular,
Mallows' Cp is suboptimal in this framework.
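For concreteness, in generic notation that need not match the paper's, a penalization procedure selects
\[
\widehat{m} \in \mathop{\mathrm{arg\,min}}_{m \in \mathcal{M}} \Bigl\{ \tfrac{1}{n} \| Y - \widehat{s}_m \|^2 + \mathrm{pen}(m) \Bigr\},
\qquad
\mathrm{pen}_{C_p}(m) = \frac{2 \sigma^2 D_m}{n},
\]
where $\widehat{s}_m$ is the least-squares estimator over model $m$, $D_m$ its dimension, and $\sigma^2$ the noise variance. Mallows' Cp penalty depends on $m$ only through the dimension $D_m$, so it falls within the class of dimension-based penalties covered by the suboptimality result.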
On the contrary, optimal model selection is possible with data-driven penalties
such as resampling or V-fold penalties.
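As a rough illustration of how such a data-driven penalty can be computed, the Python sketch below implements one version of the resampling heuristic behind V-fold penalties: the unknown optimism of each model is estimated by refitting on the V fold-complements and comparing empirical risks. The helper names (vfold_penalty, select_model, the fit callables) and the calibration constant V - 1 are illustrative assumptions, not the paper's exact definitions.

    import numpy as np

    def vfold_penalty(X, y, fit, V=5, seed=0):
        # Estimate pen(m) for the model fitted by `fit` via V-fold resampling.
        # `fit(X, y)` must return a predictor f, with f(X) giving predictions.
        # The constant V - 1 below is one common calibration, assumed here.
        rng = np.random.default_rng(seed)
        n = len(y)
        folds = np.array_split(rng.permutation(n), V)
        gaps = []
        for fold in folds:
            train = np.setdiff1d(np.arange(n), fold)
            f = fit(X[train], y[train])                  # estimator built without fold j
            risk_full = np.mean((y - f(X)) ** 2)         # empirical risk on the full sample
            risk_train = np.mean((y[train] - f(X[train])) ** 2)  # risk on its own data
            gaps.append(risk_full - risk_train)          # optimism of the fold estimator
        return (V - 1) * np.mean(gaps)

    def select_model(X, y, models, V=5):
        # Minimize empirical risk plus the V-fold penalty over candidate models.
        crits = []
        for fit in models:
            f = fit(X, y)
            crits.append(np.mean((y - f(X)) ** 2) + vfold_penalty(X, y, fit, V))
        return int(np.argmin(crits))

For a nested family of linear models, for instance, each fit callable can be a closure performing ordinary least squares on the corresponding design columns.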
Therefore, it is worth estimating the shape of the penalty from the data, even
at the price of a higher computational cost. Simulation experiments
illustrate the existence of a trade-off between statistical accuracy and
computational complexity. In conclusion, we sketch some rules for choosing a
penalty in least-squares regression, depending on what is known about possible
variations of the noise level.