In data-driven stochastic optimization, model parameters of the underlying
distribution need to be estimated from data in addition to solving the
optimization task. Recent literature suggests integrating the estimation and
optimization processes by selecting model parameters that lead to the best
empirical objective performance. Such an integrated approach can be readily
shown to outperform simple ``estimate then optimize'' when the model is
misspecified. In this paper, we argue that when the model class is rich enough
to cover the ground truth, the performance ordering between the two approaches
is reversed for nonlinear problems in a strong sense. Simple ``estimate then
optimize" outperforms the integrated approach in terms of stochastic dominance
of the asymptotic optimality gap, i,e, the mean, all other moments, and the
entire asymptotic distribution of the optimality gap is always better.
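For orientation, this is the standard notion of first-order stochastic
dominance (the notation $G^{\mathrm{ETO}}$, $G^{\mathrm{IEO}}$ for the two
limiting optimality gaps is ours, not quoted from the paper):
$$\Pr\bigl(G^{\mathrm{ETO}} > t\bigr) \;\le\; \Pr\bigl(G^{\mathrm{IEO}} > t\bigr) \quad \text{for all } t \ge 0,$$
which, since the gaps are nonnegative, implies
$\mathbb{E}\bigl[(G^{\mathrm{ETO}})^k\bigr] \le \mathbb{E}\bigl[(G^{\mathrm{IEO}})^k\bigr]$
for every moment order $k \ge 1$.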
Analogous results also hold under constrained settings and when contextual
features are available. We also provide experimental findings to support our
theory.
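To illustrate the comparison concretely, the following is a minimal numerical
sketch, not the paper's experimental setup: a toy newsvendor problem with
well-specified exponential demand, where the integrated approach reduces to
taking the empirical critical quantile. All parameter values and the names
`eto_decision` and `integrated_decision` are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions throughout, not the paper's setup):
# toy newsvendor with well-specified exponential demand.
import numpy as np

rng = np.random.default_rng(0)
price, cost = 5.0, 1.0                    # sell price and per-unit order cost
crit = (price - cost) / price             # newsvendor critical ratio
true_rate = 2.0                           # ground-truth exponential demand rate

def expected_profit(q):
    # Exact expected profit under Exp(true_rate) demand,
    # using E[min(q, D)] = (1 - exp(-rate * q)) / rate.
    return price * (1.0 - np.exp(-true_rate * q)) / true_rate - cost * q

def eto_decision(data):
    # Estimate then optimize: exponential MLE, then the model's optimal quantile.
    rate_mle = 1.0 / data.mean()
    return -np.log(1.0 - crit) / rate_mle

def integrated_decision(data):
    # Integrated estimation-optimization: choose the model parameter whose
    # induced decision maximizes empirical profit; for the newsvendor this
    # reduces to the empirical critical quantile (sample average approximation).
    return np.quantile(data, crit)

q_star = -np.log(1.0 - crit) / true_rate  # oracle decision under the truth
gaps = {"estimate-then-optimize": [], "integrated": []}
for _ in range(2000):                     # independent replications, n = 50 each
    data = rng.exponential(1.0 / true_rate, size=50)
    for name, rule in (("estimate-then-optimize", eto_decision),
                       ("integrated", integrated_decision)):
        gaps[name].append(expected_profit(q_star) - expected_profit(rule(data)))

for name, g in gaps.items():
    print(f"{name:>22}: mean gap {np.mean(g):.5f}, "
          f"95th percentile {np.quantile(g, 0.95):.5f}")
```

Because the exponential model is well specified here, the MLE plug-in decision
is asymptotically efficient, so the printed mean and tail gaps for estimate
then optimize should come out smaller, consistent with the stochastic
dominance ordering described above.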