We develop an adaptive monotone shrinkage estimator for regression models
with the following characteristics: (i) dense coefficients with small but
important effects; (ii) an a priori ordering that indicates the probable predictive
importance of the features. We capture both properties with an empirical Bayes
estimator that shrinks coefficients monotonically with respect to their
anticipated importance. This estimator can be rapidly computed using a version
of the Pool-Adjacent-Violators algorithm. We show that the proposed monotone
shrinkage approach is competitive with the class of all Bayesian estimators
that share the prior information. We further observe that the estimator also
minimizes Stein's unbiased risk estimate. Along with our key result that the
estimator mimics the oracle Bayes rule under an order assumption, we also prove
that the estimator is robust. Even without the order assumption, our estimator
mimics the best performance of a large family of estimators that includes the
least squares estimator, the constant-λ ridge estimator, and the James-Stein
estimator, among others. All the theoretical results are non-asymptotic. Simulation
results and data analysis from a model for text processing are provided to
support the theory.

Comment: Appearing in Uncertainty in Artificial Intelligence (UAI) 201
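
To make the computational claim concrete, the following is a minimal sketch of how a monotone shrinkage rule could be fitted with a Pool-Adjacent-Violators pass. It is not taken from the paper: it assumes a simplified sequence model y_i ~ N(θ_i, σ²) with coordinates already sorted from most to least important, an estimator of the form c_i·y_i with non-increasing multipliers c_i in [0, 1], and Stein's unbiased risk estimate as the fitting criterion. Under those assumptions, writing s_i = 1 − c_i turns SURE into a weighted least squares problem with targets σ²/y_i² and weights y_i², which is an isotonic regression. The function name `monotone_sure_shrinkage` and the parameter `sigma2` are hypothetical, and handling the [0, 1] constraint by clipping the pooled values is likewise an assumption of this sketch.

```python
def monotone_sure_shrinkage(y, sigma2=1.0):
    """Sketch: multipliers c_i (non-increasing in i) for theta_hat_i = c_i * y_i.

    Assumes y is ordered from most to least important coordinate and that
    the noise variance sigma2 is known. Illustrative only.
    """
    # Each block holds [sum of weight*target, sum of weights, size].
    # With target sigma2 / y_i^2 and weight y_i^2, weight*target = sigma2,
    # so a block mean is (size * sigma2) / sum(y_i^2) and no division by
    # y_i^2 is needed while pooling.
    blocks = []
    for yi in y:
        blocks.append([sigma2, yi * yi, 1])
        # Pool while the previous block mean exceeds the current one, which
        # would violate "shrinkage s_i is non-decreasing in i". The means
        # are compared by cross-multiplication to tolerate zero weights.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            num, den, cnt = blocks.pop()
            blocks[-1][0] += num
            blocks[-1][1] += den
            blocks[-1][2] += cnt

    multipliers = []
    for num, den, cnt in blocks:
        # Clip the pooled shrinkage to [0, 1]; den <= num also covers den == 0.
        s = 1.0 if den <= num else num / den
        multipliers.extend([1.0 - s] * cnt)
    return multipliers


if __name__ == "__main__":
    y = [3.0, 0.5, 2.0, 0.1]
    c = monotone_sure_shrinkage(y, sigma2=1.0)
    print(c)                                   # multipliers, one per coordinate
    print([ci * yi for ci, yi in zip(c, y)])   # shrunken estimates
```

In this sketch the second and third coordinates are pooled because shrinking the second more than the third would contradict the assumed ordering, and the last coordinate is shrunk to zero because its observed magnitude is below the noise level. The single left-to-right pass with pooling is what keeps the cost low.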