Adaptive Monotone Shrinkage for Regression
We develop an adaptive monotone shrinkage estimator for regression models
with the following characteristics: i) dense coefficients with small but
important effects; ii) an a priori ordering that indicates the probable predictive
importance of the features. We capture both properties with an empirical Bayes
estimator that shrinks coefficients monotonically with respect to their
anticipated importance. This estimator can be rapidly computed using a version
of the Pool-Adjacent-Violators algorithm. We show that the proposed monotone
shrinkage approach is competitive with the class of all Bayesian estimators
that share the prior information. We further observe that the estimator also
minimizes Stein's unbiased risk estimate. Along with our key result that the
estimator mimics the oracle Bayes rule under an order assumption, we also prove
that the estimator is robust. Even without the order assumption, our estimator
mimics the best performance of a large family of estimators that includes the
least squares estimator, constant-λ ridge estimator, James-Stein
estimator, etc. All the theoretical results are non-asymptotic. Simulation
results and data analysis from a model for text processing are provided to
support the theory.
Comment: Appearing in Uncertainty in Artificial Intelligence (UAI) 201
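
As a concrete illustration of the computational workhorse named above, here is
a minimal Python sketch of the Pool-Adjacent-Violators algorithm (PAVA) for
weighted isotonic regression. This is a generic version under my own naming;
the paper applies a PAVA-style pass to monotone shrinkage factors under a SURE
criterion, which this plain version does not reproduce.

    import numpy as np

    def pava(y, w=None):
        # Pool-Adjacent-Violators: weighted least-squares fit of a
        # non-decreasing sequence to y (generic sketch, not the paper's
        # estimator, which runs a PAVA-style pass over shrinkage factors).
        y = np.asarray(y, dtype=float)
        w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
        blocks = []  # each block: [mean, weight, length]
        for yi, wi in zip(y, w):
            blocks.append([yi, wi, 1])
            # merge adjacent blocks while monotonicity is violated
            while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
                m2, w2, n2 = blocks.pop()
                m1, w1, n1 = blocks.pop()
                wt = w1 + w2
                blocks.append([(w1 * m1 + w2 * m2) / wt, wt, n1 + n2])
        return np.concatenate([np.full(n, m) for m, _, n in blocks])

    print(pava([3.0, 1.0, 2.0, 5.0, 4.0]))  # [2.  2.  2.  4.5 4.5]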
Hybrid Shrinkage Estimators Using Penalty Bases For The Ordinal One-Way Layout
This paper constructs improved estimators of the means in the Gaussian
saturated one-way layout with an ordinal factor. The least squares estimator
for the mean vector in this saturated model is usually inadmissible. The hybrid
shrinkage estimators of this paper exploit the possibility of slow variation in
the dependence of the means on the ordered factor levels but do not assume it
and respond well to faster variation if present. To motivate the development,
candidate penalized least squares (PLS) estimators for the mean vector of a
one-way layout are represented as shrinkage estimators relative to the penalty
basis for the regression space. This canonical representation suggests further
classes of candidate estimators for the unknown means: monotone shrinkage (MS)
estimators or soft-thresholding (ST) estimators or, most generally, hybrid
shrinkage (HS) estimators that combine the preceding two strategies. Adaptation
selects the estimator within a candidate class that minimizes estimated risk.
Under the Gaussian saturated one-way layout model, such adaptive estimators
minimize risk asymptotically over the class of candidate estimators as the
number of factor levels tends to infinity. Thereby, adaptive HS estimators
asymptotically dominate adaptive MS and adaptive ST estimators as well as the
least squares estimator. Local annihilators of polynomials, among them
difference operators, generate penalty bases suitable for a range of numerical
examples.
Comment: Published at http://dx.doi.org/10.1214/009053604000000652 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
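
To make the canonical shrinkage representation concrete, the following Python
sketch (my notation and toy data, not the paper's) builds a second-difference
penalty for a one-way layout, takes the eigenvectors of the penalty matrix as
the penalty basis, and expresses both the PLS fit and a soft-thresholding fit
as diagonal shrinkage of the coefficients in that basis. A hybrid estimator in
the paper's sense would soft-threshold some coordinates and apply monotone
shrinkage to the rest, with tuning constants chosen to minimize estimated
risk.

    import numpy as np

    p = 30
    D = np.diff(np.eye(p), n=2, axis=0)   # second differences annihilate linear trends
    evals, U = np.linalg.eigh(D.T @ D)    # penalty basis U, eigenvalues evals
    rng = np.random.default_rng(0)
    y = np.sin(np.linspace(0, 3, p)) + 0.3 * rng.normal(size=p)

    z = U.T @ y                           # coefficients in the penalty basis
    lam = 5.0
    f_pls = 1.0 / (1.0 + lam * evals)     # PLS = monotone diagonal shrinkage
    t = 0.4
    f_st = np.maximum(0.0, 1.0 - t / np.maximum(np.abs(z), 1e-12))  # soft threshold
    fit_pls = U @ (f_pls * z)             # candidate estimates of the mean vector
    fit_st = U @ (f_st * z)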
LASSO ISOtone for High Dimensional Additive Isotonic Regression
Additive isotonic regression attempts to determine the relationship between a
multi-dimensional observation variable and a response, under the constraint
that the estimate is the additive sum of univariate component effects that are
monotonically increasing. In this article, we present a new method for such
regression called LASSO Isotone (LISO). LISO adapts ideas from sparse linear
modelling to additive isotonic regression. Thus, it is viable in many
situations with high dimensional predictor variables, where selection of
significant versus insignificant variables is required. We suggest an
algorithm involving a modification of the backfitting algorithm CPAV. We give a
numerical convergence result, and finally examine some of its properties
through simulations. We also suggest some possible extensions that improve
performance, and allow calculation to be carried out when the direction of the
monotonicity is unknown.
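
To fix ideas, the following Python sketch runs a plain backfitting loop with
per-component isotonic fits on partial residuals, via scikit-learn's
IsotonicRegression. This is a schematic of the cyclic structure only; LISO's
actual update additionally shrinks each component under its sparsity penalty,
which this version omits.

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    def backfit_isotonic(X, y, n_passes=20):
        # Cyclic backfitting: refit each monotone component on the
        # partial residuals of the others (schematic, no LISO penalty).
        n, p = X.shape
        fits = np.zeros((n, p))          # current component values f_j(x_ij)
        intercept = y.mean()
        for _ in range(n_passes):
            for j in range(p):
                resid = y - intercept - fits.sum(axis=1) + fits[:, j]
                iso = IsotonicRegression(increasing=True)
                fits[:, j] = iso.fit_transform(X[:, j], resid)
                fits[:, j] -= fits[:, j].mean()  # center for identifiability
        return intercept, fits

    rng = np.random.default_rng(1)
    X = rng.uniform(size=(200, 3))
    y = np.sqrt(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)
    intercept, fits = backfit_isotonic(X, y)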
FAASTA: A fast solver for total-variation regularization of ill-conditioned problems with application to brain imaging
The total variation (TV) penalty, as many other analysis-sparsity problems,
does not lead to separable factors or a proximal operator with a closed-form
expression, such as soft thresholding for the ℓ1 penalty. As a result, in a
variational formulation of an inverse problem or statistical learning
estimation, it leads to challenging non-smooth optimization problems that are
often solved with elaborate single-step first-order methods. When the data-fit
term arises from empirical measurements, as in brain imaging, it is often very
ill-conditioned and without simple structure. In this situation, in proximal
splitting methods, the computation cost of the gradient step can easily
dominate each iteration. Thus it is beneficial to minimize the number of
gradient steps. We present fAASTA, a variant of FISTA that relies on an
internal solver for the TV proximal operator and refines its tolerance to
balance the computational cost of the gradient and the proximal steps. We give
benchmarks and illustrations on "brain decoding": recovering brain maps from
noisy measurements to predict observed behavior. The algorithm, as well as the
empirical study of convergence speed, is valuable for any non-exact proximal
operator, in particular analysis-sparsity problems.
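
A schematic of an accelerated loop with an inexact inner prox might look as
follows in Python. Here grad_f, its Lipschitz constant, and tv_prox (an
iterative inner solver for the TV proximal operator, run to tolerance tol) are
assumed to be supplied by the user, and the geometric tightening schedule is a
placeholder for the adaptive tolerance rule the paper develops; this is a
sketch, not the released fAASTA implementation.

    import numpy as np

    def inexact_fista(x0, grad_f, lipschitz, tv_prox, alpha, n_iter=100):
        # FISTA-style loop with an inexact TV prox (schematic only).
        x_prev = x0.copy()
        y = x0.copy()
        t = 1.0
        step = 1.0 / lipschitz
        tol = 1e-1                      # loose inner tolerance at first
        for _ in range(n_iter):
            # one expensive gradient step on the ill-conditioned data-fit term
            x = tv_prox(y - step * grad_f(y), alpha * step, tol=tol)
            t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
            y = x + ((t - 1.0) / t_next) * (x - x_prev)
            x_prev, t = x, t_next
            tol *= 0.8                  # tighten the prox tolerance over iterations
        return x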