
    Adaptive Monotone Shrinkage for Regression

    We develop an adaptive monotone shrinkage estimator for regression models with the following characteristics: i) dense coefficients with small but important effects; ii) a priori ordering that indicates the probable predictive importance of the features. We capture both properties with an empirical Bayes estimator that shrinks coefficients monotonically with respect to their anticipated importance. This estimator can be rapidly computed using a version of the Pool-Adjacent-Violators algorithm. We show that the proposed monotone shrinkage approach is competitive with the class of all Bayesian estimators that share the prior information. We further observe that the estimator also minimizes Stein's unbiased risk estimate. Along with our key result that the estimator mimics the oracle Bayes rule under an order assumption, we also prove that the estimator is robust: even without the order assumption, it mimics the best performance of a large family of estimators that includes the least squares estimator, the constant-λ ridge estimator, the James-Stein estimator, etc. All the theoretical results are non-asymptotic. Simulation results and a data analysis from a model for text processing are provided to support the theory. Comment: Appearing in Uncertainty in Artificial Intelligence (UAI) 201
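
    A minimal sketch of the general style of estimator described above, under assumptions of ours rather than the paper's: a Gaussian sequence model with coordinates pre-ordered by anticipated importance, naive per-coordinate shrinkage targets, and the Pool-Adjacent-Violators algorithm (PAVA) used to force the shrinkage factors to be non-increasing in the ordering. The shrinkage targets and weights are illustrative, not the authors' empirical Bayes construction.

    ```python
    import numpy as np

    def pava_non_increasing(values, weights):
        """Weighted isotonic regression onto non-increasing sequences via PAVA."""
        blocks = []                              # each block: [pooled mean, weight, length]
        for val, wt in zip(values, weights):
            blocks.append([val, wt, 1])
            # A non-increasing fit is violated when a later block exceeds an earlier one.
            while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
                m2, w2, n2 = blocks.pop()
                m1, w1, n1 = blocks.pop()
                w_new = w1 + w2
                blocks.append([(m1 * w1 + m2 * w2) / w_new, w_new, n1 + n2])
        out = []
        for mean, _, length in blocks:
            out.extend([mean] * length)
        return np.array(out)

    def monotone_shrinkage(y, sigma2):
        """Shrink y_i by factors c_i in [0, 1] that are non-increasing in i."""
        # Naive per-coordinate shrinkage target (James-Stein-style), then project
        # onto the monotone cone with weights y_i^2, as in weighted least squares.
        w = np.maximum(y ** 2, 1e-12)
        raw = np.clip(1.0 - sigma2 / w, 0.0, 1.0)
        c = np.clip(pava_non_increasing(raw, w), 0.0, 1.0)
        return c * y

    rng = np.random.default_rng(0)
    theta = np.exp(-0.2 * np.arange(50))         # effects decay with the a priori ordering
    y = theta + rng.normal(scale=0.3, size=50)
    print(monotone_shrinkage(y, sigma2=0.09)[:5])
    ```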

    Hybrid Shrinkage Estimators Using Penalty Bases For The Ordinal One-Way Layout

    This paper constructs improved estimators of the means in the Gaussian saturated one-way layout with an ordinal factor. The least squares estimator for the mean vector in this saturated model is usually inadmissible. The hybrid shrinkage estimators of this paper exploit the possibility of slow variation in the dependence of the means on the ordered factor levels but do not assume it and respond well to faster variation if present. To motivate the development, candidate penalized least squares (PLS) estimators for the mean vector of a one-way layout are represented as shrinkage estimators relative to the penalty basis for the regression space. This canonical representation suggests further classes of candidate estimators for the unknown means: monotone shrinkage (MS) estimators or soft-thresholding (ST) estimators or, most generally, hybrid shrinkage (HS) estimators that combine the preceding two strategies. Adaptation selects the estimator within a candidate class that minimizes estimated risk. Under the Gaussian saturated one-way layout model, such adaptive estimators minimize risk asymptotically over the class of candidate estimators as the number of factor levels tends to infinity. Thereby, adaptive HS estimators asymptotically dominate adaptive MS and adaptive ST estimators as well as the least squares estimator. Local annihilators of polynomials, among them difference operators, generate penalty bases suitable for a range of numerical examples. Comment: Published at http://dx.doi.org/10.1214/009053604000000652 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
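
    A minimal sketch of one ingredient of this strategy: soft-thresholding candidates with the threshold chosen by minimizing Stein's unbiased risk estimate (SURE), assuming the mean vector has already been expressed in an orthonormal penalty basis and the noise variance is known. The basis Q referred to in the comments is hypothetical, and the sketch does not implement the full hybrid (MS plus ST) candidate class.

    ```python
    import numpy as np

    def sure_soft_threshold(z, sigma2, thresholds):
        """Pick the soft-threshold level minimizing Stein's unbiased risk estimate."""
        n = z.size
        best_t, best_risk = 0.0, np.inf
        for t in thresholds:
            risk = (n * sigma2
                    + np.sum(np.minimum(z ** 2, t ** 2))
                    - 2.0 * sigma2 * np.sum(np.abs(z) <= t))
            if risk < best_risk:
                best_t, best_risk = t, risk
        return best_t

    def soft(z, t):
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    # Coefficients of the one-way-layout means in a (hypothetical) orthonormal
    # penalty basis Q built from second differences; simulated directly here.
    rng = np.random.default_rng(1)
    z = np.concatenate([rng.normal(3.0, 1.0, 5), rng.normal(0.0, 1.0, 45)])
    t_hat = sure_soft_threshold(z, sigma2=1.0, thresholds=np.linspace(0.0, 4.0, 81))
    z_hat = soft(z, t_hat)   # estimated coefficients; map back through Q to get the means
    ```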

    LASSO ISOtone for High Dimensional Additive Isotonic Regression

    Additive isotonic regression attempts to determine the relationship between a multi-dimensional observation variable and a response, under the constraint that the estimate is the additive sum of univariate component effects that are monotonically increasing. In this article, we present a new method for such regression called LASSO Isotone (LISO). LISO adapts ideas from sparse linear modelling to additive isotonic regression. Thus, it is viable in many situations with high dimensional predictor variables, where selection of significant versus insignificant variables is required. We suggest an algorithm involving a modification of the backfitting algorithm CPAV. We give a numerical convergence result, and finally examine some of its properties through simulations. We also suggest some possible extensions that improve performance, and allow calculation to be carried out when the direction of the monotonicity is unknown.
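
    A minimal backfitting sketch under stated assumptions: every component is taken to be increasing, each univariate fit uses scikit-learn's IsotonicRegression, and a crude block soft-shrinkage of each fitted component stands in for LISO's penalized CPAV update. The function names and the penalty parameter lam are illustrative, not the paper's.

    ```python
    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    def backfit_isotonic(X, y, lam=0.1, n_iter=20):
        n, p = X.shape
        f = np.zeros((n, p))                      # fitted component effects
        intercept = y.mean()
        for _ in range(n_iter):
            for j in range(p):
                resid = y - intercept - f.sum(axis=1) + f[:, j]
                iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
                fj = iso.fit_transform(X[:, j], resid)
                fj -= fj.mean()                   # center each component for identifiability
                # Block soft-shrinkage toward zero lets whole components vanish (sparsity).
                rms = max(np.linalg.norm(fj) / np.sqrt(n), 1e-12)
                f[:, j] = max(0.0, 1.0 - lam / rms) * fj
        return intercept, f

    rng = np.random.default_rng(2)
    X = rng.uniform(size=(200, 10))
    y = 2 * X[:, 0] ** 2 + np.log1p(5 * X[:, 1]) + rng.normal(scale=0.1, size=200)
    intercept, f = backfit_isotonic(X, y)
    print([round(np.ptp(f[:, j]), 2) for j in range(10)])   # near-zero range => component dropped
    ```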

    FAASTA: A fast solver for total-variation regularization of ill-conditioned problems with application to brain imaging

    The total variation (TV) penalty, like many other analysis-sparsity penalties, does not lead to separable factors or a proximal operator with a closed-form expression, such as soft thresholding for the ℓ₁ penalty. As a result, in a variational formulation of an inverse problem or statistical-learning estimation, it leads to challenging non-smooth optimization problems that are often solved with elaborate single-step first-order methods. When the data-fit term arises from empirical measurements, as in brain imaging, it is often very ill-conditioned and without simple structure. In this situation, in proximal splitting methods, the computational cost of the gradient step can easily dominate each iteration. Thus it is beneficial to minimize the number of gradient steps. We present fAASTA, a variant of FISTA, that relies on an internal solver for the TV proximal operator, and refines its tolerance to balance the computational cost of the gradient and the proximal steps. We give benchmarks and illustrations on "brain decoding": recovering brain maps from noisy measurements to predict observed behavior. The algorithm as well as the empirical study of convergence speed are valuable for any non-exact proximal operator, in particular analysis-sparsity problems.
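
    A minimal sketch of the idea, not the authors' implementation: a FISTA loop over a least-squares data fit whose proximal step is itself an inner iterative solver, here a projected-gradient dual solver for the 1-D TV prox, run to a tolerance that is tightened across outer iterations so that cheap, inexact prox steps are used early on. The geometric tolerance schedule and all names below are illustrative assumptions; fAASTA adapts the tolerance from the optimization itself and targets much larger, ill-conditioned brain-imaging problems.

    ```python
    import numpy as np

    def diff_adjoint(u):
        # D^T u for the first-difference operator (D x)_i = x_{i+1} - x_i
        return -np.diff(u, prepend=0.0, append=0.0)

    def prox_tv1d(y, lam, tol, max_inner=2000):
        """Inexact prox of lam * TV(x), via projected gradient on the dual variable."""
        u = np.zeros(y.size - 1)                  # one dual variable per difference
        step = 0.25                               # 1/L with L = ||D D^T|| <= 4
        for _ in range(max_inner):
            x = y - diff_adjoint(u)
            u_new = np.clip(u + step * np.diff(x), -lam, lam)
            if np.max(np.abs(u_new - u)) < tol:
                u = u_new
                break
            u = u_new
        return y - diff_adjoint(u)

    def fista_inexact_tv(A, b, lam, n_outer=100):
        """FISTA whose prox step is the inexact TV prox above, with tightening tolerance."""
        L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the data-fit gradient
        w = np.zeros(A.shape[1])
        z, t, tol = w.copy(), 1.0, 1e-2
        for _ in range(n_outer):
            grad = A.T @ (A @ z - b)
            w_new = prox_tv1d(z - grad / L, lam / L, tol)
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            z = w_new + (t - 1.0) / t_new * (w_new - w)
            w, t = w_new, t_new
            tol *= 0.8                             # crude geometric schedule; fAASTA adapts this
        return w

    rng = np.random.default_rng(3)
    A = rng.normal(size=(80, 200))                 # ill-posed: fewer measurements than unknowns
    x_true = np.zeros(200); x_true[50:120] = 1.0   # piecewise-constant signal
    b = A @ x_true + rng.normal(scale=0.1, size=80)
    print(np.round(fista_inexact_tv(A, b, lam=1.0)[45:55], 2))
    ```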