Local-Aggregate Modeling for Big-Data via Distributed Optimization: Applications to Neuroimaging
Technological advances have led to a proliferation of structured big data
that have matrix-valued covariates. We are specifically motivated to build
predictive models for multi-subject neuroimaging data based on each subject's
brain imaging scans. This is an ultra-high-dimensional problem that consists of
a matrix of covariates (brain locations by time points) for each subject; few
methods currently exist to fit supervised models directly to this tensor data.
We propose a novel modeling and algorithmic strategy to apply generalized
linear models (GLMs) to this massive tensor data in which one set of variables
is associated with locations. Our method begins by fitting GLMs to each
location separately, and then builds an ensemble by blending information across
locations through regularization with what we term an aggregating penalty. Our
so-called Local-Aggregate Model can be fit in a completely distributed manner
over the locations using an Alternating Direction Method of Multipliers (ADMM)
strategy, and thus greatly reduces the computational burden. Furthermore, we
propose to select the appropriate model through a novel sequence of faster
algorithmic solutions that is similar to regularization paths. We will
demonstrate both the computational and predictive modeling advantages of our
methods via simulations and an EEG classification problem. Comment: 41 pages, 5 figures and 3 tables
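The distributed fitting described in this abstract can be illustrated with a toy global-consensus ADMM. This is a minimal sketch under stated assumptions, not the authors' Local-Aggregate Model: each "location" holds a scalar least-squares term, and an ℓ1 penalty on the shared variable stands in for the aggregating penalty, whose actual form the abstract does not specify. The names `soft_threshold` and `consensus_admm` are hypothetical.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.| evaluated at v."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def consensus_admm(a, lam, rho=1.0, iters=200):
    """Toy problem: minimize sum_i 0.5*(x_i - a_i)^2 + lam*|z| subject to x_i = z.

    Scaled-form ADMM. The x-updates are independent across i, so in a real
    setting each one could run on a separate worker (one per location).
    """
    n = len(a)
    u = [0.0] * n   # scaled dual variables, one per local problem
    z = 0.0         # shared (consensus) variable
    x = [0.0] * n
    for _ in range(iters):
        # Local step: closed-form minimizer of 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
        x = [(a_i + rho * (z - u_i)) / (1.0 + rho) for a_i, u_i in zip(a, u)]
        # Aggregation step: average the local estimates, then apply the penalty's prox
        avg = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n
        z = soft_threshold(avg, lam / (n * rho))
        # Dual step: accumulate each local disagreement with the consensus
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z, x


z, x = consensus_admm([1.0, 2.0, 3.0], lam=1.5)
```

For this toy objective the minimizer is available in closed form (soft-thresholding the mean of `a` at `lam / n`), which makes it easy to check that the iteration converges to the right point.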
A second derivative SQP method: theoretical issues
Sequential quadratic programming (SQP) methods form a class of highly efficient algorithms for solving nonlinearly constrained optimization problems. Although second derivative information may often be calculated, there is little practical theory that justifies exact-Hessian SQP methods. In particular, the resulting quadratic programming (QP) subproblems are often nonconvex, and thus finding their global solutions may be computationally nonviable. This paper presents a second-derivative SQP method based on quadratic subproblems that are either convex, and thus may be solved efficiently, or need not be solved globally. Additionally, an explicit descent constraint is imposed on certain QP subproblems, which “guides” the iterates through areas in which nonconvexity is a concern. Global convergence of the resulting algorithm is established.
A second derivative SQP method: local convergence
In [19], we gave global convergence results for a second-derivative SQP method for minimizing the exact ℓ1-merit function for a fixed value of the penalty parameter. To establish this result, we used the properties of the so-called Cauchy step, which was itself computed from the so-called predictor step. In addition, we allowed for the computation of a variety of (optional) SQP steps that were intended to improve the efficiency of the algorithm.
Although we established global convergence of the algorithm, we did not discuss certain aspects that are critical when developing software capable of solving general optimization problems. In particular, we must have strategies for updating the penalty parameter and better techniques for defining the positive-definite matrix Bk used in computing the predictor step. In this paper we address both of these issues. We consider two techniques for defining the positive-definite matrix Bk: a simple diagonal approximation and a more sophisticated limited-memory BFGS update. We also analyze a strategy for updating the penalty parameter based on approximately minimizing the ℓ1-penalty function over a sequence of increasing values of the penalty parameter.
Algorithms based on exact penalty functions have certain desirable properties. To be practical, however, these algorithms must be guaranteed to avoid the so-called Maratos effect. We show that a nonmonotone variant of our algorithm avoids this phenomenon and, therefore, results in asymptotically superlinear local convergence; this is verified by preliminary numerical results on the Hock and Schittkowski test set.
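For orientation, the ℓ1 exact penalty (merit) function that this abstract refers to can be written in its textbook form as follows. The notation is an assumption, since the abstract does not state the problem formulation: f is the objective, c_E and c_I collect equality constraints c_E(x) = 0 and inequality constraints c_I(x) ≥ 0, σ is the penalty parameter, and the max is taken componentwise. The second line sketches the penalty-update idea mentioned in the abstract, again only schematically.

```latex
\phi(x;\sigma) \;=\; f(x) \;+\; \sigma \Bigl( \lVert c_E(x) \rVert_1
    \;+\; \lVert \max\bigl(0,\, -c_I(x)\bigr) \rVert_1 \Bigr),
\qquad \sigma > 0,

x_j \;\approx\; \arg\min_x \; \phi(x;\sigma_j),
\qquad \sigma_1 < \sigma_2 < \cdots
```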
Exact block-wise optimization in group lasso and sparse group lasso for linear regression
The group lasso is a penalized regression method, used in regression problems
where the covariates are partitioned into groups to promote sparsity at the
group level. Existing methods for finding the group lasso estimator either use
gradient projection methods to update the entire coefficient vector
simultaneously at each step, or update one group of coefficients at a time
using an inexact line search to approximate the optimal value for the group of
coefficients when all other groups' coefficients are fixed. We present a new
method of computation for the group lasso in the linear regression case, the
Single Line Search (SLS) algorithm, which operates by computing the exact
optimal value for each group (when all other coefficients are fixed) with one
univariate line search. We perform simulations demonstrating that the SLS
algorithm is often more efficient than existing computational methods. We also
extend the SLS algorithm to the sparse group lasso problem via the Signed
Single Line Search (SSLS) algorithm, and give theoretical results to support
both algorithms. Comment: We have been made aware of the earlier work by Puig et al. (2009)
which derives the same result for the (non-sparse) group lasso setting. We
leave this manuscript available as a technical report, to serve as a
reference for the previously untreated sparse group lasso case, and for
timing comparisons of various methods in the group lasso setting. The
manuscript is updated to include this reference.
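The idea of an exact block update via a single univariate search can be sketched as follows. This is an illustration of the standard group lasso block subproblem, not the authors' SLS implementation. It assumes the block objective 0.5*||r - X_g b||^2 + lam*||b||_2, and that the caller has already computed the eigendecomposition X_g^T X_g = V diag(d) V^T and the rotated vector c = V^T X_g^T r. The function name `exact_group_update` and its arguments are hypothetical. Stationarity gives (X_g^T X_g + (lam/t) I) b = X_g^T r with t = ||b||_2, so t solves the scalar equation sum_i c_i^2 / (d_i*t + lam)^2 = 1.

```python
import math


def exact_group_update(d, c, lam, tol=1e-12):
    """Return V^T b for the block minimizer b, given eigenvalues d and rotated c.

    Assumes lam > 0, d_i >= 0, and c_i == 0 wherever d_i == 0 (which holds
    automatically when c = V^T X_g^T r).
    """
    # The whole group is zeroed out when ||X_g^T r||_2 = ||c||_2 <= lam.
    if math.sqrt(sum(ci * ci for ci in c)) <= lam:
        return [0.0] * len(c)
    if not any(di > 0.0 and ci != 0.0 for di, ci in zip(d, c)):
        raise ValueError("inputs are inconsistent with c = V^T X_g^T r")

    def phi(t):
        # Strictly decreasing in t; its unique root is the norm of the solution.
        return sum((ci / (di * t + lam)) ** 2 for di, ci in zip(d, c)) - 1.0

    # Bracket the root, then bisect (any univariate root finder would do).
    lo, hi = 0.0, 1.0
    while phi(hi) > 0.0:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    # b_i = c_i / (d_i + lam / t), written to avoid dividing by t.
    return [ci * t / (di * t + lam) for di, ci in zip(d, c)]


coef = exact_group_update([1.0, 1.0], [3.0, 4.0], lam=1.0)
```

When all eigenvalues equal 1 (orthonormal columns within the group), the result reduces to the familiar group soft-thresholding formula (1 - lam/||c||) * c, which gives a quick sanity check. Multiplying the returned vector by V recovers the coefficients in the original coordinates.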