On the Quality of a Semidefinite Programming Bound for Sparse Principal Component Analysis
We examine the problem of approximating a positive semidefinite matrix by a
dyad xx^T, with a penalty on the cardinality of the vector x. This problem
arises in sparse principal component analysis, where a decomposition of the
matrix involving sparse factors is sought. We express this
hard, combinatorial problem as a maximum eigenvalue problem, in which we seek
to maximize, over a box, the largest eigenvalue of a symmetric matrix that is
linear in the variables. This representation allows us to use techniques from
robust optimization to derive a bound based on semidefinite programming. The
quality of the bound is investigated using a technique inspired by Nemirovski
and Ben-Tal (2002).
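The semidefinite bound described above can be pictured with a small convex program. The sketch below, written with cvxpy, solves a relaxation of this flavor: maximize the explained variance Tr(Sigma X) minus an l1 penalty over the set {X ⪰ 0, Tr X = 1}. The matrix Sigma, the penalty weight rho, and the exact penalty form are illustrative assumptions, not the paper's precise formulation.

```python
# Minimal sketch of a semidefinite relaxation in the spirit of sparse PCA bounds.
# Sigma, rho and the l1 penalty form are illustrative assumptions.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 8
B = rng.standard_normal((n, n))
Sigma = B @ B.T                      # a positive semidefinite matrix
rho = 1.0                            # weight on the cardinality surrogate

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.trace(X) == 1]
# Tr(Sigma X) rewards explained variance; the l1 term promotes sparsity of X.
objective = cp.Maximize(cp.trace(Sigma @ X) - rho * cp.sum(cp.abs(X)))
bound = cp.Problem(objective, constraints).solve()
print("semidefinite upper bound:", bound)
```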
A Semidefinite Relaxation for Air Traffic Flow Scheduling
We first formulate the problem of optimally scheduling air traffic flow with
sector capacity constraints as a mixed integer linear program. We then use
semidefinite relaxation techniques to form a convex relaxation of that problem.
Finally, we present a randomization algorithm to further improve the quality of
the solution. Because of the specific structure of the air traffic flow
problem, the relaxation has a single semidefinite constraint of size dn where d
is the maximum delay and n the number of flights.
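As a generic illustration of the relax-and-round pattern used here (a semidefinite relaxation followed by randomization), the sketch below rounds the solution of an SDP relaxation of a small binary quadratic problem. It is not the air traffic formulation itself, whose variables encode delays and sector capacities; the matrix Q and the {-1,+1} encoding are assumptions made for the example.

```python
# Generic relax-and-round sketch: SDP relaxation of max x^T Q x over x in {-1,+1}^n,
# followed by Gaussian randomized rounding. Illustrative only; not the air traffic model.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
n = 6
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2                    # symmetric objective matrix (assumption)

X = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(Q @ X)),
                  [X >> 0, cp.diag(X) == 1])
prob.solve()

# Randomization: sample Gaussian vectors with covariance X, take signs,
# and keep the best rounded point found.
Xv = X.value
samples = rng.multivariate_normal(np.zeros(n), Xv, size=100)
candidates = np.sign(samples)
candidates[candidates == 0] = 1
values = np.einsum("ij,jk,ik->i", candidates, Q, candidates)
best = candidates[np.argmax(values)]
print("relaxation value:", prob.value, " best rounded value:", values.max())
```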
Static Arbitrage Bounds on Basket Option Prices
We consider the problem of computing upper and lower bounds on the price of a
European basket call option, given prices on other similar baskets. Although
this problem is very hard to solve exactly in the general case, we show that in
some instances the upper and lower bounds can be computed via simple
closed-form expressions, or linear programs. We also introduce an efficient
linear programming relaxation of the general problem based on an integral
transform interpretation of the call price function. We show that this
relaxation is tight in some of the special cases examined before.
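To give a rough sense of the linear-programming flavor of such bounds, the sketch below computes an approximate super-replication upper bound on a single-asset call given the prices of calls at other strikes: a static portfolio of cash and the traded calls is required to dominate the target payoff on a finite grid, plus a slope condition at infinity. The strikes, prices, grid, and zero interest rate are made-up inputs, and enforcing domination only on a grid is a simplification; the paper treats the general multi-asset problem and an exact transform-based relaxation.

```python
# Sketch: LP upper bound on a call price by static super-replication with
# cash and calls at other strikes. Inputs and the grid are made-up; domination
# is only enforced on the grid, so this is an approximation of the exact bound.
import numpy as np
from scipy.optimize import linprog

strikes = np.array([90.0, 110.0])     # strikes of traded calls (assumption)
prices = np.array([14.0, 4.0])        # their observed prices (assumption)
K = 100.0                             # strike of the call to bound

grid = np.linspace(0.0, 300.0, 601)   # grid of terminal asset values
payoff = lambda s, k: np.maximum(s - k, 0.0)

# Decision variables: x = [cash, position in call(90), position in call(110)].
# Domination: cash + sum_i x_i * (s - K_i)+ >= (s - K)+ at every grid point s,
# plus a slope condition sum_i x_i >= 1 so domination holds as s -> infinity.
A_ub = np.column_stack([-np.ones_like(grid)] +
                       [-payoff(grid, k) for k in strikes])
b_ub = -payoff(grid, K)
A_ub = np.vstack([A_ub, [0.0, -1.0, -1.0]])
b_ub = np.append(b_ub, -1.0)

c = np.concatenate([[1.0], prices])   # portfolio cost = cash + sum_i x_i * price_i
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 3)
print("approximate upper bound on the call price:", res.fun)
```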
A Convex Upper Bound on the Log-Partition Function for Binary Graphical Models
Sparse Covariance Selection via Robust Maximum Likelihood Estimation
We address a problem of covariance selection, where we seek a trade-off
between a high likelihood and a small number of non-zero elements in the
inverse covariance matrix. We solve a maximum likelihood problem with a penalty
term given by the sum of absolute values of the elements of the inverse
covariance matrix, and allow for imposing bounds on the condition number of the
solution. The problem is directly amenable to now standard interior-point
algorithms for convex optimization, but remains challenging due to its size. We
first give some results on the theoretical computational complexity of the
problem, by showing that a recent methodology for non-smooth convex
optimization due to Nesterov can be applied to this problem, to greatly improve
on the complexity estimate given by interior-point algorithms. We then examine
two practical algorithms aimed at solving large-scale, noisy (hence dense)
instances: one is based on a block-coordinate descent approach, where columns
and rows are updated sequentially; the other applies a dual version of Nesterov's
method.
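Up to the condition-number bound, the penalized maximum-likelihood formulation described here is the estimator implemented in scikit-learn's GraphicalLasso. The snippet below uses that implementation on synthetic data as a stand-in for the paper's algorithms; the data and the penalty value alpha are assumptions, and no condition-number constraint is imposed.

```python
# l1-penalized inverse covariance (sparse covariance selection) via scikit-learn.
# Stand-in for the paper's solvers; data and alpha are made-up, and no
# condition-number bound is imposed here.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
n_samples, n_features = 200, 20
X = rng.standard_normal((n_samples, n_features))

model = GraphicalLasso(alpha=0.2).fit(X)      # alpha is the l1 penalty weight
precision = model.precision_                  # estimated inverse covariance
print("nonzero entries in the precision matrix:",
      int(np.count_nonzero(np.abs(precision) > 1e-6)))
```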
Safe Feature Elimination for the LASSO and Sparse Supervised Learning Problems
We describe a fast method to eliminate features (variables) in l1-penalized
least-squares regression (or LASSO) problems. The elimination of features leads
to a potentially substantial reduction in running time, especially for large
values of the penalty parameter. Our method is not heuristic: it only
eliminates features that are guaranteed to be absent after solving the LASSO
problem. The feature elimination step is easy to parallelize and can test each
feature for elimination independently. Moreover, the computational effort of
our method is negligible compared to that of solving the LASSO problem:
roughly, it is the same as a single gradient step. Our method extends the scope of
existing LASSO algorithms to treat larger data sets, previously out of their
reach. We show how our method can be extended to general l1-penalized convex
problems and present preliminary results for the Sparse Support Vector Machine
and Logistic Regression problems.
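The sketch below implements a basic SAFE-style screening test of the kind described above, in the commonly quoted form for the lasso min_w 0.5*||Xw - y||^2 + lam*||w||_1: a feature is discarded when its correlation with the response falls below a threshold built from the penalty level. The data, the penalty level, and the exact constants are assumptions; the paper derives the precise test and its extensions.

```python
# Sketch of a basic SAFE-style screening test for the lasso
#   min_w 0.5*||X w - y||^2 + lam*||w||_1.   Data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

corr = np.abs(X.T @ y)                       # |x_j^T y| for every feature
lam_max = corr.max()                         # smallest penalty giving w = 0
lam = 0.9 * lam_max                          # penalty level (assumption)
col_norms = np.linalg.norm(X, axis=0)
y_norm = np.linalg.norm(y)

# Feature j is provably inactive at the lasso solution if
# |x_j^T y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max.
threshold = lam - col_norms * y_norm * (lam_max - lam) / lam_max
inactive = corr < threshold
print(f"screened out {inactive.sum()} of {p} features before solving the lasso")
```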
Optimal Solutions for Sparse Principal Component Analysis
Given a sample covariance matrix, we examine the problem of maximizing the
variance explained by a linear combination of the input variables while
constraining the number of nonzero coefficients in this combination. This is
known as sparse principal component analysis and has a wide array of
applications in machine learning and engineering. We formulate a new
semidefinite relaxation to this problem and derive a greedy algorithm that
computes a full set of good solutions for all target numbers of nonzero
coefficients, with total complexity O(n^3), where n is the number of variables.
We then use the same relaxation to derive sufficient conditions for global
optimality of a solution, which can be tested in O(n^3) per pattern. We discuss
applications in subset selection and sparse recovery and show on artificial
examples and biological data that our algorithm does provide globally optimal
solutions in many cases.
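A minimal sketch of the greedy pass described above: starting from the best single variable, repeatedly add the variable whose inclusion most increases the largest eigenvalue of the corresponding principal submatrix of the covariance. This naive version recomputes eigenvalues from scratch and so does not attain the paper's total O(n^3) complexity; the covariance matrix is synthetic.

```python
# Naive greedy forward selection for sparse PCA: grow the support one variable
# at a time, each time adding the index that maximizes the top eigenvalue of
# the covariance submatrix. Synthetic data; not the paper's O(n^3) implementation.
import numpy as np

rng = np.random.default_rng(0)
n = 15
B = rng.standard_normal((n, n))
Sigma = B @ B.T                              # sample covariance (assumption)

def top_eig(S):
    return np.linalg.eigvalsh(S)[-1]

support, remaining, path = [], list(range(n)), []
while remaining:
    scores = [top_eig(Sigma[np.ix_(support + [j], support + [j])])
              for j in remaining]
    best = remaining[int(np.argmax(scores))]
    support.append(best)
    remaining.remove(best)
    path.append((len(support), max(scores)))   # best support found per cardinality

for k, var in path[:5]:
    print(f"cardinality {k}: explained variance {var:.3f}")
```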
Model Selection Through Sparse Maximum Likelihood Estimation
We consider the problem of estimating the parameters of a Gaussian or binary
distribution in such a way that the resulting undirected graphical model is
sparse. Our approach is to solve a maximum likelihood problem with an added
l_1-norm penalty term. The problem as formulated is convex but the memory
requirements and complexity of existing interior point methods are prohibitive
for problems with more than tens of nodes. We present two new algorithms for
solving problems with at least a thousand nodes in the Gaussian case. Our first
algorithm uses block coordinate descent, and can be interpreted as recursive
l_1-norm penalized regression. Our second algorithm, based on Nesterov's first
order method, yields a complexity estimate with a better dependence on problem
size than existing interior point methods. Using a log determinant relaxation
of the log partition function (Wainwright & Jordan, 2006), we show that these
same algorithms can be used to solve an approximate sparse maximum likelihood
problem for the binary case. We test our algorithms on synthetic data, as well
as on gene expression and Senate voting records data.
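For reference, the quantity being maximized in the Gaussian case is the l1-penalized log-likelihood below (up to constants). The sketch simply evaluates it for a candidate precision matrix; the data and penalty are synthetic, and some variants exclude the diagonal from the penalty.

```python
# The penalized objective for the Gaussian case (up to constants):
#   log det(Theta) - trace(S Theta) - lam * sum_ij |Theta_ij|
# where S is the sample covariance and Theta the candidate precision matrix.
import numpy as np

def penalized_loglik(Theta, S, lam):
    sign, logdet = np.linalg.slogdet(Theta)
    if sign <= 0:
        return -np.inf                       # Theta must be positive definite
    return logdet - np.trace(S @ Theta) - lam * np.abs(Theta).sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
S = np.cov(X, rowvar=False)
print(penalized_loglik(np.eye(10), S, lam=0.1))   # score of the identity guess
```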
First-order methods for sparse covariance selection
Given a sample covariance matrix, we solve a maximum likelihood problem
penalized by the number of nonzero coefficients in the inverse covariance
matrix. Our objective is to find a sparse representation of the sample data and
to highlight conditional independence relationships between the sample
variables. We first formulate a convex relaxation of this combinatorial
problem, then detail two efficient first-order algorithms with low memory
requirements to solve large-scale, dense problem instances.
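The sketch below shows one generic first-order scheme for this problem, a proximal-gradient iteration on the penalized objective with a crude step-size check to keep the iterate positive definite. It is an illustration of the first-order flavor only, not the specific algorithms of the paper, and the step size, penalty, and data are assumptions.

```python
# Generic proximal-gradient sketch for
#   maximize  log det(Theta) - trace(S Theta) - lam * ||Theta||_1 (elementwise),
# with backtracking only to keep Theta positive definite. Illustrative sketch.
import numpy as np

def soft_threshold(A, t):
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def sparse_precision(S, lam, step=0.1, iters=500):
    Theta = np.eye(S.shape[0])
    for _ in range(iters):
        grad = np.linalg.inv(Theta) - S            # gradient of the smooth part
        t = step
        while True:
            candidate = soft_threshold(Theta + t * grad, t * lam)
            # shrink the step until the candidate stays positive definite
            if np.linalg.eigvalsh(candidate)[0] > 1e-8:
                break
            t *= 0.5
        Theta = candidate
    return Theta

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 8))
S = np.cov(X, rowvar=False)
Theta = sparse_precision(S, lam=0.2)
print("nonzeros in the estimated precision:",
      int(np.count_nonzero(np.abs(Theta) > 1e-6)))
```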
Robust sketching for multiple square-root LASSO problems
Many learning tasks, such as cross-validation, parameter search, or
leave-one-out analysis, involve multiple instances of similar problems, each
instance sharing a large part of learning data with the others. We introduce a
robust framework for solving multiple square-root LASSO problems, based on a
sketch of the learning data that uses low-rank approximations. Our approach
allows a dramatic reduction in computational effort, in effect reducing the
number of observations from n (the number of observations to start with) to
k (the number of singular values retained in the low-rank model), while not
sacrificing, and sometimes even improving, the statistical performance.
Theoretical analysis, as well as numerical experiments on both synthetic and
real data, illustrate the efficiency of the method in large-scale applications.
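The sketch below shows the basic ingredients: a low-rank (truncated SVD) sketch of the design matrix and a square-root LASSO solved on it with cvxpy. The robust term that accounts for the approximation error, which is the heart of the paper's framework, is omitted, and the dense rank-k product is formed only for simplicity (in practice the factors would be kept separate to cut the cost). Dimensions, the rank k, the penalty, and the data are made-up.

```python
# Ingredients of the approach: replace the design matrix by a rank-k truncated
# SVD and solve a square-root LASSO,  min ||X w - y||_2 + lam * ||w||_1,  on it.
# The robustness term handling the approximation error is omitted in this sketch.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 1000, 50, 10                       # observations, features, retained rank
X = rng.standard_normal((m, n))
w_true = rng.standard_normal(n) * (rng.random(n) < 0.1)   # sparse ground truth
y = X @ w_true + 0.1 * rng.standard_normal(m)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_lowrank = (U[:, :k] * s[:k]) @ Vt[:k]      # rank-k approximation of X

w = cp.Variable(n)
lam = 0.1
prob = cp.Problem(cp.Minimize(cp.norm(X_lowrank @ w - y, 2) + lam * cp.norm(w, 1)))
prob.solve()
print("objective:", prob.value,
      " nonzeros:", int(np.sum(np.abs(w.value) > 1e-4)))
```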