
    A note on column subset selection

    Given a matrix U, using a deterministic method, we extract a "large" submatrix of U' (whose columns are obtained by normalizing those of U) and estimate its smallest and largest singular values. We apply this result to the study of contact points of the unit ball with its maximal volume ellipsoid. We also consider the paving problem and give a deterministic algorithm to partition a matrix into almost isometric blocks, recovering previous results of Bourgain-Tzafriri and Tropp. Finally, we partially answer a question raised by Naor about finding an algorithm, in the spirit of Batson-Spielman-Srivastava's work, to extract a "large" square submatrix of "small" norm. Comment: 12 pages, International Mathematics Research Notices, 201
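
    The abstract does not spell out the deterministic selection rule, so the following is only a minimal sketch of the setup it describes, assuming NumPy: normalize the columns of U to unit Euclidean norm to form U', take an arbitrary column subset in place of the extracted one, and read off the extreme singular values of the resulting submatrix.

        import numpy as np

        def normalized_submatrix_spectrum(U, S):
            """Smallest and largest singular value of U'[:, S], where U'
            rescales every column of U to unit Euclidean norm."""
            U_prime = U / np.linalg.norm(U, axis=0, keepdims=True)
            sv = np.linalg.svd(U_prime[:, S], compute_uv=False)
            return sv[-1], sv[0]

        rng = np.random.default_rng(0)
        U = rng.standard_normal((50, 20))
        print(normalized_submatrix_spectrum(U, [0, 3, 7, 11]))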

    On the reliable and flexible solution of practical subset regression problems

    A new algorithm for solving subset regression problems is described. The algorithm performs a QR decomposition with a new column-pivoting strategy, which permits subset selection directly from the originally defined regression parameters. This, combined with a number of extensions of the new technique, makes the method a very flexible tool for analyzing subset regression problems in which the parameters have a physical meaning.
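
    As a hedged illustration, the general mechanism can be reproduced with SciPy's standard column-pivoted QR (the paper's pivoting strategy is a custom variant, so this sketch shows only the textbook version): the first k pivots identify a well-conditioned subset of regressors, which is then fit by least squares.

        import numpy as np
        from scipy.linalg import qr

        def qr_subset_regression(X, y, k):
            """Select k columns of X via QR column pivoting, then fit them."""
            _, _, piv = qr(X, mode='economic', pivoting=True)
            subset = np.sort(piv[:k])            # indices of chosen regressors
            coef, *_ = np.linalg.lstsq(X[:, subset], y, rcond=None)
            return subset, coef

        rng = np.random.default_rng(1)
        X = rng.standard_normal((100, 10))
        y = X[:, [2, 5]] @ np.array([1.5, -2.0]) + 0.01 * rng.standard_normal(100)
        print(qr_subset_regression(X, y, 2))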

    Branch and bound method for regression-based controlled variable selection

    Self-optimizing control is a promising method for selection of controlled variables (CVs) from available measurements. Recently, Ye, Cao, Li, and Song (2012) proposed a globally optimal method for selection of self-optimizing CVs by converting the CV selection problem into a regression problem. In this approach, the necessary conditions of optimality (NCO) are approximated by linear combinations of available measurements over the entire operation region. In practice, it is desirable that a subset of the available measurements be combined as CVs to obtain a good trade-off between economic performance and the complexity of the control system. The subset selection problem, however, is combinatorial in nature, which makes the application of the globally optimal CV selection method to large-scale processes difficult. In this work, an efficient branch and bound (BAB) algorithm is developed to handle the computational complexity associated with the selection of globally optimal CVs. The proposed BAB algorithm identifies the best measurement subset such that the regression error in approximating the NCO is minimized, and it is also applicable to the general regression problem. Numerical tests using randomly generated matrices and a binary distillation column case study demonstrate the computational efficiency of the proposed BAB algorithm.
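
    The paper's BAB algorithm is more sophisticated, but the pruning idea can be shown in a few lines. A minimal sketch, assuming NumPy and a plain downward search: the least-squares residual over a candidate set lower-bounds the residual of every subset of it, so any branch whose candidate-set residual already exceeds the incumbent can be discarded.

        import numpy as np

        def residual(X, y, cols):
            """Sum of squared residuals of the least-squares fit on cols."""
            coef, *_ = np.linalg.lstsq(X[:, list(cols)], y, rcond=None)
            r = y - X[:, list(cols)] @ coef
            return float(r @ r)

        def bab_select(X, y, k):
            """Globally optimal size-k column subset by downward branch and bound."""
            best_cost, best_set = np.inf, None

            def branch(cands, start):
                nonlocal best_cost, best_set
                cost = residual(X, y, cands)
                if cost >= best_cost:
                    return               # bound: subsets can only do worse
                if len(cands) == k:
                    best_cost, best_set = cost, list(cands)
                    return
                for i in range(start, len(cands)):
                    # drop one candidate; start index keeps subsets unique
                    branch(cands[:i] + cands[i + 1:], i)

            branch(list(range(X.shape[1])), 0)
            return best_set, best_cost

        rng = np.random.default_rng(2)
        X = rng.standard_normal((60, 8))
        y = X[:, [1, 4, 6]] @ np.array([2.0, -1.0, 0.5])
        print(bab_select(X, y, 3))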

    Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

    Low rank matrix approximation is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors and patterns, and provides concise representations for the data. Research on low rank approximation usually focuses on real matrices. However, in many applications data are binary (categorical) rather than continuous, which leads to the problem of low rank approximation of binary matrices. Here we are given a $d \times n$ binary matrix $A$ and a small integer $k$. The goal is to find two binary matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so that the Frobenius norm of $A - UV$ is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: the $\mathrm{GF}(2)$ model and the Boolean semiring model. Unlike low rank approximation of real matrices, which can be efficiently solved by the Singular Value Decomposition, approximation of a binary matrix is NP-hard even for $k=1$. In this paper, we consider the problem of Column Subset Selection (CSS), in which one low rank factor must be formed by $k$ columns of the data matrix. We characterize the approximation ratio of CSS for binary matrices. For the $\mathrm{GF}(2)$ model, we show the approximation ratio of CSS is bounded by $\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and that this bound is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of one low rank factor are generated by Boolean formulas operating bitwise on columns of the data matrix. We show the approximation ratio of GCSS is bounded by $2^{k-1}+1$, and that the exponential dependence on $k$ is inherent. Comment: 38 pages
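
    For intuition, CSS in the GF(2) model can be solved exactly by brute force on small instances. A hedged sketch assuming NumPy (exponential in both the number of column subsets and $2^k$, so illustrative only): try every set of k columns as U, and assign each column of A the nearest vector, in Hamming distance, among all GF(2) combinations of U's columns.

        from itertools import combinations, product
        import numpy as np

        def gf2_css(A, k):
            """Exact CSS under GF(2): best k columns of A as U, plus the
            coefficient matrix V minimizing the number of mismatched bits."""
            d, n = A.shape
            masks = np.array(list(product([0, 1], repeat=k)), dtype=np.uint8)
            best = (np.inf, None, None)
            for S in combinations(range(n), k):
                U = A[:, list(S)]                # d x k candidate factor
                spans = (U @ masks.T) % 2        # d x 2^k: all GF(2) combos
                # Hamming distance of each column of A to each spanned vector
                dist = (A[:, :, None] != spans[:, None, :]).sum(axis=0)
                err = int(dist.min(axis=1).sum())
                if err < best[0]:
                    V = masks[dist.argmin(axis=1)].T   # k x n over GF(2)
                    best = (err, S, V)
            return best                          # (error, columns, V)

        A = np.array([[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1]], dtype=np.uint8)
        err, S, V = gf2_css(A, 2)
        print(err, S)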