Low-Rank Binary Matrix Approximation in Column-Sum Norm
We consider $\ell_1$-Rank-$r$ Approximation over GF(2), where for a binary
$m\times n$ matrix $A$ and a positive integer $r$, one seeks a binary
$m\times n$ matrix $B$ of rank at most $r$, minimizing the column-sum norm $\|A-B\|_1$. We show that for every $\varepsilon>0$, there is a
randomized $(1+\varepsilon)$-approximation algorithm for $\ell_1$-Rank-$r$
Approximation over GF(2) whose running time is polynomial in the size of the input for every fixed $r$ and $\varepsilon$. This is the first polynomial time approximation scheme
(PTAS) for this problem
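To make the objective concrete, here is a brute-force sketch for the rank-1 case (our illustration, not the paper's algorithm; it is exponential in the number of rows, whereas the PTAS above is polynomial). A binary matrix of GF(2)-rank at most 1 has every column equal to either the zero vector or one fixed vector $u$, so for small matrices one can simply enumerate $u$:

```python
# Brute-force exact solver for the rank-1 case of l1-Rank-r Approximation
# over GF(2), for illustration only (exponential in the number of rows m).
# A rank-<=1 binary matrix B over GF(2) has every column equal to either
# the all-zero vector or a fixed vector u, so we enumerate all candidates u.
import itertools
import numpy as np

def rank1_column_sum_approx(A):
    """Return (cost, B) minimizing the column-sum norm of A - B, i.e.
    the maximum per-column number of mismatches, over all binary B of
    GF(2)-rank at most 1."""
    m, n = A.shape
    best_cost, best_B = None, None
    for u in itertools.product([0, 1], repeat=m):
        u = np.array(u)
        B = np.zeros_like(A)
        for j in range(n):
            # each column of B is independently 0 or u; pick the closer one
            if np.sum(A[:, j] != u) < np.sum(A[:, j]):
                B[:, j] = u
        cost = int(np.max(np.sum(A != B, axis=0)))  # column-sum norm of A - B
        if best_cost is None or cost < best_cost:
            best_cost, best_B = cost, B
    return best_cost, best_B

A = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])
cost, B = rank1_column_sum_approx(A)  # A has 3 distinct nonzero columns,
                                      # so cost 0 is impossible; best is 1
```

Since the columns of $B$ are chosen independently for a fixed $u$, the greedy per-column choice is exact for that $u$; the outer enumeration then makes the whole search exact.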
Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations
Low rank matrix approximation is an important tool in machine learning. Given
a data matrix, low rank approximation helps to find factors, patterns and
provides concise representations for the data. Research on low rank
approximation usually focuses on real matrices. However, in many applications
data are binary (categorical) rather than continuous. This leads to the problem
of low rank approximation of binary matrices. Here we are given a
binary $m\times n$ matrix $A$ and a small integer $k$. The goal is to find two binary
matrices $B$ and $C$ of sizes $m\times k$ and $k\times n$ respectively, so
that the Frobenius norm of $A - BC$ is minimized. There are two models of this
problem, depending on the definition of the dot product of binary vectors: the
GF(2) model and the Boolean semiring model. Unlike low rank
approximation of real matrices, which can be efficiently solved by Singular Value
Decomposition, approximation of binary matrices is NP-hard even for $k=1$.
In this paper, we consider the problem of Column Subset Selection (CSS), in
which one low rank matrix must be formed by columns of the data matrix. We
characterize the approximation ratio of CSS for binary matrices. For the GF(2)
model, we show the approximation ratio of CSS is bounded by
$\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and this bound is asymptotically tight. For the
Boolean model, it turns out that CSS is no longer sufficient to obtain a bound.
We then develop a Generalized CSS (GCSS) procedure in which the columns of one
low rank matrix are generated from Boolean formulas operating bitwise on
columns of the data matrix. We show the approximation ratio of GCSS is bounded
by $2^{k-1}+1$, and the exponential dependency on $k$ is inherent. Comment: 38 pages
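The CSS objective in the GF(2) model can be evaluated exhaustively on tiny instances; the sketch below (our own, exponential-time, for illustration only) makes the search space explicit: choose $k$ columns of $A$ as the left factor, then fit every column of $A$ by its best GF(2)-linear combination of the chosen columns.

```python
# Exhaustive Column Subset Selection (CSS) under the GF(2) model:
# pick k columns of A as the left factor B, then fit each column of A by
# the best of the 2^k GF(2) combinations of the selected columns.
# Exponential in n and k; illustration only.
import itertools
import numpy as np

def css_gf2(A, k):
    """Return (error, columns): the minimum squared Frobenius error
    (= number of mismatched entries) and the best column subset."""
    m, n = A.shape
    best_err, best_cols = None, None
    for S in itertools.combinations(range(n), k):
        B = A[:, S]  # m x k candidate left factor
        err = 0
        for j in range(n):
            # best GF(2) combination of the selected columns for column j
            err += min(
                int(np.sum((B @ np.array(c) % 2) != A[:, j]))
                for c in itertools.product([0, 1], repeat=k)
            )
        if best_err is None or err < best_err:
            best_err, best_cols = err, S
    return best_err, best_cols

A = np.eye(3, dtype=int)
err, cols = css_gf2(A, 2)  # two identity columns cannot span the third,
                           # so exactly one entry stays mismatched
```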
Theory and implementation of $\mathcal{H}$-matrix based iterative and direct solvers for Helmholtz and elastodynamic oscillatory kernels
In this work, we study the accuracy and efficiency of hierarchical matrix
($\mathcal{H}$-matrix) based fast methods for solving dense linear systems
arising from the discretization of the 3D elastodynamic Green's tensors. It is
well known in the literature that standard $\mathcal{H}$-matrix based methods,
although very efficient tools for asymptotically smooth kernels, are not
optimal for oscillatory kernels. $\mathcal{H}^2$-matrix and directional
approaches have been proposed to overcome this problem. However, the
implementation of such methods is much more involved than the standard
$\mathcal{H}$-matrix representation. The central questions we address are
twofold. (i) What is the frequency-range in which the $\mathcal{H}$-matrix
format is an efficient representation for 3D elastodynamic problems? (ii) What
can be expected of such an approach to model problems in mechanical
engineering? We show that even though the method is not optimal (in the sense
that more involved representations can lead to faster algorithms) an efficient
solver can be easily developed. The capabilities of the method are illustrated
on numerical examples using the Boundary Element Method
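The gap between asymptotically smooth and oscillatory kernels is easy to observe numerically: the $\varepsilon$-rank of a well-separated interaction block stays small for the Laplace kernel $1/r$ but grows with the wavenumber for a Helmholtz-type kernel $e^{\mathrm{i}kr}/r$. A small sketch (our own setup and parameters, not the paper's code):

```python
# Numerical epsilon-rank of an off-diagonal kernel interaction block:
# small for the smooth Laplace kernel 1/r, growing with the wavenumber k
# for the oscillatory Helmholtz kernel exp(i*k*r)/r.
import numpy as np

def eps_rank(M, tol=1e-6):
    """Number of singular values above tol relative to the largest."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
# two well-separated point clouds (unit cubes with centers 5 apart)
X = rng.random((200, 3))
Y = rng.random((200, 3)) + np.array([5.0, 0.0, 0.0])
r = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2)

rank_laplace = eps_rank(1.0 / r)
rank_helmholtz = [eps_rank(np.exp(1j * k * r) / r) for k in (1.0, 20.0)]
# the Helmholtz rank at k=20 clearly exceeds both the low-frequency
# Helmholtz rank and the Laplace rank
```

This is exactly why the $\mathcal{H}$-matrix format degrades as the frequency rises, motivating the paper's question (i) about the usable frequency range.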
New efficient algorithms for multiple change-point detection with kernels
Several statistical approaches based on reproducing kernels have been
proposed to detect abrupt changes arising in the full distribution of the
observations and not only in the mean or variance. Some of these approaches
enjoy good statistical properties (oracle inequality, \ldots). Nonetheless,
they have a high computational cost both in terms of time and memory. This
makes their application difficult even for small and medium sample sizes. This computational issue is addressed by first describing a new
efficient and exact algorithm for kernel multiple change-point detection with
an improved worst-case complexity that is quadratic in time and linear in
space. It allows dealing with medium-size signals.
Second, a faster but approximate algorithm is described. It is based on a
low-rank approximation to the Gram matrix. It is linear in time and space. This
approximate algorithm can be applied to large-scale signals.
These exact and approximation algorithms have been implemented in \texttt{R}
and \texttt{C} for various kernels. The computational and statistical
performances of these new algorithms have been assessed through empirical
experiments. The runtime of the new algorithms is observed to be faster than
that of other considered procedures. Finally, simulations confirmed the higher
statistical accuracy of kernel-based approaches to detect changes that are not
only in the mean. These simulations also illustrate the flexibility of
kernel-based approaches to analyze complex biological profiles made of DNA copy
number and allele B frequencies. An R package implementing the approach will be
made available on GitHub.
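The kernel change-point objective these algorithms optimize can be written, for a segment, as $\sum_t k(x_t,x_t) - \frac{1}{\ell}\sum_{s,t} k(x_s,x_t)$; a naive dynamic program over this cost (our illustration, far slower than the paper's improved algorithms) looks like:

```python
# Naive kernel multiple change-point detection by dynamic programming.
# Segment cost: sum_t k(x_t,x_t) - (1/len) * sum_{s,t in seg} k(x_s,x_t).
# The paper contributes an exact algorithm with much better complexity;
# this direct version is for illustration on short signals only.
import numpy as np

def gaussian_gram(x, bw=1.0):
    d = x[:, None] - x[None, :]
    return np.exp(-(d ** 2) / (2 * bw ** 2))

def kernel_changepoints(x, n_bkps):
    """Return the n_bkps change-point indices minimizing the total cost."""
    n = len(x)
    K = gaussian_gram(x)

    def cost(s, e):  # kernel cost of segment x[s:e]
        sub = K[s:e, s:e]
        return np.trace(sub) - sub.sum() / (e - s)

    INF = float("inf")
    # D[m][e] = best cost of splitting x[:e] into m segments
    D = [[INF] * (n + 1) for _ in range(n_bkps + 2)]
    arg = [[0] * (n + 1) for _ in range(n_bkps + 2)]
    D[0][0] = 0.0
    for m in range(1, n_bkps + 2):
        for e in range(m, n + 1):
            for s in range(m - 1, e):
                c = D[m - 1][s] + cost(s, e)
                if c < D[m][e]:
                    D[m][e], arg[m][e] = c, s
    # backtrack the segment boundaries
    bkps, e = [], n
    for m in range(n_bkps + 1, 0, -1):
        bkps.append(e)
        e = arg[m][e]
    return sorted(bkps)[:-1]  # drop the trailing boundary n

x = np.concatenate([np.zeros(20), 5 * np.ones(20)])
# a single mean shift at index 20 is recovered exactly
```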
Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization
The affine rank minimization problem consists of finding a matrix of minimum
rank that satisfies a given system of linear equality constraints. Such
problems have appeared in the literature of a diverse set of fields including
system identification and control, Euclidean embedding, and collaborative
filtering. Although specific instances can often be solved with specialized
algorithms, the general affine rank minimization problem is NP-hard. In this
paper, we show that if a certain restricted isometry property holds for the
linear transformation defining the constraints, the minimum rank solution can
be recovered by solving a convex optimization problem, namely the minimization
of the nuclear norm over the given affine space. We present several random
ensembles of equations where the restricted isometry property holds with
overwhelming probability. The techniques used in our analysis have strong
parallels in the compressed sensing framework. We discuss how affine rank
minimization generalizes this pre-existing concept and outline a dictionary
relating concepts from cardinality minimization to those of rank minimization.
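As a toy illustration of nuclear norm minimization at work, the sketch below completes a low-rank matrix from partial observations by alternating re-imposition of the data with singular-value soft-thresholding, i.e. the proximal operator of the nuclear norm. (This is the Soft-Impute iteration of Mazumder et al., a related first-order method, not this paper's analysis; the matrix size, sampling rate, and threshold are our own choices.)

```python
# Nuclear norm machinery: ||X||_* = sum of singular values is the convex
# surrogate for rank, as the l1 norm is for cardinality. Its proximal
# operator soft-thresholds the singular values, and iterating it against
# the observed entries completes a low-rank matrix (Soft-Impute).
import numpy as np

def nuclear_norm(X):
    """Sum of singular values: the convex surrogate for rank."""
    return float(np.sum(np.linalg.svd(X, compute_uv=False)))

def svt(X, tau):
    """Proximal operator of tau * ||.||_*: shrink each singular value."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(1)
M = np.outer(rng.random(12), rng.random(12))   # ground truth, rank 1
mask = rng.random(M.shape) < 0.6               # ~60% of entries observed
X = np.zeros_like(M)
for _ in range(500):
    # re-impose observed entries, then shrink singular values
    X = svt(np.where(mask, M, X), tau=0.01)
rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
# the rank-1 matrix is recovered to small relative error
```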