Computational Complexity for Physicists
These lecture notes are an informal introduction to the theory of
computational complexity and its links to quantum computing and statistical
mechanics. Comment: references updated, reprint available from
http://itp.nat.uni-magdeburg.de/~mertens/papers/complexity.shtm
Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations
Low rank matrix approximation is an important tool in machine learning. Given
a data matrix, low rank approximation helps to find factors and patterns, and
provides concise representations of the data. Research on low rank
approximation usually focuses on real matrices. However, in many applications
data are binary (categorical) rather than continuous. This leads to the problem
of low rank approximation of binary matrices. Here we are given a $d \times n$
binary matrix $A$ and a small integer $k$. The goal is to find two binary
matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so
that the Frobenius norm of $A - UV$ is minimized. There are two models of this
problem, depending on the definition of the dot product of binary vectors: the
$\mathrm{GF}(2)$ model and the Boolean semiring model. Unlike low rank
approximation of real matrices, which can be solved efficiently by Singular Value
Decomposition, approximation of binary matrices is NP-hard even for $k = 1$.
In this paper, we consider the problem of Column Subset Selection (CSS), in
which one low rank matrix must be formed by $k$ columns of the data matrix. We
characterize the approximation ratio of CSS for binary matrices. For the
$\mathrm{GF}(2)$ model, we give a bound on the approximation ratio of CSS and
show that this bound is asymptotically tight. For the
Boolean model, it turns out that CSS is no longer sufficient to obtain a bound.
We then develop a Generalized CSS (GCSS) procedure in which the columns of one
low rank matrix are generated from Boolean formulas operating bitwise on
columns of the data matrix. We give a bound on the approximation ratio of GCSS
that is exponential in $k$, and show that this exponential dependency on $k$ is inherent. Comment: 38 pages
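The two product models and the CSS restriction described in this abstract can be made concrete on a toy example. The following is a minimal sketch, not the paper's algorithm: it stores a binary matrix as a list of columns, tries every choice of k columns by brute force, and counts mismatched entries (which, for 0/1 matrices, equals the squared Frobenius error). The names `combine`, `mismatches`, `css_brute_force`, and the toy matrix are illustrative assumptions.

```python
from itertools import combinations, product


def combine(columns, coeffs, model):
    """Combine columns with 0/1 coefficients under the chosen model.

    model == "gf2": entries are added modulo 2 (XOR).
    any other value: entries are added with logical OR (Boolean semiring).
    """
    out = [0] * len(columns[0])
    for col, c in zip(columns, coeffs):
        if c:
            if model == "gf2":
                out = [o ^ x for o, x in zip(out, col)]
            else:
                out = [o | x for o, x in zip(out, col)]
    return out


def mismatches(a, b):
    """Number of differing entries; for 0/1 vectors this is the squared
    Euclidean distance."""
    return sum(x != y for x, y in zip(a, b))


def css_brute_force(A_cols, k, model):
    """Try every choice of k columns of A as the left factor and return
    (smallest total error, indices of the chosen columns)."""
    best = None
    for subset in combinations(range(len(A_cols)), k):
        chosen = [A_cols[j] for j in subset]
        # All 2^k vectors reachable from the chosen columns.
        reachable = [combine(chosen, c, model) for c in product((0, 1), repeat=k)]
        # Each column of A is approximated by its closest reachable vector.
        err = sum(min(mismatches(col, r) for r in reachable) for col in A_cols)
        if best is None or err < best[0]:
            best = (err, subset)
    return best


# Toy 3 x 3 matrix stored as a list of columns; the third column is the
# XOR of the first two.
A_cols = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(css_brute_force(A_cols, 2, "gf2"))
print(css_brute_force(A_cols, 2, "boolean"))
```

On this toy matrix the two models give different errors for the same column subset, which illustrates why the abstract treats them separately. The exhaustive search is exponential in k and is meant only for very small inputs.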
From-Below Boolean Matrix Factorization Algorithm Based on MDL
During the past few years Boolean matrix factorization (BMF) has become an
important direction in data analysis. The minimum description length principle
(MDL) was successfully adapted to BMF for model order selection.
Nevertheless, a BMF algorithm that performs well with respect to the
standard BMF quality measures is still missing. In this paper, we propose a novel
from-below Boolean matrix factorization algorithm based on formal concept
analysis. The algorithm uses the MDL principle as a criterion for
factor selection. In various experiments we show that the proposed algorithm
outperforms existing state-of-the-art BMF algorithms from several
standpoints.
Algorithms for Approximate Subtropical Matrix Factorization
Matrix factorization methods are important tools in data mining and analysis.
They can be used for many tasks, ranging from dimensionality reduction to
visualization. In this paper we concentrate on the use of matrix factorizations
for finding patterns in the data. Rather than using the standard algebra,
in which the rank-1 components are summed to build the approximation of the
original matrix, we use the subtropical algebra, which is an algebra over the
nonnegative real values with the summation replaced by the maximum operator.
Subtropical matrix factorizations allow "winner-takes-it-all" interpretations
of the rank-1 components, revealing different structure than the normal
(nonnegative) factorizations. We study the complexity and sparsity of the
factorizations, and present a framework for finding low-rank subtropical
factorizations. We present two specific algorithms, called Capricorn and
Cancer, that are part of our framework. They can be used with data that has
been corrupted with different types of noise, and with different error metrics,
including the sum of absolute differences, the Frobenius norm, and the Jensen-Shannon
divergence. Our experiments show that the algorithms perform well on data that
has subtropical structure, and that they can find factorizations that are both
sparse and easy to interpret. Comment: 40 pages, 9 figures. For the associated source code, see
http://people.mpi-inf.mpg.de/~pmiettin/tropical
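To illustrate the subtropical (max-times) algebra mentioned in this abstract, here is a minimal sketch that contrasts the max-times matrix product with the standard one. It is not the Capricorn or Cancer algorithm; the function names and the toy factor matrices are assumptions made for illustration.

```python
def max_times_product(B, C):
    """Subtropical product: entry (i, j) is the maximum over k of B[i][k] * C[k][j]."""
    rank = len(C)
    return [
        [max(B[i][k] * C[k][j] for k in range(rank)) for j in range(len(C[0]))]
        for i in range(len(B))
    ]


def standard_product(B, C):
    """Ordinary product: entry (i, j) is the sum over k of B[i][k] * C[k][j]."""
    rank = len(C)
    return [
        [sum(B[i][k] * C[k][j] for k in range(rank)) for j in range(len(C[0]))]
        for i in range(len(B))
    ]


# Toy nonnegative factors of rank 2 (hypothetical values).
B = [[1, 0], [2, 1], [0, 3]]
C = [[1, 2, 0], [0, 1, 2]]

print(max_times_product(B, C))
print(standard_product(B, C))
```

The two products differ only where both rank-1 components contribute to the same entry: the standard product adds the contributions, while the max-times product keeps only the largest one, which is the "winner-takes-it-all" reading described above.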
New developments in the theory of Groebner bases and applications to formal verification
We present foundational work on standard bases over rings and on Boolean
Groebner bases in the framework of Boolean functions. The research was
motivated by our collaboration with electrical engineers and computer
scientists on problems arising from formal verification of digital circuits. In
fact, algebraic modelling of formal verification problems is developed on the
word-level as well as on the bit-level. The word-level model leads to Groebner
bases in the polynomial ring over $\mathbb{Z}/2^n$, while the bit-level model leads to
Boolean Groebner bases. In addition to the theoretical foundations of both
approaches, the algorithms have been implemented. Using these implementations
we show that special data structures and the exploitation of symmetries make
Groebner bases competitive with state-of-the-art tools from formal verification,
with the added advantage of being systematic and more flexible. Comment: 44 pages, 8 figures, submitted to the Special Issue of the Journal of
Pure and Applied Algebra