2,968 research outputs found

    Computational Complexity for Physicists

    These lecture notes are an informal introduction to the theory of computational complexity and its links to quantum computing and statistical mechanics. Comment: references updated, reprint available from http://itp.nat.uni-magdeburg.de/~mertens/papers/complexity.shtm

    Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations

    Low rank matrix approximation is an important tool in machine learning. Given a data matrix, low rank approximation helps to find factors and patterns, and provides concise representations of the data. Research on low rank approximation usually focuses on real matrices. However, in many applications data are binary (categorical) rather than continuous. This leads to the problem of low rank approximation of binary matrices. Here we are given a $d \times n$ binary matrix $A$ and a small integer $k$. The goal is to find two binary matrices $U$ and $V$ of sizes $d \times k$ and $k \times n$ respectively, so that the Frobenius norm of $A - UV$ is minimized. There are two models of this problem, depending on the definition of the dot product of binary vectors: the $\mathrm{GF}(2)$ model and the Boolean semiring model. Unlike low rank approximation of real matrices, which can be solved efficiently by Singular Value Decomposition, approximation of binary matrices is NP-hard even for $k=1$. In this paper, we consider the problem of Column Subset Selection (CSS), in which one low rank matrix must be formed by $k$ columns of the data matrix. We characterize the approximation ratio of CSS for binary matrices. For the $\mathrm{GF}(2)$ model, we show the approximation ratio of CSS is bounded by $\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and this bound is asymptotically tight. For the Boolean model, it turns out that CSS is no longer sufficient to obtain a bound. We then develop a Generalized CSS (GCSS) procedure in which the columns of one low rank matrix are generated from Boolean formulas operating bitwise on columns of the data matrix. We show the approximation ratio of GCSS is bounded by $2^{k-1}+1$, and the exponential dependency on $k$ is inherent. Comment: 38 pages
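
    The two product models named in the abstract differ only in how the inner sums are combined. The following is a minimal sketch, not code from the paper: the function names and the small matrices `U` and `V` are hypothetical, chosen only to show an entry where the two models disagree.

```python
def product_gf2(U, V):
    """Matrix product over GF(2): inner sums are taken modulo 2 (XOR)."""
    k, n = len(V), len(V[0])
    return [[sum(row[t] * V[t][j] for t in range(k)) % 2 for j in range(n)]
            for row in U]


def product_boolean(U, V):
    """Matrix product over the Boolean semiring: inner sums are logical OR."""
    k, n = len(V), len(V[0])
    return [[int(any(row[t] and V[t][j] for t in range(k))) for j in range(n)]
            for row in U]


def squared_frobenius_error(A, B):
    """Squared Frobenius norm of A - B; for 0/1 matrices this counts
    the entries in which A and B differ."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Hypothetical example data: U is d x k and V is k x n with d = 3, k = 2, n = 3.
U = [[1, 1], [1, 0], [0, 1]]
V = [[1, 1, 0], [1, 0, 1]]

gf2 = product_gf2(U, V)       # entry (0, 0) is 1 + 1 = 0 over GF(2)
boolean = product_boolean(U, V)  # entry (0, 0) is 1 OR 1 = 1
print(gf2, boolean, squared_frobenius_error(gf2, boolean))
```

    The products differ exactly where two rank-1 components overlap, which is why the same pair of factors can have different approximation error under the two models.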

    From-Below Boolean Matrix Factorization Algorithm Based on MDL

    During the past few years Boolean matrix factorization (BMF) has become an important direction in data analysis. The minimum description length (MDL) principle was successfully adapted in BMF for model order selection. Nevertheless, a BMF algorithm that performs well from the standpoint of the standard measures in BMF is missing. In this paper, we propose a novel from-below Boolean matrix factorization algorithm based on formal concept analysis. The algorithm uses the MDL principle as a criterion for factor selection. In various experiments we show that the proposed algorithm outperforms, from different standpoints, existing state-of-the-art BMF algorithms.

    Algorithms for Approximate Subtropical Matrix Factorization

    Matrix factorization methods are important tools in data mining and analysis. They can be used for many tasks, ranging from dimensionality reduction to visualization. In this paper we concentrate on the use of matrix factorizations for finding patterns in the data. Rather than using the standard algebra, with the summation of the rank-1 components building the approximation of the original matrix, we use the subtropical algebra, which is an algebra over the nonnegative real values with the summation replaced by the maximum operator. Subtropical matrix factorizations allow "winner-takes-it-all" interpretations of the rank-1 components, revealing different structure than the normal (nonnegative) factorizations. We study the complexity and sparsity of the factorizations, and present a framework for finding low-rank subtropical factorizations. We present two specific algorithms, called Capricorn and Cancer, that are part of our framework. They can be used with data that has been corrupted with different types of noise, and with different error metrics, including the sum of absolute differences, Frobenius norm, and Jensen--Shannon divergence. Our experiments show that the algorithms perform well on data that has subtropical structure, and that they can find factorizations that are both sparse and easy to interpret. Comment: 40 pages, 9 figures. For the associated source code, see http://people.mpi-inf.mpg.de/~pmiettin/tropical
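
    To make the subtropical (max-times) product described in the abstract concrete, here is a minimal sketch. It is not the Capricorn or Cancer algorithm and is not taken from the associated source code; the function names and the matrices `B` and `C` are hypothetical illustrations.

```python
def subtropical_product(B, C):
    """Max-times product: entry (i, j) is the maximum over t of B[i][t] * C[t][j].
    All entries are assumed to be nonnegative."""
    k, n = len(C), len(C[0])
    return [[max(row[t] * C[t][j] for t in range(k)) for j in range(n)]
            for row in B]


def standard_product(B, C):
    """Ordinary matrix product: entry (i, j) is the sum over t of B[i][t] * C[t][j]."""
    k, n = len(C), len(C[0])
    return [[sum(row[t] * C[t][j] for t in range(k)) for j in range(n)]
            for row in B]


# Hypothetical nonnegative factors with two rank-1 components.
B = [[1, 0], [2, 3]]
C = [[4, 1], [1, 2]]

print(subtropical_product(B, C))  # each entry comes from a single component
print(standard_product(B, C))     # each entry adds up all components
```

    In the max-times product every entry is produced by exactly one rank-1 component, the largest one at that position, which is the "winner-takes-it-all" reading; in the standard product the components are summed.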

    New developments in the theory of Groebner bases and applications to formal verification

    We present foundational work on standard bases over rings and on Boolean Groebner bases in the framework of Boolean functions. The research was motivated by our collaboration with electrical engineers and computer scientists on problems arising from formal verification of digital circuits. In fact, algebraic modelling of formal verification problems is developed on the word-level as well as on the bit-level. The word-level model leads to Groebner bases in the polynomial ring over $\mathbb{Z}/2^n$, while the bit-level model leads to Boolean Groebner bases. In addition to the theoretical foundations of both approaches, the algorithms have been implemented. Using these implementations we show that special data structures and the exploitation of symmetries make Groebner bases competitive with state-of-the-art tools from formal verification, while having the advantage of being systematic and more flexible. Comment: 44 pages, 8 figures, submitted to the Special Issue of the Journal of Pure and Applied Algebra

    TR-2012003: Root-Finding and Root-Refining for a Polynomial Equation
