Symmetric Sparse Boolean Matrix Factorization and Applications

Abstract

In this work, we study a variant of nonnegative matrix factorization where we wish to find a symmetric factorization of a given input matrix into a sparse, Boolean matrix. Formally speaking, given MZm×m\mathbf{M}\in\mathbb{Z}^{m\times m}, we want to find W{0,1}m×r\mathbf{W}\in\{0,1\}^{m\times r} such that MWW0\| \mathbf{M} - \mathbf{W}\mathbf{W}^\top \|_0 is minimized among all W\mathbf{W} for which each row is kk-sparse. This question turns out to be closely related to a number of questions like recovering a hypergraph from its line graph, as well as reconstruction attacks for private neural network training. As this problem is hard in the worst-case, we study a natural average-case variant that arises in the context of these reconstruction attacks: M=WW\mathbf{M} = \mathbf{W}\mathbf{W}^{\top} for W\mathbf{W} a random Boolean matrix with kk-sparse rows, and the goal is to recover W\mathbf{W} up to column permutation. Equivalently, this can be thought of as recovering a uniformly random kk-uniform hypergraph from its line graph. Our main result is a polynomial-time algorithm for this problem based on bootstrapping higher-order information about W\mathbf{W} and then decomposing an appropriate tensor. The key ingredient in our analysis, which may be of independent interest, is to show that such a matrix W\mathbf{W} has full column rank with high probability as soon as m=Ω~(r)m = \widetilde{\Omega}(r), which we do using tools from Littlewood-Offord theory and estimates for binary Krawtchouk polynomials.Comment: 33 pages, to appear in Innovations in Theoretical Computer Science (ITCS 2022), v2: updated ref

    Similar works