
    Rethinking LDA: moment matching for discrete ICA

    We consider moment matching techniques for estimation in Latent Dirichlet Allocation (LDA). By drawing explicit links between LDA and discrete versions of independent component analysis (ICA), we first derive a new set of cumulant-based tensors, with an improved sample complexity. Moreover, we reuse standard ICA techniques such as joint diagonalization of tensors to improve over existing methods based on the tensor power method. In an extensive set of experiments on both synthetic and real datasets, we show that our new combination of tensors and orthogonal joint diagonalization techniques outperforms existing moment matching methods.
    Comment: 30 pages; added plate diagrams and clarifications, changed style, corrected typos, updated figures. In Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS), 201
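    The computational core described above is orthogonal joint diagonalization of a set of moment/cumulant matrices. The sketch below is not the authors' algorithm; it uses a common simplification (diagonalizing a random linear combination, which recovers the shared eigenbasis when the matrices are exactly jointly diagonalizable) purely to illustrate what "joint diagonalization" means here. All names are hypothetical.

```python
import numpy as np

def joint_diagonalizer(matrices, seed=0):
    """Return an orthogonal matrix that (approximately) jointly diagonalizes
    a list of symmetric matrices.

    Simplified stand-in for orthogonal joint diagonalization: when the
    matrices share an exact common eigenbasis, eigendecomposing a random
    linear combination of them recovers that basis.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=len(matrices))
    combo = sum(w * m for w, m in zip(weights, matrices))
    _, V = np.linalg.eigh(combo)    # columns of V are the common eigenvectors
    return V

# Toy check: build matrices that share an eigenbasis and recover it.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))        # shared orthogonal basis
mats = [Q @ np.diag(rng.normal(size=5)) @ Q.T for _ in range(4)]
V = joint_diagonalizer(mats)
residuals = [V.T @ m @ V - np.diag(np.diag(V.T @ m @ V)) for m in mats]
print(max(np.abs(r).max() for r in residuals))      # ~0 up to numerical error
```

    In practice (and in the methods the abstract compares against), the matrices are only approximately jointly diagonalizable, so iterative Jacobi-rotation schemes are used instead of this one-shot shortcut.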

    Fourth Moments and Independent Component Analysis

    In independent component analysis it is assumed that the components of the observed random vector are linear combinations of latent independent random variables, and the aim is then to find an estimate for a transformation matrix back to these independent components. In the engineering literature, there are several traditional estimation procedures based on the use of fourth moments, such as FOBI (fourth order blind identification), JADE (joint approximate diagonalization of eigenmatrices), and FastICA, but the statistical properties of these estimates are not well known. In this paper various independent component functionals based on the fourth moments are discussed in detail, starting with the corresponding optimization problems, deriving the estimating equations and estimation algorithms, and finding asymptotic statistical properties of the estimates. Comparisons of the asymptotic variances of the estimates in wide independent component models show that in most cases JADE and the symmetric version of FastICA perform better than their competitors.
    Comment: Published at http://dx.doi.org/10.1214/15-STS520 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
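    Of the procedures named above, FOBI is the simplest: whiten the data and eigendecompose a matrix of fourth moments. The sketch below is a minimal illustration of that idea under my own naming conventions, not a reference implementation of any of the paper's estimators.

```python
import numpy as np

def fobi(X):
    """Minimal FOBI sketch: estimate an unmixing matrix from fourth moments.

    X : (n_samples, n_features) data matrix, rows are observations.
    Returns W such that the rows of (X - mean) @ W.T are the estimated
    independent components (up to sign, scale, and permutation).
    FOBI needs the sources to have pairwise distinct kurtoses.
    """
    Xc = X - X.mean(axis=0)                       # center
    d, E = np.linalg.eigh(np.cov(Xc, rowvar=False))
    whitener = E @ np.diag(d ** -0.5) @ E.T       # symmetric whitening matrix
    Z = Xc @ whitener.T                           # whitened data
    # Matrix of fourth moments: E[||z||^2 z z^T]
    B = (Z * (Z ** 2).sum(axis=1, keepdims=True)).T @ Z / len(Z)
    _, U = np.linalg.eigh(B)
    return U.T @ whitener                         # unmixing matrix

# Toy demo: mix a sub-Gaussian and a super-Gaussian source, then unmix.
rng = np.random.default_rng(0)
S = np.column_stack([rng.uniform(-1, 1, 10_000),  # uniform: negative kurtosis
                     rng.laplace(size=10_000)])   # Laplace: positive kurtosis
A = np.array([[2.0, 1.0], [1.0, 1.5]])            # mixing matrix
X = S @ A.T
W = fobi(X)
S_hat = (X - X.mean(axis=0)) @ W.T                # recovered up to sign/scale/order
```

    JADE and FastICA replace the single fourth-moment matrix with richer fourth-order statistics, which is what drives the asymptotic-variance comparisons reported in the abstract.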

    OV Graphs Are (Probably) Hard Instances

    © Josh Alman and Virginia Vassilevska Williams. A graph G on n nodes is an Orthogonal Vectors (OV) graph of dimension d if there are vectors v1, ..., vn ∈ {0, 1}^d such that nodes i and j are adjacent in G if and only if ⟨vi, vj⟩ = 0 over Z. In this paper, we study a number of basic graph algorithm problems, except where one is given as input the vectors defining an OV graph instead of a general graph. We show that for each of the following problems, an algorithm solving it faster on such OV graphs G of dimension only d = O(log n) than in the general case would refute a plausible conjecture about the time required to solve sparse MAX-k-SAT instances: (i) determining whether G contains a triangle, and more generally, whether G contains a directed k-cycle for any k ≥ 3; (ii) computing the square of the adjacency matrix of G over Z or F_2; (iii) maintaining the shortest distance between two fixed nodes of G, or whether G has a perfect matching, when G is a dynamically updating OV graph. We also prove some complementary results about OV graphs. We show that any problem which is NP-hard on constant-degree graphs is also NP-hard on OV graphs of dimension O(log n), and we give two problems which can be solved faster on OV graphs than in general: Maximum Clique, and Online Matrix-Vector Multiplication.
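    The OV graph definition is easy to make concrete: given 0/1 vectors, the edges are exactly the orthogonal pairs. The sketch below is my own illustrative code (not from the paper); it builds an OV graph's adjacency matrix and runs a naive triangle check, the first problem in the hardness list above.

```python
import numpy as np
from itertools import combinations

def ov_graph_adjacency(vectors):
    """Adjacency matrix of the OV graph on the given 0/1 vectors.

    Distinct nodes i and j are adjacent iff <v_i, v_j> = 0 over the integers.
    """
    V = np.asarray(vectors)               # shape (n, d), entries in {0, 1}
    gram = V @ V.T                        # integer inner products
    adj = (gram == 0)
    np.fill_diagonal(adj, False)          # no self-loops
    return adj

def has_triangle(adj):
    """Naive O(n^3) triangle check; the paper argues that beating the
    general-case time on OV graphs with d = O(log n) would refute a
    conjecture about sparse MAX-k-SAT."""
    n = adj.shape[0]
    return any(adj[i, j] and adj[j, k] and adj[i, k]
               for i, j, k in combinations(range(n), 3))

# Toy example: three pairwise-orthogonal vectors form a triangle.
vecs = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
print(has_triangle(ov_graph_adjacency(vecs)))    # True
```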