535 research outputs found

    Generalized Separable Nonnegative Matrix Factorization

    Full text link
    Nonnegative matrix factorization (NMF) is a linear dimensionality technique for nonnegative data with applications such as image analysis, text mining, audio source separation and hyperspectral unmixing. Given a data matrix MM and a factorization rank rr, NMF looks for a nonnegative matrix WW with rr columns and a nonnegative matrix HH with rr rows such that M≈WHM \approx WH. NMF is NP-hard to solve in general. However, it can be computed efficiently under the separability assumption which requires that the basis vectors appear as data points, that is, that there exists an index set K\mathcal{K} such that W=M(:,K)W = M(:,\mathcal{K}). In this paper, we generalize the separability assumption: We only require that for each rank-one factor W(:,k)H(k,:)W(:,k)H(k,:) for k=1,2,…,rk=1,2,\dots,r, either W(:,k)=M(:,j)W(:,k) = M(:,j) for some jj or H(k,:)=M(i,:)H(k,:) = M(i,:) for some ii. We refer to the corresponding problem as generalized separable NMF (GS-NMF). We discuss some properties of GS-NMF and propose a convex optimization model which we solve using a fast gradient method. We also propose a heuristic algorithm inspired by the successive projection algorithm. To verify the effectiveness of our methods, we compare them with several state-of-the-art separable NMF algorithms on synthetic, document and image data sets.Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the identifiability of the model, we have modified the first synthetic experiment, we have clarified some aspects of the contributio

    Document Clustering Based On Max-Correntropy Non-Negative Matrix Factorization

    Full text link
    Nonnegative matrix factorization (NMF) has been successfully applied to many areas for classification and clustering. Commonly-used NMF algorithms mainly target on minimizing the l2l_2 distance or Kullback-Leibler (KL) divergence, which may not be suitable for nonlinear case. In this paper, we propose a new decomposition method by maximizing the correntropy between the original and the product of two low-rank matrices for document clustering. This method also allows us to learn the new basis vectors of the semantic feature space from the data. To our knowledge, we haven't seen any work has been done by maximizing correntropy in NMF to cluster high dimensional document data. Our experiment results show the supremacy of our proposed method over other variants of NMF algorithm on Reuters21578 and TDT2 databasets.Comment: International Conference of Machine Learning and Cybernetics (ICMLC) 201
    • …
    corecore