54 research outputs found

    Blind Source Separation by Nonnegative Matrix Factorization with Minimum-Volume Constraint

    Nonnegative matrix factorization (NMF) has recently attracted growing attention because of its promise for a wide range of applications. A remaining problem, however, is that the resulting factors are not necessarily interpretable in realistic terms. Constraints are therefore usually added to standard NMF to produce interpretable results. In this paper, a minimum-volume constrained NMF is proposed, and an efficient multiplicative update algorithm is developed based on natural gradient optimization. The proposed method can be applied to the blind source separation (BSS) problem, an active research topic with many potential applications, especially when the sources are mutually dependent. Simulation results of BSS for images show the superiority of the proposed method.
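
For orientation, the multiplicative update scheme that constrained variants typically build on can be sketched as follows. This is the standard unconstrained update for the Frobenius objective, not the minimum-volume algorithm of the paper; the function names, the iteration count, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Standard multiplicative-update NMF for the Frobenius objective.

    Assumption: this is the plain, unconstrained scheme. It does NOT
    include a minimum-volume penalty or natural-gradient step.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled elementwise, so nonnegativity is preserved.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def frobenius_error(V, W, H):
    """Reconstruction error ||V - WH||_F."""
    return np.linalg.norm(V - W @ H)

# Usage on a small synthetic nonnegative matrix of rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))
W, H = nmf_multiplicative(V, rank=3)
print(frobenius_error(V, W, H))
```

A constrained variant would add a penalty term to the objective and modify the update ratios accordingly; how the paper does this is not shown here.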

    Non-negative matrix factorization using posrank-based approximation decompositions

    The present work addresses a particular issue related to the nonnegative factorisation of a matrix (NMF). When NMF is formulated as a nonlinear programming optimisation problem, some algebraic properties concerning the dimensionality of the factorisation become especially important for the numerical solution. Their importance lies in providing a guarantee of good-quality approximations to the solutions of image problems in signal processing. The focus of this work is on the importance of the rank of the factor matrices, and especially of the so-called posrank of the factorisation. We report computational tests that favor the conclusion that the value of the posrank has an important impact on the quality of the images recovered from the decomposition.

    Block Coordinate Descent for Sparse NMF

    Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem, which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L0 norm; however, its optimization is NP-hard. Mixed norms, such as the L1/L2 measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain L1 norm. However, present algorithms designed for optimizing the mixed norm L1/L2 are slow, and other formulations for sparse NMF have been proposed, such as those based on the L1 and L0 norms. Our proposed algorithm allows us to solve the mixed-norm sparsity constraints without sacrificing computation time. We present experimental evidence on real-world datasets showing that our new algorithm is an order of magnitude faster than the current state-of-the-art solvers that optimize the mixed norm, and that it is suitable for large-scale datasets.
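
To make the sparsity measures mentioned above concrete, the sketch below computes the L0 count, the raw L1/L2 ratio, and one common normalisation of that ratio (Hoyer's sparseness, which maps the ratio to the range 0 to 1). The function names are illustrative, and the normalisation is an assumption about which L1/L2 measure is meant; the abstract itself does not state the formula.

```python
import numpy as np

def l0_norm(x):
    """Number of nonzero entries (the L0 'norm')."""
    return int(np.count_nonzero(x))

def l1_l2_ratio(x):
    """Ratio of the L1 norm to the L2 norm; x must not be all zeros."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x)) / np.sqrt(np.sum(x ** 2))

def hoyer_sparseness(x):
    """Hoyer's normalised L1/L2 measure (assumed normalisation).

    Equals 1 for a vector with a single nonzero entry and 0 for a vector
    whose entries all have the same magnitude. Requires len(x) > 1.
    """
    x = np.asarray(x, dtype=float)
    sqrt_n = np.sqrt(x.size)
    return (sqrt_n - l1_l2_ratio(x)) / (sqrt_n - 1.0)

# Usage: a one-hot vector is maximally sparse, a constant vector is not.
one_hot = np.array([0.0, 0.0, 5.0, 0.0])
constant = np.ones(4)
print(l0_norm(one_hot), hoyer_sparseness(one_hot))
print(l0_norm(constant), hoyer_sparseness(constant))
```

Unlike the L0 count, the L1/L2 ratio changes smoothly as entries shrink toward zero and is unaffected by rescaling the vector, which is one reason such mixed measures are used as sparsity surrogates.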

    Matrix Factorization for Learning Metagenomic Pathways and Species

    This work considers learning meaningful sets of chemical reactions called pathways and groups of species called Operational Taxonomical Units (OTUs) from metagenomic data. The methods are based on Nonnegative Matrix Factorization (NMF). The rows of our data matrix correspond to metagenomic samples, and the columns correspond to chemical reactions present in the samples. In order to learn both pathways and OTUs, as well as the relationships between them, we consider ways to factorize the data matrix into three factors instead of two. Denoting the samples-by-reactions data matrix by V, our factorization problem is to find nonnegative matrices W, H and P such that V is approximately WHP. The matrix W indicates which OTUs are present in each of the samples, P defines pathways as combinations of reactions, and H describes which pathways are implemented by which OTUs. We first discuss two standard NMF algorithms based on different objective functions and four sparsity-constrained variants. Sparsity-constrained variants are designed to produce output matrices with few values significantly above zero. We are interested in sparser variants because metagenomic pathways are short, so the method should find a representation in which only a small set of reactions is present in each pathway. We describe how using a standard two-factor NMF method twice yields a three-factor representation. We briefly mention an existing method, Nonnegative Matrix Tri-factorization (NMTF), that learns all three matrices W, H and P simultaneously. However, this method applies hard orthogonality constraints, i.e. it only finds solutions where the matrices W and P are orthogonal. Because of this constraint, NMTF is not suitable for our biological problem setting. We introduce an unconstrained method called NMF3 as well as a sparsity-constrained variant, SNMF3, based on Sparse Nonnegative Matrix Factorization (SNMF), and show how both of these algorithms can be derived.
In order to compare the performance of the different algorithms, we built two synthetic data sets. Both sets are based on human intestinal species and pathway information available in an existing biological database. One of the data matrices can be exactly factorized into the underlying matrices used to generate the data. The other data set is built by simulating a sampling process that introduces noise and strictly limits the number of observed reactions per sample. We tested the factorization methods discussed in the thesis on both data sets, using 100 to 1500 samples. We compare the methods and present and discuss the results. We found differences between NMF variants that use different objective functions. Many methods perform well on our task, surprisingly even when the number of pathways is greater than the number of samples. Varying the number of samples affected the results less than we expected. Instead, we found that all algorithms performed significantly better on the factorizable data than on the simulated set. We conclude that the number of available metagenomic samples does not dramatically affect the performance of the factorization methods. More important is the quality of the samples.
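
The idea of obtaining V ≈ WHP by applying a two-factor NMF twice can be sketched as below. The order of the two steps (first splitting off P, then factorizing the remaining samples-by-pathways matrix) is an assumption, as are the function names and the use of plain multiplicative updates; this is not the thesis's NMF3 or SNMF3 algorithm.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain two-factor NMF (Frobenius objective, multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def three_factor_by_two_nmfs(V, n_otus, n_pathways):
    """Return nonnegative W, H, P with V ~ W @ H @ P.

    Assumed step order (hypothetical, not confirmed by the source):
      1. V (samples x reactions) ~ A @ P, with P of size pathways x reactions
      2. A (samples x pathways)  ~ W @ H, with W samples x OTUs, H OTUs x pathways
    """
    A, P = nmf(V, n_pathways, seed=0)
    W, H = nmf(A, n_otus, seed=1)
    return W, H, P

# Usage on synthetic data: 30 samples, 20 reactions, 4 OTUs, 5 pathways.
rng = np.random.default_rng(2)
V = rng.random((30, 4)) @ rng.random((4, 5)) @ rng.random((5, 20))
W, H, P = three_factor_by_two_nmfs(V, n_otus=4, n_pathways=5)
print(W.shape, H.shape, P.shape)
print(np.linalg.norm(V - W @ H @ P) / np.linalg.norm(V))
```

Because each step is fitted separately, errors from the first factorization carry into the second; a joint method that updates all three matrices together avoids this at the cost of a more involved derivation.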

    Ampullary cancers harbor ELF3 tumor suppressor gene mutations and exhibit frequent WNT dysregulation

    The ampulla of Vater is a complex cellular environment from which adenocarcinomas arise to form a group of histopathologically heterogeneous tumors. To evaluate the molecular features of these tumors, 98 ampullary adenocarcinomas were evaluated and compared to 44 distal bile duct and 18 duodenal adenocarcinomas. Genomic analyses revealed mutations in the WNT signaling pathway among half of the patients and in all three adenocarcinomas, irrespective of their origin and histological morphology. These tumors were characterized by a high frequency of inactivating mutations of ELF3, a high rate of microsatellite instability, and common focal deletions and amplifications, suggesting that common attributes in the molecular pathogenesis are at play in these tumors. The high frequency of WNT pathway activating mutations, coupled with small-molecule inhibitors of β-catenin in clinical trials, suggests that future treatment decisions for these patients may be guided by genomic analysis.