361 research outputs found

    Non-negative mixtures

    Get PDF
    This is the author's accepted pre-print of the article, first published as M. D. Plumbley, A. Cichocki and R. Bro. Non-negative mixtures. In P. Comon and C. Jutten (Ed), Handbook of Blind Source Separation: Independent Component Analysis and Applications. Chapter 13, pp. 515-547. Academic Press, Feb 2010. ISBN 978-0-12-374726-6 DOI: 10.1016/B978-0-12-374726-6.00018-7file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.26file: Proof:p\PlumbleyCichockiBro10-non-negative.pdf:PDF owner: markp timestamp: 2011.04.2

    BROCCOLI: overlapping and outlier-robust biclustering through proximal stochastic gradient descent

    Get PDF
    Matrix tri-factorization subject to binary constraints is a versatile and powerful framework for the simultaneous clustering of observations and features, also known as biclustering. Applications for biclustering encompass the clustering of high-dimensional data and explorative data mining, where the selection of the most important features is relevant. Unfortunately, due to the lack of suitable methods for the optimization subject to binary constraints, the powerful framework of biclustering is typically constrained to clusterings which partition the set of observations or features. As a result, overlap between clusters cannot be modelled and every item, even outliers in the data, have to be assigned to exactly one cluster. In this paper we propose Broccoli, an optimization scheme for matrix factorization subject to binary constraints, which is based on the theoretically well-founded optimization scheme of proximal stochastic gradient descent. Thereby, we do not impose any restrictions on the obtained clusters. Our experimental evaluation, performed on both synthetic and real-world data, and against 6 competitor algorithms, show reliable and competitive performance, even in presence of a high amount of noise in the data. Moreover, a qualitative analysis of the identified clusters shows that Broccoli may provide meaningful and interpretable clustering structures

    Integration of molecular network data reconstructs Gene Ontology.

    Get PDF
    Motivation: Recently, a shift was made from using Gene Ontology (GO) to evaluate molecular network data to using these data to construct and evaluate GO. Dutkowski et al. provide the first evidence that a large part of GO can be reconstructed solely from topologies of molecular networks. Motivated by this work, we develop a novel data integration framework that integrates multiple types of molecular network data to reconstruct and update GO. We ask how much of GO can be recovered by integrating various molecular interaction data. Results: We introduce a computational framework for integration of various biological networks using penalized non-negative matrix tri-factorization (PNMTF). It takes all network data in a matrix form and performs simultaneous clustering of genes and GO terms, inducing new relations between genes and GO terms (annotations) and between GO terms themselves. To improve the accuracy of our predicted relations, we extend the integration methodology to include additional topological information represented as the similarity in wiring around non-interacting genes. Surprisingly, by integrating topologies of bakers’ yeasts protein–protein interaction, genetic interaction (GI) and co-expression networks, our method reports as related 96% of GO terms that are directly related in GO. The inclusion of the wiring similarity of non-interacting genes contributes 6% to this large GO term association capture. Furthermore, we use our method to infer new relationships between GO terms solely from the topologies of these networks and validate 44% of our predictions in the literature. In addition, our integration method reproduces 48% of cellular component, 41% of molecular function and 41% of biological process GO terms, outperforming the previous method in the former two domains of GO. Finally, we predict new GO annotations of yeast genes and validate our predictions through GIs profiling. Availability and implementation: Supplementary Tables of new GO term associations and predicted gene annotations are available at http://bio-nets.doc.ic.ac.uk/GO-Reconstruction/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online
    • …
    corecore