
    Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem

    In this paper, we consider the symmetric multi-type non-negative matrix tri-factorization problem (SNMTF), which attempts to factorize several symmetric non-negative matrices simultaneously. This can be considered a generalization of the classical non-negative matrix tri-factorization problem; it has a non-convex objective function, a multivariate polynomial of degree six, over a convex feasible set. It is of special importance in data science, since it serves as a mathematical model for the fusion of different data sources in data clustering. We develop four methods to solve the SNMTF, based on four theoretical approaches known from the literature: the fixed point method (FPM), block-coordinate descent with projected gradient (BCD), the gradient method with exact line search (GM-ELS), and the adaptive moment estimation method (ADAM). For each of these methods we offer a software implementation: for the former two we use Matlab, and for the latter two, Python with the TensorFlow library. We test these methods on three data-sets: one synthetic data-set that we generated and two that represent real-life similarities between different objects. Extensive numerical results show that, given sufficient computing time, all four methods perform satisfactorily, and ADAM most often yields the best mean square error (MSE). However, if the computation time is limited, FPM gives the best MSE, because it shows the fastest convergence at the beginning. All data-sets and codes are publicly available on our GitLab profile.
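    For a sense of the mechanics, below is a minimal numpy sketch of the projected-gradient idea on the single-matrix special case R ≈ G S G^T. The simultaneous coupling of several matrices, the fixed step size, and all function names here are our assumptions for illustration, not the authors' published code.

```python
import numpy as np

def symmetric_nmtf_pg(R, k, steps=5000, lr=1e-4, seed=0):
    """Minimize ||R - G S G^T||_F^2 over G, S >= 0 by projected gradient.

    Toy sketch of the single-matrix case only; the SNMTF paper couples
    several symmetric matrices and compares four different solvers.
    The step size is chosen conservatively and may need tuning.
    """
    rng = np.random.default_rng(seed)
    n = R.shape[0]
    G = rng.random((n, k))
    S = rng.random((k, k))
    for _ in range(steps):
        E = G @ S @ G.T - R                   # residual
        grad_G = 2 * (E @ G @ S.T + E.T @ G @ S)
        grad_S = 2 * (G.T @ E @ G)
        G = np.maximum(G - lr * grad_G, 0.0)  # gradient step, then project
        S = np.maximum(S - lr * grad_S, 0.0)  # onto the non-negative orthant
    return G, S

# Toy usage on a synthetic symmetric non-negative matrix:
A = np.random.default_rng(1).random((30, 4))
R = A @ A.T
G, S = symmetric_nmtf_pg(R, k=4)
mse = np.mean((R - G @ S @ G.T) ** 2)  # the paper's comparison metric
```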

    Advances in nonnegative matrix factorization with application on data clustering.

    Clustering is an important direction in many fields, e.g., machine learning, data mining, and computer vision. It aims to divide data into groups (clusters) for the purposes of summarization or improved understanding. With the rapid development of new technology, high-dimensional data have become very common in many real-world applications, such as the large numbers of images returned by satellites, real-time video streams received by robots, large-scale text databases, and the mass of information on social networks (e.g., Facebook, Twitter). However, most existing clustering approaches are heavily restricted by the large number of features and tend to be inefficient or even infeasible. In this thesis, we focus on finding an optimal low-dimensional representation of high-dimensional data, based on the nonnegative matrix factorization (NMF) framework, for better clustering. Specifically, there are four methods, as follows:

    - Multiple Components Based Representation Learning. Real data are usually complex and contain various components. For example, face images have expressions and genders. Each component mainly reflects one aspect of the data and provides information the others do not have. Therefore, exploring the semantic information of multiple components, as well as the diversity among them, is of great benefit to understanding data comprehensively and in depth. To this end, we propose a novel multi-component nonnegative matrix factorization (MCNMF). Instead of seeking only one representation of the data, our approach learns multiple representations simultaneously, with the help of the Hilbert-Schmidt Independence Criterion (HSIC) as a diversity term. HSIC explores the diverse information among the representations, where each representation corresponds to one component. By integrating the multiple representations, a more comprehensive representation is then established. Extensive experimental results on real-world datasets have shown that MCNMF not only achieves more accurate performance than the state of the art using the aggregated representation, but also interprets data from different aspects with the multiple representations, which is beyond what current NMFs can offer.

    - Ordered Structure Preserving Representation Learning. Real-world applications often process data with an ordered structure, such as motion sequences and video clips, in which consecutive neighbouring samples are very likely to share similar features unless a sudden change occurs. Traditional NMF assumes the data samples and features to be independently distributed, making it unsuitable for the analysis of such data. To overcome this limitation, a novel NMF approach (ORNMF) is proposed that takes full advantage of the ordered nature embedded in sequential data to improve the accuracy of data representation. With an L2,1-norm based neighbour penalty term, ORNMF enforces the similarity of neighbouring data. ORNMF also adopts an L2,1-norm based loss function to improve its robustness against noises and outliers. Moreover, ORNMF can find the cluster boundaries and obtain the number of clusters without it being given beforehand. A new iterative updating optimization algorithm is derived to solve ORNMF's objective function. Proofs of the convergence and correctness of the scheme are also presented. Experiments on both synthetic and real-world datasets have demonstrated the effectiveness of ORNMF.

    - Diversity Enhanced Multi-view Representation Learning. Multi-view learning aims to explore the correlations of different information, such as different features or modalities, to boost the performance of data analysis. Multi-view data are very common in many real-world applications because data are often collected from diverse domains or obtained from different feature extractors. For example, color and texture information can be utilized as different kinds of features in images and videos, and web pages can be represented using multi-view features based on text and hyperlinks. Taken alone, these views will often be deficient or incomplete because different views describe distinct perspectives of the data. Therefore, we propose a Diverse Multi-view NMF (DiNMF) approach to explore diverse information among multi-view representations for more comprehensive learning. With a novel diversity regularization term, DiNMF explicitly enforces the orthogonality of different data representations. Importantly, DiNMF converges linearly and scales well to large-scale data. By taking the manifold structures into account, we further extend the approach under a graph-based model to preserve the locally geometrical structure of the manifolds in the multi-view setting. Compared to other multi-view NMF methods, the enhanced diversity of both approaches reduces the redundancy between the multi-view representations and improves the accuracy of the clustering results.

    - Constrained Multi-View Representation Learning. To incorporate prior information for accurate learning, we propose a novel semi-supervised multi-view NMF approach, which considers both the label constraints and the multi-view consistency simultaneously. In particular, the approach guarantees that data sharing the same label will have the same new representation and be mapped into the same class in the low-dimensional space, regardless of whether they come from the same view. Moreover, unlike current NMF-based multi-view clustering methods that require the weight factor of each view to be specified individually, we introduce a single parameter to control the distribution of weighting factors for NMF-based multi-view clustering. Consequently, the weight factor of each view can be assigned automatically, depending on the dissimilarity between each new representation matrix and the consensus matrix. Besides, using the structured sparsity-inducing L2,1-norm, our method is robust against noises and can hence achieve more stable clustering results. A sketch of two of these ingredients follows after this list.
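    Two ingredients above have simple closed forms. The numpy sketch below shows an empirical HSIC estimator (linear kernels are our assumption; the abstract does not fix a kernel) and the L2,1-norm that ORNMF uses as a robust loss and neighbour penalty; it is illustrative, not the thesis code.

```python
import numpy as np

def hsic(X, Y):
    """Empirical HSIC between two representations (linear kernels assumed).

    X, Y: (n_samples, dim) matrices. Larger values indicate stronger
    statistical dependence, so penalizing HSIC pushes the learned
    representations apart, as in MCNMF's diversity term.
    """
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kx, Ky = X @ X.T, Y @ Y.T             # linear Gram matrices
    return np.trace(Kx @ H @ Ky @ H) / (n - 1) ** 2

def l21_norm(M):
    """L2,1-norm: sum of the Euclidean norms of the rows of M.

    Penalizing whole rows at once is what makes the L2,1-based loss
    and neighbour penalty less sensitive to outlier samples than the
    Frobenius norm.
    """
    return np.sum(np.linalg.norm(M, axis=1))
```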

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.
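    To make the TT format concrete, here is a minimal numpy sketch of the standard TT-SVD procedure (sequential truncated SVDs over unfoldings). It illustrates the general technique discussed in the monograph rather than reproducing any of its code.

```python
import numpy as np

def tt_svd(X, max_rank):
    """Decompose a dense tensor into tensor-train (TT) cores.

    Returns cores G[k] of shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1,
    so contracting the cores in order reconstructs (approximately) X.
    """
    dims, d = X.shape, X.ndim
    cores, r_prev = [], 1
    C = X.reshape(dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))                     # rank truncation
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

# Quick check: contract the cores back and measure the error.
X = np.random.rand(4, 5, 6, 7)
cores = tt_svd(X, max_rank=20)
T = cores[0]
for G in cores[1:]:
    T = np.tensordot(T, G, axes=1)   # contract the shared rank index
rel_err = np.linalg.norm(T.reshape(X.shape) - X) / np.linalg.norm(X)
```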

    Learning with Attributed Networks: Algorithms and Applications

    Attributes, which delineate the properties of data, and connections, which describe the dependencies among data, are two essential components for characterizing most real-world phenomena. The synergy between these two principal elements renders a unique data representation: the attributed network. In many cases, people are inundated with vast amounts of data that can be structured into attributed networks, and their use has been attractive to researchers and practitioners in different disciplines. For example, in social media, users interact with each other and also post personalized content; in scientific collaboration, researchers cooperate and are distinguished from their peers by unique research interests; in studies of complex diseases, rich gene-expression data complement the gene-regulatory networks. Clearly, attributed networks are ubiquitous and form a critical component of modern information infrastructure. Gaining deep insights from such networks requires a fundamental understanding of their unique characteristics and an awareness of the related computational challenges. My dissertation research aims to develop a suite of novel learning algorithms to understand, characterize, and gain actionable insights from attributed networks, to benefit high-impact real-world applications. In the first part of this dissertation, I mainly focus on developing learning algorithms for attributed networks in a static environment at two different levels: (i) the attribute level, by designing feature selection algorithms to find high-quality features that are tightly correlated with the network topology; and (ii) the node level, by presenting network embedding algorithms to learn discriminative node embeddings that preserve node proximity w.r.t. the network topology and node attribute similarity. As changes are essential components of attributed networks and the results of learning algorithms become stale over time, in the second part of this dissertation I propose a family of online algorithms for attributed networks in a dynamic environment to continuously update the learning results on the fly. In fact, developing application-aware learning algorithms is more desirable with a clear understanding of the application domains and their unique intents. As such, in the third part of this dissertation, I am also committed to advancing real-world applications on attributed networks by incorporating the objectives of external tasks into the learning process.
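    As a purely hypothetical illustration of the node-level idea (not the dissertation's algorithm), one simple way to couple topology and attributes is a joint factorization in which a single embedding matrix U must reconstruct both the adjacency matrix A and the attribute matrix X. All names and the gradient scheme below are assumptions for the sketch.

```python
import numpy as np

def attributed_embedding(A, X, k, alpha=0.5, steps=2000, lr=1e-3, seed=0):
    """Node embeddings from
        min_{U,V} alpha*||A - U U^T||_F^2 + (1 - alpha)*||X - U V^T||_F^2.

    U couples network proximity (A) with attribute similarity (X);
    alpha trades the two objectives off. Plain gradient descent is
    used here for brevity, not speed.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((d, k))
    for _ in range(steps):
        Ea = U @ U.T - A                 # topology residual (A symmetric)
        Ex = U @ V.T - X                 # attribute residual
        U -= lr * (4 * alpha * Ea @ U + 2 * (1 - alpha) * Ex @ V)
        V -= lr * (2 * (1 - alpha) * Ex.T @ U)
    return U

# Toy usage: a random symmetric graph with 8-dimensional node attributes.
rng = np.random.default_rng(1)
A = (rng.random((60, 60)) < 0.1).astype(float)
A = np.maximum(A, A.T)                   # symmetrize
X = rng.random((60, 8))
U = attributed_embedding(A, X, k=4)      # rows of U are node embeddings
```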