268 research outputs found

    Graph Regularized Nonnegative Matrix Factorization with Sparse Coding

    In this paper, we propose a sparseness-constrained NMF method, named graph regularized nonnegative matrix factorization with sparse coding (GRNMF_SC). By combining manifold learning and sparse coding techniques, GRNMF_SC can efficiently extract basis vectors from the data space that preserve both the intrinsic manifold structure and the local features of the original data. The objective function of our method is easy to state, but solving it is nontrivial; we give a detailed derivation of the solution and a strict proof of its convergence, which is a key contribution of the paper. Compared with sparseness-constrained NMF and GNMF algorithms, GRNMF_SC learns a much sparser representation of the data while also preserving its geometrical structure, which endows it with powerful discriminating ability. Furthermore, GRNMF_SC is generalized into supervised and unsupervised models to meet different demands. Experiments show encouraging results for GRNMF_SC on image recognition and clustering in comparison with other state-of-the-art NMF methods.
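The abstract above does not give the GRNMF_SC update rules, so the following is only a minimal sketch of the plain graph regularized NMF (GNMF) baseline it compares against, using the commonly cited multiplicative updates. The sparse coding term of GRNMF_SC is deliberately omitted, and the function name `gnmf`, the ring-graph affinity matrix, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def gnmf(X, W, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Plain GNMF via multiplicative updates (illustrative, not GRNMF_SC).

    X : (m, n) nonnegative data matrix, one sample per column.
    W : (n, n) symmetric nonnegative affinity matrix over the samples.
    Returns U (m, k), V (n, k) with X ~= U @ V.T, plus the objective history.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    V = rng.random((n, k)) + eps
    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # graph Laplacian
    history = []
    for _ in range(n_iter):
        # Update the basis, then the coefficients; both stay nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        # Reconstruction error plus the manifold smoothness penalty.
        obj = np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
        history.append(obj)
    return U, V, history

# Toy data: 30 random samples linked in a ring graph (hypothetical example).
data_rng = np.random.default_rng(1)
X = data_rng.random((20, 30))
W = np.zeros((30, 30))
for i in range(30):
    W[i, (i + 1) % 30] = W[(i + 1) % 30, i] = 1.0

U, V, history = gnmf(X, W, k=4)
```

The graph term `lam * trace(V.T @ L @ V)` penalizes neighbouring samples whose low-dimensional representations differ, which is the manifold-preserving ingredient the abstract refers to.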

    Dictionary Learning-Based Speech Enhancement


    Microbial community pattern detection in human body habitats via ensemble clustering framework

    The human habitat is a host where microbial species evolve, function, and continue to evolve. Elucidating how microbial communities respond to human habitats is a fundamental and critical task, as establishing baselines of the human microbiome is essential to understanding its role in human disease and health. However, current studies usually overlook the complex and interconnected landscape of the human microbiome and are limited to particular body habitats, with learning models based on a specific criterion. Therefore, these methods cannot effectively capture the real-world underlying microbial patterns. To obtain a comprehensive view, we propose a novel ensemble clustering framework to mine the structure of microbial community patterns in large-scale metagenomic data. In particular, we first build a microbial similarity network by integrating 1920 metagenomic samples from three body habitats of healthy adults. Then a novel symmetric Nonnegative Matrix Factorization (NMF) based ensemble model is proposed and applied to the network to detect clustering patterns. Extensive experiments are conducted to evaluate the effectiveness of our model in deriving microbial communities with respect to body habitat and host gender. From the clustering results, we observed that body habitat exhibits strongly bound but non-unique microbial structural patterns. Meanwhile, the human microbiome reveals different degrees of structural variation over body habitat and host gender. In summary, our ensemble clustering framework can efficiently explore integrated clustering results to accurately identify microbial communities, and provides a comprehensive view of a set of microbial communities. Such trends depict an integrated biography of microbial communities, which offers new insight towards uncovering pathogenic models of the human microbiome. Comment: BMC Systems Biology 201
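As a rough illustration of the building block named in the abstract above, here is a sketch of plain symmetric NMF on a similarity matrix, using a damped multiplicative update. It is not the authors' ensemble model: the function name `symnmf`, the damping factor `beta`, and the two-block toy similarity matrix are all assumptions.

```python
import numpy as np

def symnmf(A, k, n_iter=300, beta=0.5, eps=1e-10, seed=0):
    """Symmetric NMF: find nonnegative H (n, k) with A ~= H @ H.T.

    A : (n, n) symmetric nonnegative similarity matrix.
    Returns H and the squared reconstruction error after each iteration.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))
    errors = []
    for _ in range(n_iter):
        # Damped multiplicative update; beta = 0.5 is a common choice.
        H *= (1.0 - beta) + beta * (A @ H) / (H @ (H.T @ H) + eps)
        errors.append(np.linalg.norm(A - H @ H.T) ** 2)
    return H, errors

# Toy similarity network: two groups of five nodes, weakly linked across groups.
A = np.full((10, 10), 0.05)
A[:5, :5] = 1.0
A[5:, 5:] = 1.0

H, errors = symnmf(A, k=2)
labels = H.argmax(axis=1)  # assign each node to its dominant factor
```

Reading cluster labels off the largest entry in each row of `H` is one simple convention; an ensemble approach would combine several such factorizations rather than rely on a single run.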

    NON-MATRIX FACTORIZATION FOR BLIND IMAGE SEPARATION

    Hyperspectral unmixing is the process of identifying the constituent materials in a mixture and estimating their corresponding fractions. Nonnegative matrix factorization (NMF), a suitable candidate for the linear spectral mixture model, has been applied to unmixing hyperspectral data. Unfortunately, the local minima caused by the nonconvexity of the objective function make the solution nonunique, so the nonnegativity constraint alone is not sufficient to yield a well-defined problem. Therefore, two inherent characteristics of hyperspectral data, piecewise smoothness (both temporal and spatial) of spectral data and sparseness of the abundance fraction of every material, are introduced into NMF. The adaptive potential function from the discontinuity adaptive Markov random field model is used to describe the smoothness constraint while preserving discontinuities in spectral data. At the same time, two NMF algorithms, nonsmooth NMF and NMF with sparseness constraint, are used to quantify the degree of sparseness of material abundances. Experiments using synthetic and real data demonstrate that the proposed algorithms provide an effective unsupervised technique for hyperspectral unmixing.

    Subspace Structure Regularized Nonnegative Matrix Factorization for Hyperspectral Unmixing

    Hyperspectral unmixing is a crucial task in hyperspectral image (HSI) processing, which estimates the proportions of the constituent materials of a mixed pixel. Usually, the mixed pixels can be approximated using a linear mixing model. Since each material only occurs in a few pixels in real HSI, sparse nonnegative matrix factorization (NMF) and its extensions are widely used as solutions. Some recent works assume that materials are distributed in certain structures, which can be added as constraints to the sparse NMF model. However, they only consider the spatial distribution within a local neighborhood and define the distribution structure manually, while ignoring the real distribution of materials, which differs from image to image. In this paper, we propose a new unmixing method that learns a subspace structure from the original image and incorporates it into the sparse NMF framework to improve unmixing performance. Based on the self-representation property of data points lying in the same subspace, the learned subspace structure indicates a global similarity graph of pixels that represents the real distribution of materials. The similarity graph is then used as a robust global spatial prior that is expected to be maintained in the decomposed abundance matrix. Experiments conducted on both simulated and real-world HSI datasets demonstrate the superior performance of our proposed method.

    Advances in nonnegative matrix factorization with application on data clustering.

    Clustering is an important direction in many fields, e.g., machine learning, data mining and computer vision. It aims to divide data into groups (clusters) for the purposes of summarization or improved understanding. With the rapid development of new technology, high-dimensional data have become very common in many real-world applications, such as large numbers of images returned by satellites, real-time video streams received by robots, large-scale text databases and the mass of information on social networks (e.g., Facebook, Twitter). However, most existing clustering approaches are heavily restricted by the large number of features, and tend to be inefficient or even infeasible. In this thesis, we focus on finding an optimal low-dimensional representation of high-dimensional data, based on the nonnegative matrix factorization (NMF) framework, for better clustering. Specifically, we propose the following methods:
    - Multiple Components Based Representation Learning: Real data are usually complex and contain various components. For example, face images have expressions and genders. Each component mainly reflects one aspect of the data and provides information the others do not have. Therefore, exploring the semantic information of multiple components as well as the diversity among them is of great benefit for understanding data comprehensively and in depth. To this end, we propose a novel multi-component nonnegative matrix factorization (MCNMF). Instead of seeking only one representation of the data, our approach learns multiple representations simultaneously, with the help of the Hilbert Schmidt Independence Criterion (HSIC) as a diversity term. HSIC explores the diverse information among the representations, where each representation corresponds to a component. By integrating the multiple representations, a more comprehensive representation is then established.
    Extensive experimental results on real-world datasets have shown that MCNMF not only achieves more accurate performance than the state of the art using the aggregated representation, but also interprets data from different aspects with the multiple representations, which is beyond what current NMFs can offer.
    - Ordered Structure Preserving Representation Learning: Real-world applications often process data with an ordered structure, such as motion sequences and video clips, in which consecutive neighbouring data samples are very likely to share similar features unless a sudden change occurs. However, traditional NMF assumes the data samples and features to be independently distributed, making it unsuitable for the analysis of such data. To overcome this limitation, a novel NMF approach, ORNMF, is proposed to take full advantage of the ordered nature embedded in sequential data to improve the accuracy of data representation. With an L2,1-norm based neighbour penalty term, ORNMF enforces the similarity of neighbouring data. ORNMF also adopts an L2,1-norm based loss function to improve its robustness against noise and outliers. Moreover, ORNMF can find the cluster boundaries and determine the number of clusters without it being given beforehand. A new iterative updating optimization algorithm is derived to solve ORNMF's objective function. Proofs of the convergence and correctness of the scheme are also presented. Experiments on both synthetic and real-world datasets have demonstrated the effectiveness of ORNMF.
    - Diversity Enhanced Multi-view Representation Learning: Multi-view learning aims to explore the correlations of different information, such as different features or modalities, to boost the performance of data analysis. Multi-view data are very common in many real-world applications because data are often collected from diverse domains or obtained from different feature extractors.
    For example, color and texture information can be utilized as different kinds of features in images and videos. Web pages can also be represented using multi-view features based on text and hyperlinks. Taken alone, these views will often be deficient or incomplete because different views describe distinct perspectives of the data. Therefore, we propose a Diverse Multi-view NMF approach (DiNMF) to explore diverse information among multi-view representations for more comprehensive learning. With a novel diversity regularization term, DiNMF explicitly enforces the orthogonality of different data representations. Importantly, DiNMF converges linearly and scales well with large-scale data. By taking into account the manifold structures, we further extend the approach under a graph-based model to preserve the local geometrical structure of the manifolds in the multi-view setting. Compared to other multi-view NMF methods, the enhanced diversity of both approaches reduces the redundancy between the multi-view representations and improves the accuracy of the clustering results.
    - Constrained Multi-View Representation Learning: To incorporate prior information for accurate learning, we propose a novel semi-supervised multi-view NMF approach, which considers both the label constraints and the multi-view consistency simultaneously. In particular, the approach guarantees that data sharing the same label will have the same new representation and be mapped into the same class in the low-dimensional space, regardless of whether they come from the same view. Moreover, unlike current NMF-based multi-view clustering methods that require the weight factor of each view to be specified individually, we introduce a single parameter to control the distribution of weighting factors for NMF-based multi-view clustering.
    Consequently, the weight factor of each view can be assigned automatically depending on the dissimilarity between each new representation matrix and the consensus matrix. In addition, by using the structured sparsity-inducing L2,1-norm, our method is robust against noise and hence achieves more stable clustering results.
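Several of the methods in the abstract above rely on the L2,1-norm for robustness. As a small hedged illustration, the sketch below computes it as the sum of the Euclidean norms of the rows of a matrix; some papers sum over columns instead, so the row convention, the function names, and the toy residual matrix are assumptions.

```python
import math

def l21_norm(M):
    """L2,1-norm of a matrix given as a list of rows (row-wise convention)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in M)

def frobenius_sq(M):
    """Squared Frobenius norm, the usual least-squares NMF loss."""
    return sum(x * x for row in M for x in row)

# Hypothetical residual matrix: two small rows and one outlier row.
residual = [[0.1, -0.1], [0.2, 0.0], [30.0, 40.0]]

outlier_l21 = l21_norm([residual[2]])         # outlier row contributes its norm
outlier_frob = frobenius_sq([residual[2]])    # versus its squared norm
```

Because each row enters through its unsquared norm, a single corrupted sample grows the L2,1 loss linearly rather than quadratically, which is the usual argument for its robustness to outliers.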

    Outlier-Resilient Web Service QoS Prediction

    The proliferation of Web services makes it difficult for users to select the most appropriate one among numerous functionally identical or similar service candidates. Quality-of-Service (QoS) describes the non-functional characteristics of Web services, and it has become the key differentiator for service selection. However, users cannot invoke all Web services to obtain the corresponding QoS values due to high time cost and huge resource overhead. Thus, it is essential to predict unknown QoS values. Although various QoS prediction methods have been proposed, few of them have taken outliers into consideration, which may dramatically degrade the prediction performance. To overcome this limitation, we propose an outlier-resilient QoS prediction method in this paper. Our method utilizes Cauchy loss to measure the discrepancy between the observed QoS values and the predicted ones. Owing to the robustness of Cauchy loss, our method is resilient to outliers. We further extend our method to provide time-aware QoS prediction results by taking the temporal information into consideration. Finally, we conduct extensive experiments on both static and dynamic datasets. The results demonstrate that our method is able to achieve better performance than state-of-the-art baseline methods. Comment: 12 pages, to appear at the Web Conference (WWW) 202
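To make the robustness claim in the abstract above concrete, here is a minimal sketch comparing a Cauchy loss with the squared loss on a single residual. The exact form and scaling used in the paper are not given in the abstract, so the formula `log(1 + (r / gamma)^2)`, the scale parameter `gamma`, and the function names are assumptions.

```python
import math

def cauchy_loss(residual, gamma=1.0):
    """One common form of the Cauchy loss: log(1 + (r / gamma)^2)."""
    return math.log(1.0 + (residual / gamma) ** 2)

def cauchy_grad(residual, gamma=1.0):
    """Derivative of the loss above with respect to the residual."""
    return 2.0 * residual / (gamma ** 2 + residual ** 2)

def squared_loss(residual):
    """Ordinary squared error, for comparison."""
    return residual ** 2

# A typical residual and a hypothetical outlier residual.
small, outlier = 0.5, 100.0
ratio_cauchy = cauchy_loss(outlier) / cauchy_loss(small)
ratio_squared = squared_loss(outlier) / squared_loss(small)
```

The loss grows only logarithmically and its gradient shrinks toward zero for large residuals, so an extreme observation pulls on the fitted model far less than it would under squared error.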