793 research outputs found

    Adaptive Graph via Multiple Kernel Learning for Nonnegative Matrix Factorization

    Full text link
    Nonnegative Matrix Factorization (NMF) has been continuously evolving in several areas like pattern recognition and information retrieval methods. It factorizes a matrix into a product of 2 low-rank non-negative matrices that will define parts-based, and linear representation of nonnegative data. Recently, Graph regularized NMF (GrNMF) is proposed to find a compact representation,which uncovers the hidden semantics and simultaneously respects the intrinsic geometric structure. In GNMF, an affinity graph is constructed from the original data space to encode the geometrical information. In this paper, we propose a novel idea which engages a Multiple Kernel Learning approach into refining the graph structure that reflects the factorization of the matrix and the new data space. The GrNMF is improved by utilizing the graph refined by the kernel learning, and then a novel kernel learning method is introduced under the GrNMF framework. Our approach shows encouraging results of the proposed algorithm in comparison to the state-of-the-art clustering algorithms like NMF, GrNMF, SVD etc.Comment: This paper has been withdrawn by the author due to the terrible writin

    A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition

    Get PDF
    We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.5 page(s

    The intrinsic dimension of protein sequence evolution

    Get PDF
    It is well known that, in order to preserve its structure and function, a protein cannot change its sequence at random, but only by mutations occurring preferentially at specific locations. We here investigate quantitatively the amount of variability that is allowed in protein sequence evolution, by computing the intrinsic dimension (ID) of the sequences belonging to a selection of protein families. The ID is a measure of the number of independent directions that evolution can take starting from a given sequence. We find that the ID is practically constant for sequences belonging to the same family, and moreover it is very similar in different families, with values ranging between 6 and 12. These values are significantly smaller than the raw number of amino acids, confirming the importance of correlations between mutations in different sites. However, we demonstrate that correlations are not sufficient to explain the small value of the ID we observe in protein families. Indeed, we show that the ID of a set of protein sequences generated by maximum entropy models, an approach in which correlations are accounted for, is typically significantly larger than the value observed in natural protein families. We further prove that a critical factor to reproduce the natural ID is to take into consideration the phylogeny of sequences

    State of the Art in Face Recognition

    Get PDF
    Notwithstanding the tremendous effort to solve the face recognition problem, it is not possible yet to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. Even new knowledge and perspectives from different fields like, psychology and neuroscience must be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, many more efforts are required to end up with a human like face recognition system. This book tries to make an effort to reduce the gap between the previous face recognition research state and the future state

    A Wasserstein distance-based spectral clustering method for transaction data analysis

    Full text link
    With the rapid development of online payment platforms, it is now possible to record massive transaction data. Clustering on transaction data significantly contributes to analyzing merchants' behavior patterns. This enables payment platforms to provide differentiated services or implement risk management strategies. However, traditional methods exploit transactions by generating low-dimensional features, leading to inevitable information loss. In this study, we use the empirical cumulative distribution of transactions to characterize merchants. We adopt Wasserstein distance to measure the dissimilarity between any two merchants and propose the Wasserstein-distance-based spectral clustering (WSC) approach. Based on the similarities between merchants' transaction distributions, a graph of merchants is generated. Thus, we treat the clustering of merchants as a graph-cut problem and solve it under the framework of spectral clustering. To ensure feasibility of the proposed method on large-scale datasets with limited computational resources, we propose a subsampling method for WSC (SubWSC). The associated theoretical properties are investigated to verify the efficiency of the proposed approach. The simulations and empirical study demonstrate that the proposed method outperforms feature-based methods in finding behavior patterns of merchants
    corecore