1,077 research outputs found

    A deep matrix factorization method for learning attribute representations

    Get PDF
    Semi-Non-negative Matrix Factorization is a technique that learns a low-dimensional representation of a dataset that lends itself to a clustering interpretation. It is possible that the mapping between this new representation and our original data matrix contains rather complex hierarchical information with implicit lower-level hidden attributes, that classical one level clustering methodologies can not interpret. In this work we propose a novel model, Deep Semi-NMF, that is able to learn such hidden representations that allow themselves to an interpretation of clustering according to different, unknown attributes of a given dataset. We also present a semi-supervised version of the algorithm, named Deep WSF, that allows the use of (partial) prior information for each of the known attributes of a dataset, that allows the model to be used on datasets with mixed attribute knowledge. Finally, we show that our models are able to learn low-dimensional representations that are better suited for clustering, but also classification, outperforming Semi-Non-negative Matrix Factorization, but also other state-of-the-art methodologies variants.Comment: Submitted to TPAMI (16-Mar-2015

    Is Simple Better? Revisiting Non-linear Matrix Factorization for Learning Incomplete Ratings

    Full text link
    Matrix factorization techniques have been widely used as a method for collaborative filtering for recommender systems. In recent times, different variants of deep learning algorithms have been explored in this setting to improve the task of making a personalized recommendation with user-item interaction data. The idea that the mapping between the latent user or item factors and the original features is highly nonlinear suggest that classical matrix factorization techniques are no longer sufficient. In this paper, we propose a multilayer nonlinear semi-nonnegative matrix factorization method, with the motivation that user-item interactions can be modeled more accurately using a linear combination of non-linear item features. Firstly, we learn latent factors for representations of users and items from the designed multilayer nonlinear Semi-NMF approach using explicit ratings. Secondly, the architecture built is compared with deep-learning algorithms like Restricted Boltzmann Machine and state-of-the-art Deep Matrix factorization techniques. By using both supervised rate prediction task and unsupervised clustering in latent item space, we demonstrate that our proposed approach achieves better generalization ability in prediction as well as comparable representation ability as deep matrix factorization in the clustering task.Comment: version

    Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning

    Full text link
    This paper proposes an effective modelling of sound event spectra with a hidden data-size-imbalance, for improved Acoustic Event Detection (AED). The proposed method models each event as an aggregated representation of a few latent factors, while conventional approaches try to find acoustic elements directly from the event spectra. In the method, all the latent factors across all events are assigned comparable importance and complexity to overcome the hidden imbalance of data-sizes in event spectra. To extract latent factors in each event, the proposed method employs clustering and performs non-negative matrix factorization to each latent factor, and learns its acoustic elements as a sub-dictionary. Separate sub-dictionary learning effectively models the acoustic elements with limited data-sizes and avoids over-fitting due to hidden imbalances in training data. For the task of polyphonic sound event detection from DCASE 2013 challenge, an AED based on the proposed modelling achieves a detection F-measure of 46.5%, a significant improvement of more than 19% as compared to the existing state-of-the-art methods

    Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders

    Full text link
    We show that a Modular Neural Network (MNN) can combine various speech enhancement modules, each of which is a Deep Neural Network (DNN) specialized on a particular enhancement job. Differently from an ordinary ensemble technique that averages variations in models, the propose MNN selects the best module for the unseen test signal to produce a greedy ensemble. We see this as Collaborative Deep Learning (CDL), because it can reuse various already-trained DNN models without any further refining. In the proposed MNN selecting the best module during run time is challenging. To this end, we employ a speech AutoEncoder (AE) as an arbitrator, whose input and output are trained to be as similar as possible if its input is clean speech. Therefore, the AE can gauge the quality of the module-specific denoised result by seeing its AE reconstruction error, e.g. low error means that the module output is similar to clean speech. We propose an MNN structure with various modules that are specialized on dealing with a specific noise type, gender, and input Signal-to-Noise Ratio (SNR) value, and empirically prove that it almost always works better than an arbitrarily chosen DNN module and sometimes as good as an oracle result
    • …
    corecore