327 research outputs found

    A deep matrix factorization method for learning attribute representations

    Semi-Non-negative Matrix Factorization (Semi-NMF) is a technique that learns a low-dimensional representation of a dataset that lends itself to a clustering interpretation. The mapping between this new representation and the original data matrix may contain rather complex hierarchical information with implicit lower-level hidden attributes that classical one-level clustering methodologies cannot interpret. In this work we propose a novel model, Deep Semi-NMF, that is able to learn such hidden representations, which lend themselves to a clustering interpretation according to different, unknown attributes of a given dataset. We also present a semi-supervised version of the algorithm, named Deep WSF, that allows the use of (partial) prior information for each of the known attributes of a dataset, so that the model can be used on datasets with mixed attribute knowledge. Finally, we show that our models are able to learn low-dimensional representations that are better suited not only for clustering but also for classification, outperforming Semi-Non-negative Matrix Factorization as well as other state-of-the-art methodologies and variants.
    Comment: Submitted to TPAMI (16-Mar-2015)
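To make the single-layer building block concrete, here is a minimal NumPy sketch of plain Semi-NMF, assuming the standard alternating updates (least squares for the unconstrained factor, a multiplicative rule for the non-negative one). It is not the Deep Semi-NMF or Deep WSF model from the abstract; the function names `semi_nmf`, `pos` and `neg`, the random initialisation, and the default iteration count are illustrative assumptions.

```python
import numpy as np


def pos(a):
    """Element-wise positive part: (|a| + a) / 2."""
    return (np.abs(a) + a) / 2.0


def neg(a):
    """Element-wise negative part, returned as a non-negative array: (|a| - a) / 2."""
    return (np.abs(a) - a) / 2.0


def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (d x n, mixed sign) as F @ G.T with G (n x k) non-negative.

    F plays the role of cluster centroids and G the role of soft cluster
    indicators, which is what gives the factorization its clustering reading.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Assumption: simple positive random initialisation of G.
    G = rng.random((n, k)) + 0.1
    for _ in range(n_iter):
        # F has no sign constraint, so it is the least-squares solution given G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF = X.T @ F
        FtF = F.T @ F
        # Multiplicative update keeps G non-negative at every step.
        numerator = pos(XtF) + G @ neg(FtF)
        denominator = neg(XtF) + G @ pos(FtF)
        G *= np.sqrt((numerator + eps) / (denominator + eps))
    # Refit F so that it is optimal for the final G.
    F = X @ G @ np.linalg.pinv(G.T @ G)
    return F, G
```

Calling `semi_nmf(X, k)` on a mixed-sign data matrix returns the two factors; taking `G.argmax(axis=1)` would give one hard cluster label per column of `X`. A deep variant would, roughly speaking, factorize the unconstrained factor further, but that extension is not shown here.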

    Facial expression recognition using shape and texture information

    A novel method based on shape and texture information is proposed in this paper for facial expression recognition from video sequences. The Discriminant Non-negative Matrix Factorization (DNMF) algorithm is applied to the image corresponding to the greatest intensity of the facial expression (the last frame of the video sequence), thereby extracting the texture information. A Support Vector Machines (SVMs) system is used for the classification of the shape information derived from tracking the Candide grid over the video sequence. The shape information consists of the differences of the node coordinates between the first (neutral) and last (fully expressed facial expression) video frame. Subsequently, fusion of the texture and shape information obtained is performed using Radial Basis Function (RBF) Neural Networks (NNs). The accuracy achieved is equal to 98.2% when recognizing the six basic facial expressions.
    IFIP International Conference on Artificial Intelligence in Theory and Practice - Machine Vision
    Red de Universidades con Carreras en Informática (RedUNCI)

    Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning

    This paper proposes an effective modelling of sound event spectra with a hidden data-size imbalance, for improved Acoustic Event Detection (AED). The proposed method models each event as an aggregated representation of a few latent factors, whereas conventional approaches try to find acoustic elements directly from the event spectra. In the method, all the latent factors across all events are assigned comparable importance and complexity to overcome the hidden imbalance of data sizes in event spectra. To extract the latent factors in each event, the proposed method employs clustering, performs non-negative matrix factorization on each latent factor, and learns its acoustic elements as a sub-dictionary. Separate sub-dictionary learning effectively models the acoustic elements with limited data sizes and avoids over-fitting due to hidden imbalances in the training data. For the task of polyphonic sound event detection from the DCASE 2013 challenge, an AED system based on the proposed modelling achieves a detection F-measure of 46.5%, a significant improvement of more than 19% compared to the existing state-of-the-art methods.
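The cluster-then-factorize idea described above can be sketched as follows. This is a rough illustration only, assuming plain k-means for the clustering step and Lee-Seung multiplicative updates for the NMF step; the abstract does not say which clustering algorithm or NMF variant the authors use. The names `kmeans`, `nmf_dictionary` and `event_dictionary`, and all parameter defaults, are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centre if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def nmf_dictionary(V, n_atoms, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (n_freq x n_frames) as W @ H and return W."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_atoms)) + eps
    H = rng.random((n_atoms, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W


def event_dictionary(frames, n_factors, atoms_per_factor):
    """Build one event's dictionary from per-cluster sub-dictionaries.

    frames: (n_frames x n_freq) non-negative magnitude spectra of one event.
    Every cluster (latent factor) gets the same number of atoms regardless of
    how many frames fall into it, which is one simple way to give the factors
    comparable complexity.
    """
    labels = kmeans(frames, n_factors)
    sub_dictionaries = []
    for j in range(n_factors):
        members = frames[labels == j]
        if len(members) == 0:
            continue  # skip a factor that attracted no frames
        sub_dictionaries.append(nmf_dictionary(members.T, atoms_per_factor))
    return np.hstack(sub_dictionaries)
```

Learning a fixed-size sub-dictionary per cluster, rather than running one NMF over all frames, keeps a rarely occurring factor from being swamped by the frames of a frequent one. In a full system the per-event dictionaries would then be concatenated across event classes for detection, a step that is not shown here.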

    Relation among images: Modelling, optimization and applications

    In the last two decades, the increasing popularity of information technology has led to a dramatic increase in the amount of visual data. Many applications have been developed to process, analyze and understand this growing volume of data, and modelling the relation among images is fundamental to the success of many of them. Examples include image classification, content-based image retrieval and face recognition. Given signatures of images, there are many ways to depict the relation among them, such as pairwise distances, kernel functions and factor analysis. However, existing methods are still insufficient, as they suffer from many real-world factors such as misalignment of images and inefficiency arising from nonlinearity. This dissertation focuses on improving relation modelling, its applications and the related optimization. In particular, three aspects of relation modelling are addressed: 1. Integrating image alignment into relation modelling methods, including image classification and factor analysis, to achieve stability in real applications. 2. Modelling the relation when images lie on multiple manifolds. 3. Developing nonlinear relation modelling methods, including tapering kernels for the sparsification of kernel-based relation models and piecewise linear factor analysis that enjoys both the efficiency of linear models and the flexibility of nonlinear ones. We also discuss future directions of relation modelling in the last chapter, from both application and methodology aspects.