327 research outputs found
A deep matrix factorization method for learning attribute representations
Semi-Non-negative Matrix Factorization is a technique that learns a
low-dimensional representation of a dataset that lends itself to a clustering
interpretation. It is possible that the mapping between this new representation
and our original data matrix contains rather complex hierarchical information
with implicit lower-level hidden attributes that classical one-level
clustering methodologies cannot interpret. In this work we propose a novel
model, Deep Semi-NMF, that is able to learn such hidden representations, which
lend themselves to a clustering interpretation according to different,
unknown attributes of a given dataset. We also present a semi-supervised
version of the algorithm, named Deep WSF, that allows the use of (partial)
prior information for each of the known attributes of a dataset, so that
the model can be used on datasets with mixed attribute knowledge. Finally, we
show that our models are able to learn low-dimensional representations that are
better suited for both clustering and classification, outperforming
Semi-Non-negative Matrix Factorization as well as other state-of-the-art
methodologies and variants.
Comment: Submitted to TPAMI (16-Mar-2015)
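As a rough illustration of the single-layer building block named in the abstract above, the following sketch factorizes a mixed-sign data matrix X into an unconstrained factor Z and a non-negative factor H using the multiplicative update rule commonly given for Semi-NMF. It is not the Deep Semi-NMF or Deep WSF model itself; the function name `semi_nmf`, the random initialization, the iteration count and the `eps` smoothing term are all assumptions made for this example.

```python
import numpy as np


def pos(a):
    """Element-wise positive part: (|a| + a) / 2."""
    return (np.abs(a) + a) / 2.0


def neg(a):
    """Element-wise negative part: (|a| - a) / 2 (returned as non-negative)."""
    return (np.abs(a) - a) / 2.0


def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (features x samples) as Z @ H with H >= 0.

    Z may contain negative entries; H is kept non-negative so that each
    column of H can be read as soft cluster memberships.
    Hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((k, X.shape[1])) + 0.1  # strictly positive start
    for _ in range(n_iter):
        # Least-squares update of the unconstrained factor.
        Z = X @ np.linalg.pinv(H)
        ZtX = Z.T @ X
        ZtZ = Z.T @ Z
        # Multiplicative update keeps H non-negative.
        num = pos(ZtX) + neg(ZtZ) @ H
        den = neg(ZtX) + pos(ZtZ) @ H
        H *= np.sqrt((num + eps) / (den + eps))
    # Refit Z so it is the least-squares solution for the final H.
    Z = X @ np.linalg.pinv(H)
    return Z, H


if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((8, 30))
    Z, H = semi_nmf(X, k=3)
    print("reconstruction error:", np.linalg.norm(X - Z @ H))
    print("hard cluster of each sample:", H.argmax(axis=0))
```

A deep variant would, under the same assumptions, apply this step repeatedly to the non-negative factor of the previous layer; the sketch stops at one layer.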
Facial expression recognition using shape and texture information
A novel method based on shape and texture information is proposed in this paper for facial expression recognition from video sequences. The Discriminant Non-negative Matrix Factorization (DNMF) algorithm is applied to the image corresponding to the greatest intensity of the facial expression (the last frame of the video sequence), thereby extracting the texture information. A Support Vector Machines (SVMs) system is used for the classification of the shape information derived from tracking the Candide grid over the video sequence. The shape information consists of the differences of the node coordinates between the first (neutral) and last (fully expressed facial expression) video frame. Subsequently, fusion of the texture and shape information obtained is performed using Radial Basis Function (RBF) Neural Networks (NNs). The accuracy achieved is equal to 98.2% when recognizing the six basic facial expressions.
IFIP International Conference on Artificial Intelligence in Theory and Practice - Machine Vision
Red de Universidades con Carreras en Informática (RedUNCI)
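The shape feature described in the abstract above (node-coordinate differences between the neutral and the fully expressed frame) can be sketched as follows. The function name `shape_feature` and the two-node toy grid are hypothetical; the real Candide grid has many more nodes, and the SVM and RBF fusion stages are not shown.

```python
import numpy as np


def shape_feature(first_frame_nodes, last_frame_nodes):
    """Flattened displacement of each grid node between two frames.

    Both inputs are (n_nodes, 2) arrays of tracked (x, y) coordinates.
    Hypothetical helper written for illustration only.
    """
    first = np.asarray(first_frame_nodes, dtype=float)
    last = np.asarray(last_frame_nodes, dtype=float)
    if first.shape != last.shape:
        raise ValueError("both frames must track the same set of nodes")
    # Displacement vector per node, concatenated into one feature vector.
    return (last - first).ravel()


if __name__ == "__main__":
    neutral = [[0.0, 0.0], [1.0, 1.0]]  # toy grid, first frame
    apex = [[0.0, 1.0], [2.0, 1.0]]     # toy grid, last frame
    print(shape_feature(neutral, apex))
```

The resulting vector is what a classifier such as an SVM would receive as input in a pipeline of this kind.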
Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning
This paper proposes an effective modelling of sound event spectra with a
hidden data-size-imbalance, for improved Acoustic Event Detection (AED). The
proposed method models each event as an aggregated representation of a few
latent factors, while conventional approaches try to find acoustic elements
directly from the event spectra. In the method, all the latent factors across
all events are assigned comparable importance and complexity to overcome the
hidden imbalance of data-sizes in event spectra. To extract latent factors in
each event, the proposed method employs clustering and performs non-negative
matrix factorization on each latent factor, learning its acoustic elements as
a sub-dictionary. Separate sub-dictionary learning effectively models the
acoustic elements with limited data-sizes and avoids over-fitting due to hidden
imbalances in training data. For the task of polyphonic sound event detection
from the DCASE 2013 challenge, an AED based on the proposed modelling achieves a
detection F-measure of 46.5%, a significant improvement of more than 19% as
compared to the existing state-of-the-art methods.
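The cluster-then-factorize idea in the abstract above can be sketched in a few lines. This is only an interpretation: plain k-means stands in for the unspecified clustering step, the NMF uses the standard Euclidean multiplicative updates, and giving every cluster the same number of atoms is one assumed way to assign "comparable importance and complexity" to the latent factors. All function names and parameters are hypothetical.

```python
import numpy as np


def kmeans(frames, n_clusters, n_iter=20, seed=0):
    """Very small k-means; returns a cluster label per frame."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(frames), n_clusters, replace=False)
    centers = frames[idx].copy()
    for _ in range(n_iter):
        dist = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = frames[labels == c].mean(axis=0)
    return labels


def nmf(V, n_atoms, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (bins x frames) as W @ H, W and H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_atoms)) + eps
    H = rng.random((n_atoms, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def learn_event_dictionary(spectra, n_factors, atoms_per_factor):
    """Cluster the frames of one event, then learn one sub-dictionary
    per cluster and concatenate them column-wise.

    spectra: (n_frames, n_bins) non-negative magnitude spectra.
    """
    labels = kmeans(spectra, n_factors)
    sub_dicts = []
    for c in range(n_factors):
        members = spectra[labels == c]
        if len(members) == 0:
            continue  # skip a cluster that ended up empty
        W, _ = nmf(members.T, atoms_per_factor)
        sub_dicts.append(W)
    return np.hstack(sub_dicts)


if __name__ == "__main__":
    spectra = np.abs(np.random.default_rng(0).standard_normal((60, 16)))
    D = learn_event_dictionary(spectra, n_factors=3, atoms_per_factor=4)
    print("dictionary shape (bins x atoms):", D.shape)
```

Because each cluster gets its own fixed-size sub-dictionary, a small cluster is not crowded out by a large one, which is the imbalance the abstract is concerned with.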
Relation among images: Modelling, optimization and applications
In the last two decades, the increasing popularity of information technology has led to a dramatic increase in the amount of visual data. Many applications are developed by processing, analyzing and understanding such data, and modelling the relation among images is fundamental to the success of many of them. Examples include image classification, content-based image retrieval and face recognition. Given signatures of images, there are many ways to depict the relation among them, such as pairwise distance, kernel function and factor analysis. However, existing methods are still insufficient, as they suffer from many real-world factors such as misalignment of images and inefficiency from nonlinearity. This dissertation focuses on improving relation modelling, its applications and related optimization. In particular, three aspects of relation modelling are addressed: 1. Integrate image alignment into the relation modelling methods, including image classification and factor analysis, to achieve stability in real applications. 2. Model relation when images are on multiple manifolds. 3. Develop nonlinear relation modelling methods, including tapering kernels for sparsification of kernel-based relation models and piecewise linear factor analysis that enjoys both the efficiency of linear models and the flexibility of nonlinear ones. We also discuss future directions of relation modelling in the last chapter from both application and methodology aspects.