5,858 research outputs found
A deep matrix factorization method for learning attribute representations
Semi-Non-negative Matrix Factorization is a technique that learns a
low-dimensional representation of a dataset that lends itself to a clustering
interpretation. It is possible that the mapping between this new representation
and our original data matrix contains rather complex hierarchical information
with implicit lower-level hidden attributes, that classical one level
clustering methodologies can not interpret. In this work we propose a novel
model, Deep Semi-NMF, that is able to learn such hidden representations that
allow themselves to an interpretation of clustering according to different,
unknown attributes of a given dataset. We also present a semi-supervised
version of the algorithm, named Deep WSF, that allows the use of (partial)
prior information for each of the known attributes of a dataset, that allows
the model to be used on datasets with mixed attribute knowledge. Finally, we
show that our models are able to learn low-dimensional representations that are
better suited for clustering, but also classification, outperforming
Semi-Non-negative Matrix Factorization, but also other state-of-the-art
methodologies variants.Comment: Submitted to TPAMI (16-Mar-2015
Audio Event Detection using Weakly Labeled Data
Acoustic event detection is essential for content analysis and description of
multimedia recordings. The majority of current literature on the topic learns
the detectors through fully-supervised techniques employing strongly labeled
data. However, the labels available for majority of multimedia data are
generally weak and do not provide sufficient detail for such methods to be
employed. In this paper we propose a framework for learning acoustic event
detectors using only weakly labeled data. We first show that audio event
detection using weak labels can be formulated as an Multiple Instance Learning
problem. We then suggest two frameworks for solving multiple-instance learning,
one based on support vector machines, and the other on neural networks. The
proposed methods can help in removing the time consuming and expensive process
of manually annotating data to facilitate fully supervised learning. Moreover,
it can not only detect events in a recording but can also provide temporal
locations of events in the recording. This helps in obtaining a complete
description of the recording and is notable since temporal information was
never known in the first place in weakly labeled data.Comment: ACM Multimedia 201
Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods
Feature extraction and dimensionality reduction are important tasks in many
fields of science dealing with signal processing and analysis. The relevance of
these techniques is increasing as current sensory devices are developed with
ever higher resolution, and problems involving multimodal data sources become
more common. A plethora of feature extraction methods are available in the
literature collectively grouped under the field of Multivariate Analysis (MVA).
This paper provides a uniform treatment of several methods: Principal Component
Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis
(CCA) and Orthonormalized PLS (OPLS), as well as their non-linear extensions
derived by means of the theory of reproducing kernel Hilbert spaces. We also
review their connections to other methods for classification and statistical
dependence estimation, and introduce some recent developments to deal with the
extreme cases of large-scale and low-sized problems. To illustrate the wide
applicability of these methods in both classification and regression problems,
we analyze their performance in a benchmark of publicly available data sets,
and pay special attention to specific real applications involving audio
processing for music genre prediction and hyperspectral satellite images for
Earth and climate monitoring
Semi-supervised model-based clustering with controlled clusters leakage
In this paper, we focus on finding clusters in partially categorized data
sets. We propose a semi-supervised version of Gaussian mixture model, called
C3L, which retrieves natural subgroups of given categories. In contrast to
other semi-supervised models, C3L is parametrized by user-defined leakage
level, which controls maximal inconsistency between initial categorization and
resulting clustering. Our method can be implemented as a module in practical
expert systems to detect clusters, which combine expert knowledge with true
distribution of data. Moreover, it can be used for improving the results of
less flexible clustering techniques, such as projection pursuit clustering. The
paper presents extensive theoretical analysis of the model and fast algorithm
for its efficient optimization. Experimental results show that C3L finds high
quality clustering model, which can be applied in discovering meaningful groups
in partially classified data
Recommended from our members
Context-awareness for mobile sensing: a survey and future directions
The evolution of smartphones together with increasing computational power have empowered developers to create innovative context-aware applications for recognizing user related social and cognitive activities in any situation and at any location. The existence and awareness of the context provides the capability of being conscious of physical environments or situations around mobile device users. This allows network services to respond proactively and intelligently based on such awareness. The key idea behind context-aware applications is to encourage users to collect, analyze and share local sensory knowledge in the purpose for a large scale community use by creating a smart network. The desired network is capable of making autonomous logical decisions to actuate environmental objects, and also assist individuals. However, many open challenges remain, which are mostly arisen due to the middleware services provided in mobile devices have limited resources in terms of power, memory and bandwidth. Thus, it becomes critically important to study how the drawbacks can be elaborated and resolved, and at the same time better understand the opportunities for the research community to contribute to the context-awareness. To this end, this paper surveys the literature over the period of 1991-2014 from the emerging concepts to applications of context-awareness in mobile platforms by providing up-to-date research and future research directions. Moreover, it points out the challenges faced in this regard and enlighten them by proposing possible solutions
- …