Music classification by low-rank semantic mappings
A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm is appropriate for a specific music classification task. To address this challenge, given a number of audio feature vectors for each training music recording that capture different aspects of music (e.g., timbre and harmony), the goal is to find a set of linear mappings from several feature spaces to the semantic space spanned by the class indicator vectors. These mappings should reveal the common latent variables that characterize a given set of classes, and simultaneously define a multi-class linear classifier that classifies the extracted latent common features. Such a set of mappings is obtained, building on the notion of maximum margin matrix factorization, by minimizing a weighted sum of nuclear norms. Since the nuclear norm imposes rank constraints on the learnt mappings, the proposed method is referred to as low-rank semantic mappings (LRSMs). The performance of the LRSMs in music genre, mood, and multi-label classification is assessed by conducting extensive experiments on seven manually annotated benchmark datasets. The reported experimental results demonstrate the superiority of the LRSMs over the classifiers they are compared against. Furthermore, the best reported classification results are comparable with or slightly superior to those obtained by the state-of-the-art task-specific music classification methods.
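To illustrate why a nuclear-norm penalty yields low-rank mappings, here is a minimal NumPy sketch of singular value thresholding, the proximal operator of the nuclear norm. It is not the LRSM solver from the paper; the matrix sizes, the noise level, and the threshold `tau` are assumptions chosen purely for illustration.

```python
import numpy as np

def singular_value_threshold(W, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # small singular values become exactly zero
    return (U * s_shrunk) @ Vt

rng = np.random.default_rng(0)
# Hypothetical mapping from 20 features to 5 classes: rank-2 signal plus small dense noise.
W_noisy = (rng.standard_normal((20, 2)) @ rng.standard_normal((2, 5))
           + 0.01 * rng.standard_normal((20, 5)))
W_low_rank = singular_value_threshold(W_noisy, tau=0.5)

print("rank before:", np.linalg.matrix_rank(W_noisy))
print("rank after: ", np.linalg.matrix_rank(W_low_rank))
```

Shrinking the singular values removes the directions that carry only noise, which is the mechanism by which nuclear-norm minimization favours low-rank solutions.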
Generalization analysis of an unfolding network for analysis-based Compressed Sensing
Unfolding networks have shown promising results in the Compressed Sensing
(CS) field. Yet, the investigation of their generalization ability is still in
its infancy. In this paper, we perform generalization analysis of a
state-of-the-art ADMM-based unfolding network, which jointly learns a decoder
for CS and a sparsifying redundant analysis operator. To this end, we first
impose a structural constraint on the learnable sparsifier, which parametrizes
the network's hypothesis class. For the latter, we estimate its Rademacher
complexity. With this estimate in hand, we deliver generalization error bounds
for the examined network. Finally, the validity of our theory is assessed and
numerical comparisons to a state-of-the-art unfolding network are made, on
synthetic and real-world datasets. Our experimental results demonstrate that
our proposed framework complies with our theoretical findings and outperforms
the baseline consistently across all datasets.
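For readers unfamiliar with the iteration that such networks unfold, the following is a minimal sketch of plain ADMM for the analysis-sparsity problem min_x 0.5*||Ax - y||^2 + lam*||Phi x||_1. It is not the network from the paper: here the analysis operator `Phi` is a fixed random matrix rather than a learned one, and `lam`, `rho`, and all dimensions are assumptions. In an unfolding network, each loop iteration would roughly correspond to one layer.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_analysis_l1(A, y, Phi, lam, rho=1.0, n_iter=200):
    """Plain ADMM for min_x 0.5*||A x - y||^2 + lam*||Phi x||_1 (split z = Phi x)."""
    z = np.zeros(Phi.shape[0])
    u = np.zeros(Phi.shape[0])  # scaled dual variable
    x = np.zeros(A.shape[1])
    lhs = A.T @ A + rho * Phi.T @ Phi  # same linear system at every iteration
    Aty = A.T @ y
    for _ in range(n_iter):
        x = np.linalg.solve(lhs, Aty + rho * Phi.T @ (z - u))
        z = soft_threshold(Phi @ x + u, lam / rho)
        u = u + Phi @ x - z
    return x

def objective(A, y, Phi, x, lam):
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(Phi @ x))

rng = np.random.default_rng(0)
n, m, p = 30, 15, 60                            # signal size, measurements, analysis rows (assumed)
A = rng.standard_normal((m, n)) / np.sqrt(m)    # hypothetical measurement matrix
Phi = rng.standard_normal((p, n)) / np.sqrt(p)  # hypothetical redundant analysis operator
x_true = rng.standard_normal(n)
y = A @ x_true + 0.01 * rng.standard_normal(m)

lam = 0.05
x_hat = admm_analysis_l1(A, y, Phi, lam)
print("objective at zero: ", objective(A, y, Phi, np.zeros(n), lam))
print("objective at x_hat:", objective(A, y, Phi, x_hat, lam))
```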
Informed Non-convex Robust Principal Component Analysis with Features
We revisit the problem of robust principal component analysis with features
acting as prior side information. To this end, a novel, elegant, non-convex
optimization approach is proposed to decompose a given observation matrix into
a low-rank core and the corresponding sparse residual. Rigorous theoretical
analysis of the proposed algorithm results in exact recovery guarantees with
low computational complexity. Aptly designed synthetic experiments demonstrate
that our method is the first to wholly harness the power of non-convexity over
convexity in terms of both recoverability and speed. That is, the proposed
non-convex approach is more accurate and faster compared to the best available
algorithms for the problem under study. Two real-world applications, namely
image classification and face denoising, further exemplify the practical
superiority of the proposed method.
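As a rough illustration of the low-rank plus sparse decomposition, the sketch below alternates a rank-r projection with hard thresholding of the residual. This is a generic non-convex heuristic, not the algorithm proposed in the paper, and it does not use side-information features; the rank `r`, the threshold `tau`, and the synthetic data are all assumptions, and no recovery guarantee is claimed for it.

```python
import numpy as np

def rank_r_projection(X, r):
    """Best rank-r approximation of X via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def hard_threshold(X, tau):
    """Keep entries with magnitude above tau, zero out the rest."""
    return np.where(np.abs(X) > tau, X, 0.0)

def alternating_rpca(M, r, tau, n_iter=50):
    """Alternate between a low-rank fit of M - S and a sparse fit of M - L."""
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = rank_r_projection(M - S, r)
        S = hard_threshold(M - L, tau)
    return L, S

rng = np.random.default_rng(0)
L0 = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 40))  # synthetic rank-2 core
mask = rng.random((40, 40)) < 0.05                                 # about 5% corrupted entries
S0 = np.where(mask, 10.0 * rng.choice([-1.0, 1.0], size=(40, 40)), 0.0)
M = L0 + S0

tau = 3.0
L_hat, S_hat = alternating_rpca(M, r=2, tau=tau)
print("rank of L_hat:", np.linalg.matrix_rank(L_hat))
print("nonzeros in S_hat:", int(np.count_nonzero(S_hat)))
```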
GAGAN: Geometry-Aware Generative Adversarial Networks
Deep generative models learned through adversarial training have become
increasingly popular for their ability to generate naturalistic image textures.
However, aside from their texture, the visual appearance of objects is
significantly influenced by their shape geometry, information that is not
taken into account by existing generative models. This paper introduces the
Geometry-Aware Generative Adversarial Networks (GAGAN) for incorporating
geometric information into the image generation process. Specifically, in GAGAN
the generator samples latent variables from the probability space of a
statistical shape model. By mapping the output of the generator to a canonical
coordinate frame through a differentiable geometric transformation, we enforce
the geometry of the objects and add an implicit connection from the prior to
the generated object. Experimental results on face generation indicate that
GAGAN can generate realistic images of faces with arbitrary facial attributes,
such as facial expression, pose, and morphology, that are of better quality
than those produced by current GAN-based methods. Our method can be used to
augment any existing GAN architecture and improve the quality of the images generated.
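To make the idea of sampling latent variables from a statistical shape model concrete, here is a minimal sketch that assumes a PCA-based point distribution model over 2D landmarks. The abstract does not specify the model at this level of detail, so the function names, the 68-landmark layout, and the toy training data are hypothetical.

```python
import numpy as np

def fit_shape_model(shapes, n_components):
    """Fit a PCA shape model to flattened landmark vectors (one shape per row)."""
    mean = shapes.mean(axis=0)
    _, s, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
    eigvals = (s ** 2) / (shapes.shape[0] - 1)  # variance along each principal mode
    return mean, Vt[:n_components], eigvals[:n_components]

def sample_shape(mean, components, eigvals, rng):
    """Draw shape parameters b ~ N(0, diag(eigvals)) and synthesize a shape."""
    b = rng.standard_normal(len(eigvals)) * np.sqrt(eigvals)
    return mean + b @ components, b

rng = np.random.default_rng(0)
n_landmarks = 68                                  # assumed facial landmark count
base = rng.standard_normal(2 * n_landmarks)       # toy mean shape
modes = rng.standard_normal((3, 2 * n_landmarks))
train_shapes = base + 0.05 * rng.standard_normal((100, 3)) @ modes  # toy training set

mean, components, eigvals = fit_shape_model(train_shapes, n_components=3)
flat_shape, b = sample_shape(mean, components, eigvals, rng)
landmarks = flat_shape.reshape(n_landmarks, 2)    # (x, y) per landmark
print(landmarks.shape, b.shape)
```

In a geometry-aware generator, a vector such as `b` would play the role of the geometric part of the latent input, with the synthesized landmarks driving the warp to the canonical frame.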
DECONET: an Unfolding Network for Analysis-based Compressed Sensing with Generalization Error Bounds
We present a new deep unfolding network for analysis-sparsity-based
Compressed Sensing. The proposed network, coined Decoding Network (DECONET),
jointly learns a decoder that reconstructs vectors from their incomplete, noisy
measurements and a redundant sparsifying analysis operator, which is shared
across the layers of DECONET. Moreover, we formulate the hypothesis class of
DECONET and estimate its associated Rademacher complexity. Then, we use this
estimate to deliver meaningful upper bounds for the generalization error of
DECONET. Finally, the validity of our theoretical results is assessed and
comparisons to state-of-the-art unfolding networks are made, on both synthetic
and real-world datasets. Experimental results indicate that our proposed
network outperforms the baselines, consistently for all datasets, and its
behaviour complies with our theoretical findings.
Comment: Accepted in IEEE Transactions on Signal Processing
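As a toy illustration of what estimating a Rademacher complexity involves, the sketch below computes a Monte Carlo estimate for a much simpler hypothesis class than DECONET's: linear predictors with Euclidean norm at most `B`. For that class the supremum has a closed form, and Jensen's inequality gives the standard upper bound B * sqrt(sum_i ||x_i||^2) / n. The data, `B`, and the number of draws are assumptions.

```python
import numpy as np

def empirical_rademacher_linear(X, B, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    {x -> <w, x> : ||w||_2 <= B} on the sample X (one example per row)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))  # Rademacher signs
    # For a norm ball, sup_w <w, v> = B * ||v||_2 with v = sum_i sigma_i x_i.
    sups = B * np.linalg.norm(sigma @ X, axis=1) / n
    return sups.mean()

def rademacher_upper_bound(X, B):
    """Closed-form bound obtained from Jensen's inequality."""
    return B * np.sqrt(np.sum(X ** 2)) / X.shape[0]

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 5))  # hypothetical sample: 50 examples, 5 features
estimate = empirical_rademacher_linear(X, B=1.0)
bound = rademacher_upper_bound(X, B=1.0)
print("Monte Carlo estimate:", estimate)
print("closed-form bound:   ", bound)
```

Generalization bounds of the kind described above typically add such a complexity term, plus a confidence term that shrinks with the sample size, to the empirical error.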