1,974 research outputs found
Recommended from our members
Non-Negative Tensor Factorization Applied to Music Genre Classification
Music genre classification techniques are typically applied to the data matrix whose columns are the feature vectors extracted from music recordings. In this paper, a feature vector is extracted using a texture window of one sec, which enables the representation of any 30 sec long music recording as a time sequence of feature vectors, thus yielding a feature matrix. Consequently, by stacking the feature matrices associated to any dataset recordings, a tensor is created, a fact which necessitates studying music genre classification using tensors. First, a novel algorithm for non-negative tensor factorization (NTF) is derived that extends the non-negative matrix factorization. Several variants of the NTF algorithm emerge by employing different cost functions from the class of Bregman divergences. Second, a novel supervised NTF classifier is proposed, which trains a basis for each class separately and employs basis orthogonalization. A variety of spectral, temporal, perceptual, energy, and pitch descriptors is extracted from 1000 recordings of the GTZAN dataset, which are distributed across 10 genre classes. The NTF classifier performance is compared against that of the multilayer perceptron and the support vector machines by applying a stratified 10-fold cross validation. A genre classification accuracy of 78.9% is reported for the NTF classifier demonstrating the superiority of the aforementioned multilinear classifier over several data matrix-based state-of-the-art classifiers
A Survey on Metric Learning for Feature Vectors and Structured Data
The need for appropriate ways to measure the distance or similarity between
data is ubiquitous in machine learning, pattern recognition and data mining,
but handcrafting such good metrics for specific problems is generally
difficult. This has led to the emergence of metric learning, which aims at
automatically learning a metric from data and has attracted a lot of interest
in machine learning and related fields for the past ten years. This survey
paper proposes a systematic review of the metric learning literature,
highlighting the pros and cons of each approach. We pay particular attention to
Mahalanobis distance metric learning, a well-studied and successful framework,
but additionally present a wide range of methods that have recently emerged as
powerful alternatives, including nonlinear metric learning, similarity learning
and local metric learning. Recent trends and extensions, such as
semi-supervised metric learning, metric learning for histogram data and the
derivation of generalization guarantees, are also covered. Finally, this survey
addresses metric learning for structured data, in particular edit distance
learning, and attempts to give an overview of the remaining challenges in
metric learning for the years to come.Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved
presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new
method
Sparse Transfer Learning for Interactive Video Search Reranking
Visual reranking is effective to improve the performance of the text-based
video search. However, existing reranking algorithms can only achieve limited
improvement because of the well-known semantic gap between low level visual
features and high level semantic concepts. In this paper, we adopt interactive
video search reranking to bridge the semantic gap by introducing user's
labeling effort. We propose a novel dimension reduction tool, termed sparse
transfer learning (STL), to effectively and efficiently encode user's labeling
information. STL is particularly designed for interactive video search
reranking. Technically, it a) considers the pair-wise discriminative
information to maximally separate labeled query relevant samples from labeled
query irrelevant ones, b) achieves a sparse representation for the subspace to
encodes user's intention by applying the elastic net penalty, and c) propagates
user's labeling information from labeled samples to unlabeled samples by using
the data distribution knowledge. We conducted extensive experiments on the
TRECVID 2005, 2006 and 2007 benchmark datasets and compared STL with popular
dimension reduction algorithms. We report superior performance by using the
proposed STL based interactive video search reranking.Comment: 17 page
Image patch analysis of sunspots and active regions. II. Clustering via matrix factorization
Separating active regions that are quiet from potentially eruptive ones is a
key issue in Space Weather applications. Traditional classification schemes
such as Mount Wilson and McIntosh have been effective in relating an active
region large scale magnetic configuration to its ability to produce eruptive
events. However, their qualitative nature prevents systematic studies of an
active region's evolution for example. We introduce a new clustering of active
regions that is based on the local geometry observed in Line of Sight
magnetogram and continuum images. We use a reduced-dimension representation of
an active region that is obtained by factoring the corresponding data matrix
comprised of local image patches. Two factorizations can be compared via the
definition of appropriate metrics on the resulting factors. The distances
obtained from these metrics are then used to cluster the active regions. We
find that these metrics result in natural clusterings of active regions. The
clusterings are related to large scale descriptors of an active region such as
its size, its local magnetic field distribution, and its complexity as measured
by the Mount Wilson classification scheme. We also find that including data
focused on the neutral line of an active region can result in an increased
correspondence between our clustering results and other active region
descriptors such as the Mount Wilson classifications and the value. We
provide some recommendations for which metrics, matrix factorization
techniques, and regions of interest to use to study active regions.Comment: Accepted for publication in the Journal of Space Weather and Space
Climate (SWSC). 33 pages, 12 figure
When autoencoders meet recommender systems : COFILS approach
Collaborative Filtering to Supervised Learning (COFILS) transforms a Collaborative Filtering (CF) problem into classical Supervised Learning (SL) problem. Applying COFILS reduce data sparsity and make it possible to test a variety of SL algorithms rather than matrix decomposition methods. It main steps are: extraction, mapping and prediction. Firstly, a Singular Value Decomposition (SVD) generates a set of latent variables from a ratings matrix. Next, on the mapping phase, a new data set is generated where each sample contains a set of latent variables from an user and it rated item; and a target that corresponds the user rating for that item. Finally, on the last phase, a SL algorithm is applied. One problem of COFILS is it’s dependency on SVD, that is not able to extract non-linear features from data and it is not robust to noisy data. To address this problem, we propose switching SVD to a Stacked Denoising Autoencoder (SDA) on the first phase of COFILS. With SDA, more useful and complex representations can be learned in a Deep Network with a local denoising criterion. We test our novel technique, namely Deep Learning COFILS (DL-COFILS), on MovieLens, R3 Yahoo! Music and Movie Tweetings data sets and compare to COFILS, as a baseline, and state of the art CF techniques. Our results indicate that DL-COFILS outperforms COFILS for all the data sets and with an improvement up to 5.9%. Also, DL-COFILS achieves the best result for the MovieLens 100k data set and ranks on the top three algorithms for these data sets. Thus, we show that DL-COFILS represents an advance on COFILS methodology, improving it’s results and that is a suitable method for CF problem.Collaborative Filtering to Supervised Learning (COFILS) transforma um problema de filtragem colaborativa (CF) em um problema clássico de aprendizado supervisionado (SL). Sua aplicação reduz a esparsidade e torna possÃvel a utilização de variados algoritmos de SL em oposição aos métodos de decomposição de matrizes. Primeiramente, a Decomposição em Valores Singulares (SVD) gera um conjunto de variáveis latentes a partir da matriz de avaliações. Na fase de mapeamento, um novo conjunto de dados é gerado, do qual cada amostra contém um conjunto de variáveis latentes de um usuário e do item avaliado; e um valor que corresponde a avaliação que o usuário atribuiu a esse item. Por fim, o algoritmo de SL é aplicado. Um ponto negativo do COFILS é sua dependência ao SVD, incapaz de extrair caracterÃsticas não-lineares e sem robustez `a dados ruidosos. Nesse caso, propomos a troca do SVD por um Stacked Denoising Autoencoder (SDA). Com o uso de um SDA, representações mais úteis e complexas podem ser aprendidas em uma rede neural profunda com um critério local de remoção de ruÃdo. Executamos nossa técnica, chamada Deep Learning COFILS (DL-COFILS), nos conjuntos de dados MovieLens, R3 Yahoo! Music e Movie Tweetings comparando os resultados com o COFILS padrão, como baseline, e demais técnicas de estado da arte de CF. Com os resultados obtidos, é possÃvel mencionar que DL-COFILS supera COFILS para todos os conjuntos de dados, com uma melhora de até 5.9%. Além disso, o DLCOFILS alcança o melhor resultado para o MovieLens 100k e se encontra entre os três melhores algoritmos nos demais conjuntos de dados. Dessa forma, mostraremos que DL-COFILS representa um avanço na metodologia COFILS, melhorando seus resultados e se mostrando um método adequado para CF
Generative-Discriminative Low Rank Decomposition for Medical Imaging Applications
In this thesis, we propose a method that can be used to extract biomarkers from medical images toward early diagnosis of abnormalities. Surge of demand for biomarkers and availability of medical images in the recent years call for accurate, repeatable, and interpretable approaches for extracting meaningful imaging features. However, extracting such information from medical images is a challenging task because the number of pixels (voxels) in a typical image is in order of millions while even a large sample-size in medical image dataset does not usually exceed a few hundred. Nevertheless, depending on the nature of an abnormality, only a parsimonious subset of voxels is typically relevant to the disease; therefore various notions of sparsity are exploited in this thesis to improve the generalization performance of the prediction task.
We propose a novel discriminative dimensionality reduction method that yields good classification performance on various datasets without compromising the clinical interpretability of the results. This is achieved by combining the modelling strength of generative learning framework and the classification performance of discriminative learning paradigm. Clinical interpretability can be viewed as an additional measure of evaluation and is also helpful in designing methods that account for the clinical prior such as association of certain areas in a brain to a particular cognitive task or connectivity of some brain regions via neural fibres.
We formulate our method as a large-scale optimization problem to solve a constrained matrix factorization. Finding an optimal solution of the large-scale matrix factorization renders off-the-shelf solver computationally prohibitive; therefore, we designed an efficient algorithm based on the proximal method to address the computational bottle-neck of the optimization problem. Our formulation is readily extended for different scenarios such as cases where a large cohort of subjects has uncertain or no class labels (semi-supervised learning) or a case where each subject has a battery of imaging channels (multi-channel), \etc. We show that by using various notions of sparsity as feasible sets of the optimization problem, we can encode different forms of prior knowledge ranging from brain parcellation to brain connectivity
- …