Supervised Feature Space Reduction for Multi-Label Nearest Neighbors
With the ability to handle many real-world problems, multi-label classification has received considerable attention in recent years, and the instance-based ML-kNN classifier is today considered one of the most efficient. However, it is sensitive to noisy and redundant features, and its performance degrades as data dimensionality increases. Dimensionality reduction is an alternative, but current methods optimize reduction objectives that ignore the impact on ML-kNN classification. We propose ML-ARP, a novel dimensionality reduction algorithm that uses a variable neighborhood search meta-heuristic to learn a linear projection of the feature space that specifically minimizes the ML-kNN classification loss. Numerical comparisons confirm that ML-ARP outperforms ML-kNN without preprocessing as well as four standard multi-label dimensionality reduction algorithms.
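The generic idea of applying a linear projection before a neighbor-based multi-label classifier can be sketched as follows. This is an illustrative toy, not the paper's ML-ARP: `ml_knn_predict`, the majority-vote rule, and the synthetic data are assumptions for demonstration, and the projection `P` is simply supplied rather than learned by variable neighborhood search.

```python
import numpy as np

def ml_knn_predict(X_train, Y_train, X_test, k=3, P=None):
    """Multi-label k-NN by neighbor label voting; P is an optional
    linear projection (d x r) applied before distances are computed.
    Hypothetical sketch of the general technique, not ML-ARP itself."""
    if P is not None:
        X_train, X_test = X_train @ P, X_test @ P
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nn = np.argsort(dists)[:k]
        # assign every label carried by at least half of the k neighbors
        preds.append((Y_train[nn].mean(axis=0) >= 0.5).astype(int))
    return np.array(preds)

# two toy clusters, each with its own label set
X = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
out = ml_knn_predict(X, Y, np.array([[0., 0.5], [5., 5.5]]), k=2, P=np.eye(2))
print(out)  # [[1 0], [0 1]]
```

A learned projection would replace `np.eye(2)` with a matrix chosen to minimize the multi-label classification loss on training data.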
Latent Fisher Discriminant Analysis
Linear Discriminant Analysis (LDA) is a well-known method for dimensionality reduction and classification, and previous studies have extended the binary-class case to multiple classes. However, many applications, such as object detection and keyframe extraction, cannot provide consistent instance-label pairs, while LDA requires instance-level labels for training; it therefore cannot be applied directly to semi-supervised classification problems. In this paper, we overcome this limitation and propose a latent variable Fisher discriminant analysis model. We relax instance-level labeling to bag-level labeling, a form of semi-supervision (video-level labels of event type are required for semantic frame extraction), and incorporate a data-driven prior over the latent variables. Our method thus combines latent variable inference and dimensionality reduction in a unified Bayesian framework. We test our method on the MUSK and Corel data sets and obtain competitive results compared to the baseline approach. We also demonstrate its capacity on the challenging TRECVID MED11 dataset for semantic keyframe extraction and conduct a human-factors, ranking-based experimental evaluation, which clearly shows that our proposed method consistently extracts more semantically meaningful keyframes than strong baselines.
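The fully supervised LDA that this paper extends can be illustrated in a few lines with scikit-learn; this sketch shows only the standard instance-labeled starting point, not the latent variable or bag-level extension the abstract proposes.

```python
# Minimal illustration of standard, fully supervised LDA used for
# dimensionality reduction (the baseline setting the paper relaxes).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
# With C classes, LDA can project to at most C - 1 dimensions (here 2).
lda = LinearDiscriminantAnalysis(n_components=2)
Z = lda.fit_transform(X, y)
print(Z.shape)  # (150, 2)
```

Every sample needs an instance-level label `y` here, which is exactly the requirement that bag-level (e.g. video-level) supervision cannot satisfy.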
Visual assessment of multi-photon interference
Classical machine learning algorithms can provide insights into high-dimensional processes that are hardly accessible with conventional approaches. As a notable example, t-distributed Stochastic Neighbor Embedding (t-SNE) represents the state of the art for visualizing data sets of large dimensionality. An interesting question is whether this algorithm can also provide useful information in quantum experiments with very large Hilbert spaces. Motivated by these considerations, in this work we apply t-SNE to probe the spatial distribution of n-photon events in m-dimensional Hilbert spaces, showing that its findings can be beneficial for validating genuine quantum interference in boson sampling experiments. In particular, we find that nonlinear dimensionality reduction is capable of capturing distinctive features in the spatial distribution of data related to multi-photon states with different evolutions. We envisage that this approach will inspire further theoretical investigations, for instance toward a reliable assessment of quantum computational advantage.
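A basic t-SNE embedding of high-dimensional data looks like the following; the digits dataset and all parameter values are stand-ins for illustration, not the photonic event data used in the paper.

```python
# Sketch of t-SNE projecting 64-dimensional samples to 2-D for
# visualization; dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)
X = X[:500]  # subsample for speed
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)
print(emb.shape)  # (500, 2)
```

In the paper's setting, each sample would instead encode the spatial pattern of an n-photon detection event, and structure visible in the 2-D embedding serves as evidence of genuine multi-photon interference.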
ECG biometric authentication based on non-fiducial approach using kernel methods
Identity recognition faces several challenges, especially in extracting an individual's unique features from biometric modalities and in pattern classification. Electrocardiogram (ECG) waveforms, for instance, have identity properties unique to each person, and their signals are not periodic. At present, to generate a significant ECG feature set, non-fiducial methodologies based on autocorrelation (AC) in conjunction with linear dimensionality reduction methods are used. This paper proposes a new non-fiducial framework for ECG biometric verification that uses kernel methods to reduce the dimensionality of high-dimensional autocorrelation vectors, building a recognition system after denoising the signals of 52 subjects with the Discrete Wavelet Transform (DWT). The effects of different dimensionality reduction techniques for feature extraction were investigated to evaluate the verification performance of a multi-class Support Vector Machine (SVM) with the One-Against-All (OAA) approach. The experimental results demonstrated higher test recognition rates for Gaussian OAA SVMs on random unknown ECG data sets when using Kernel Principal Component Analysis (KPCA) than when using Linear Discriminant Analysis (LDA) or Principal Component Analysis (PCA).
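The KPCA-then-one-against-all-SVM pipeline described above can be sketched generically with scikit-learn. The digits dataset stands in for the autocorrelation feature vectors, and the `n_components` and `gamma` values are illustrative assumptions, not the paper's tuned settings.

```python
# Sketch: kernel PCA for nonlinear dimensionality reduction, followed
# by a one-against-all (one-vs-rest) Gaussian SVM classifier.
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = make_pipeline(
    KernelPCA(n_components=30, kernel="rbf", gamma=1e-3),  # illustrative values
    OneVsRestClassifier(SVC(kernel="rbf")),
)
clf.fit(Xtr, ytr)
print(clf.score(Xte, yte))
```

Swapping `KernelPCA` for `PCA` or `LinearDiscriminantAnalysis` in the pipeline reproduces the kind of comparison the paper reports.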
Clustered multidimensional scaling with Rulkov neurons
Copyright ©2016 IEICE. When dealing with high-dimensional measurements that often show non-linear characteristics at multiple scales, a need for unbiased and robust classification and interpretation techniques has emerged. Here, we present a method for mapping high-dimensional data onto low-dimensional spaces, allowing for a fast visual interpretation of the data. Classical approaches to dimensionality reduction attempt to preserve the geometry of the data. They often fail to correctly capture cluster structures, for instance in high-dimensional situations where distances between data points tend to become more similar. To cope with this clustering problem, we propose to combine classical multi-dimensional scaling with data clustering based on self-organization processes in neural networks, where the goal is to amplify rather than preserve local cluster structures. We find that applying dimensionality reduction techniques to the output of neural-network-based clustering not only allows for convenient visual inspection but also leads to further insights into the intra- and inter-cluster connectivity. We report on an implementation of the method with Rulkov-Hebbian-learning clustering and illustrate its suitability in comparison to traditional methods on an artificial dataset and a real-world example.
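The classical multi-dimensional scaling component of the method can be sketched as follows. The two-cluster synthetic data is an illustrative assumption; the Rulkov-neuron clustering stage that the paper adds before the embedding is not reproduced here.

```python
# Sketch: classical MDS maps 10-dimensional points to 2-D while
# approximately preserving pairwise distances, so well-separated
# clusters remain separated in the embedding.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
A = rng.normal(0.0, 0.5, (30, 10))   # cluster around the origin
B = rng.normal(5.0, 0.5, (30, 10))   # cluster around (5, ..., 5)
X = np.vstack([A, B])
emb = MDS(n_components=2, random_state=0).fit_transform(X)
# inter-cluster distance should dominate intra-cluster spread
sep = np.linalg.norm(emb[:30].mean(0) - emb[30:].mean(0))
print(sep > emb[:30].std())
```

The paper's point is that when clusters are *not* this well separated in the original space, distance-preserving MDS alone blurs them, which motivates amplifying cluster structure with a neural-network clustering stage first.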
GA-based feature subset selection in a spam/non-spam detection system
Spam has created a significant security problem for computer users everywhere. Spammers exploit deceptive tricks to conceal the parts of messages that could be used to identify spam. For instance, a spammer incurs little cost or bandwidth when sending junk mail, even more than one hundred emails at a time. From the feature selection perspective, one of the specific problems that decreases the accuracy of spam/non-spam classification is high data dimensionality; reducing dimensionality therefore amounts to removing irrelevant features. In this paper, a genetic algorithm (GA) is applied for feature selection in an effort to decrease the number of useless features in a collection of high-dimensional email bodies and subjects. A Multi-Layer Perceptron (MLP) is then employed to classify the features selected by the GA. Using the LingSpam benchmark corpus as the dataset, the experimental results showed that the GA feature selector with the MLP classifier not only decreases the data dimensionality but also increases the spam detection rate compared with other classifiers such as SVM and Naïve Bayes.
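A GA-based feature subset selector of the kind described above can be sketched with a bit-mask encoding, one-point crossover, and bit-flip mutation. Everything here is an illustrative assumption: the synthetic data, the nearest-centroid fitness (standing in for the paper's MLP classifier), and all GA hyperparameters.

```python
# Sketch of GA feature selection: each individual is a 0/1 mask over
# features; fitness rewards classification accuracy on the selected
# subset and lightly penalizes subset size.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 20
y = rng.integers(0, 2, n)
X = rng.normal(0.0, 1.0, (n, d))
X[:, :3] += y[:, None] * 2.0          # only first 3 features are informative

def fitness(mask):
    """Nearest-centroid training accuracy on the selected features
    (a simple stand-in for the paper's MLP classifier)."""
    if mask.sum() == 0:
        return 0.0
    Xs = X[:, mask.astype(bool)]
    c0, c1 = Xs[y == 0].mean(0), Xs[y == 1].mean(0)
    pred = (np.linalg.norm(Xs - c1, axis=1)
            < np.linalg.norm(Xs - c0, axis=1)).astype(int)
    return (pred == y).mean() - 0.005 * mask.sum()  # size penalty

pop = rng.integers(0, 2, (30, d))                   # random initial population
for gen in range(40):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]         # truncation selection
    children = []
    for _ in range(len(pop)):
        a, b = parents[rng.integers(0, 10, 2)]
        cut = rng.integers(1, d)
        child = np.concatenate([a[:cut], b[cut:]])  # one-point crossover
        flip = rng.random(d) < 0.02                 # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.array(children)

best = pop[np.argmax([fitness(m) for m in pop])]
print(best.sum(), "features selected")
```

In the paper's setup, the fitness evaluation would train and score an MLP on the candidate subset instead of the nearest-centroid rule used here.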