Search CORE

1,068 research outputs found

A Matlab Toolbox for Feature Importance Ranking

Author: Liang Xiaokun
Qin Wenjian
Wu Junjie
Xie Yaoqin
Yu Shaode
Zhang Erlei
Zhang Zhicheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/03/2020
Field of study

More attention is being paid for feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a matlab toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. To each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. At last, the effectiveness of FIR is analyzed according to performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated

arXiv.org e-Print Archive

Crossref

Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

Author: Melzi Simone
Roffo Giorgio
Publication venue
Publication date: 01/01/2017
Field of study

In an era where accumulating data is easy and storing it inexpensive, feature selection plays a central role in helping to reduce the high-dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones into arbitrary set of cues. Mapping the problem on an affinity graph-where features are the nodes-the solution is given by assessing the importance of nodes through some indicators of centrality, in particular, the Eigen-vector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes individuates candidate features, which turn out to be effective from a classification point of view, as proved by a thoroughly experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others), and compared against filter, embedded and wrappers methods. The results are remarkable in terms of accuracy, stability and low execution time.Comment: Preprint version - Lecture Notes in Computer Science - Springer 201

arXiv.org e-Print Archive

Enlighten

Infinite feature selection: a graph-based feature filtering approach

Author: Castellani Umberto
Cristani Marco
Melzi Simone
Roffo Giorgio
Vinciarelli Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

We propose a filtering feature selection framework that considers a subset of features as a path in a graph, where a node is a feature and an edge indicates pairwise (customizable) relations among features, dealing with relevance and redundancy principles. By two different interpretations (exploiting properties of power series of matrices and relying on Markov chains fundamentals) we can evaluate the values of paths (i.e., feature subsets) of arbitrary lengths, eventually go to infinite, from which we dub our framework Infinite Feature Selection (Inf-FS). Going to infinite allows to constrain the computational complexity of the selection process, and to rank the features in an elegant way, that is, considering the value of any path (subset) containing a particular feature. We also propose a simple unsupervised strategy to cut the ranking, so providing the subset of features to keep. In the experiments, we analyze diverse setups with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely-known yet effective comparative approaches. The results show that Inf-FS behaves better in almost any situation, that is, when the number of features to keep are fixed a priori, or when the decision of the subset cardinality is part of the process

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Enlighten

Archivio della ricerca- Università di Roma La Sapienza