Search CORE

105 research outputs found

Infinite Feature Selection on Shore-Based Biomarkers Reveals Connectivity Modulation after Stroke

Author: Granziera C.
Menegaz G.
Obertino S.
Roffo G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

Connectomics is gaining increasing interest in the scientific and clinical communities. It consists in deriving models of structural or functional brain connections based on some local measures. Here we focus on structural connectivity as detected by diffusion MRI. Connectivity matrices are derived from microstructural indices obtained by the 3D-SHORE. Typically, graphs are derived from connectivity matrices and used for inferring node properties that allow identifying those nodes that play a prominent role in the network. This information can then be used to detect network modulations induced by diseases. In this paper we take a complementary approach and focus on link as opposed to node properties. We hypothesize that network modulation can be better described by measuring the connectivity alteration directly in the form of modulation of the properties of white matter fiber bundles constituting the network communication backbone. The goal of this paper is to detect the paths that are most altered by the pathology by exploiting a feature selection paradigm. Temporal changes on connection weights are treated as features and those playing a leading role in a patient versus healthy controls classification task are detected by the Infinite Feature Selection (Inf-FS) method. Results show that connection paths with high discriminative power can be identified that are shared by the considered microstructural descriptors allowing a classification accuracy ranging between 83% and 89%

Crossref

Catalogo dei prodotti della ricerca

Enlighten

Infinite feature selection: a graph-based feature filtering approach

Author: Castellani Umberto
Cristani Marco
Melzi Simone
Roffo Giorgio
Vinciarelli Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

We propose a filtering feature selection framework that considers a subset of features as a path in a graph, where a node is a feature and an edge indicates pairwise (customizable) relations among features, dealing with relevance and redundancy principles. By two different interpretations (exploiting properties of power series of matrices and relying on Markov chains fundamentals) we can evaluate the values of paths (i.e., feature subsets) of arbitrary lengths, eventually go to infinite, from which we dub our framework Infinite Feature Selection (Inf-FS). Going to infinite allows to constrain the computational complexity of the selection process, and to rank the features in an elegant way, that is, considering the value of any path (subset) containing a particular feature. We also propose a simple unsupervised strategy to cut the ranking, so providing the subset of features to keep. In the experiments, we analyze diverse setups with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely-known yet effective comparative approaches. The results show that Inf-FS behaves better in almost any situation, that is, when the number of features to keep are fixed a priori, or when the decision of the subset cardinality is part of the process

arXiv.org e-Print Archive

Catalogo dei prodotti della ricerca

Enlighten

Archivio della ricerca- Università di Roma La Sapienza

An Effective Feature Selection Method Based on Pair-Wise Feature Proximity for High Dimensional Low Sample Size Data

Author: bradley
duda
gu
he
hsu
liu
luo
nie
tang
yu
zaffalon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/08/2017
Field of study

Feature selection has been studied widely in the literature. However, the efficacy of the selection criteria for low sample size applications is neglected in most cases. Most of the existing feature selection criteria are based on the sample similarity. However, the distance measures become insignificant for high dimensional low sample size (HDLSS) data. Moreover, the variance of a feature with a few samples is pointless unless it represents the data distribution efficiently. Instead of looking at the samples in groups, we evaluate their efficiency based on pairwise fashion. In our investigation, we noticed that considering a pair of samples at a time and selecting the features that bring them closer or put them far away is a better choice for feature selection. Experimental results on benchmark data sets demonstrate the effectiveness of the proposed method with low sample size, which outperforms many other state-of-the-art feature selection methods.Comment: European Signal Processing Conference 201

arXiv.org e-Print Archive

Crossref

Personality in Computational Advertising: A Benchmark

Author: Roffo Giorgio
Vinciarelli Alessandro
Publication venue
Publication date: 01/01/2016
Field of study

In the last decade, new ways of shopping online have increased the possibility of buying products and services more easily and faster than ever. In this new context, personality is a key determinant in the decision making of the consumer when shopping. A person’s buying choices are influenced by psychological factors like impulsiveness; indeed some consumers may be more susceptible to making impulse purchases than others. Since affective metadata are more closely related to the user’s experience than generic parameters, accurate predictions reveal important aspects of user’s attitudes, social life, including attitude of others and social identity. This work proposes a highly innovative research that uses a personality perspective to determine the unique associations among the consumer’s buying tendency and advert recommendations. In fact, the lack of a publicly available benchmark for computational advertising do not allow both the exploration of this intriguing research direction and the evaluation of recent algorithms. We present the ADS Dataset, a publicly available benchmark consisting of 300 real advertisements (i.e., Rich Media Ads, Image Ads, Text Ads) rated by 120 unacquainted individuals, enriched with Big-Five users’ personality factors and 1,200 personal users’ pictures

Enlighten

Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality

Author: Melzi Simone
Roffo Giorgio
Publication venue
Publication date: 01/01/2017
Field of study

In an era where accumulating data is easy and storing it inexpensive, feature selection plays a central role in helping to reduce the high-dimensionality of huge amounts of otherwise meaningless data. In this paper, we propose a graph-based method for feature selection that ranks features by identifying the most important ones into arbitrary set of cues. Mapping the problem on an affinity graph-where features are the nodes-the solution is given by assessing the importance of nodes through some indicators of centrality, in particular, the Eigen-vector Centrality (EC). The gist of EC is to estimate the importance of a feature as a function of the importance of its neighbors. Ranking central nodes individuates candidate features, which turn out to be effective from a classification point of view, as proved by a thoroughly experimental section. Our approach has been tested on 7 diverse datasets from recent literature (e.g., biological data and object recognition, among others), and compared against filter, embedded and wrappers methods. The results are remarkable in terms of accuracy, stability and low execution time.Comment: Preprint version - Lecture Notes in Computer Science - Springer 201

arXiv.org e-Print Archive

Enlighten

Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach

Author: Castellani Umberto
Melzi Simone
Roffo Giorgio
Vinciarelli Alessandro
Publication venue
Publication date: 01/01/2017
Field of study

Feature selection is playing an increasingly significant role with respect to many computer vision applications spanning from object recognition to visual object tracking. However, most of the recent solutions in feature selection are not robust across different and heterogeneous set of data. In this paper, we address this issue proposing a robust probabilistic latent graph-based feature selection algorithm that performs the ranking step while considering all the possible subsets of features, as paths on a graph, bypassing the combinatorial problem analytically. An appealing characteristic of the approach is that it aims to discover an abstraction behind low-level sensory data, that is, relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired generative process that allows the investigation of the importance of a feature when injected into an arbitrary set of cues. The proposed method has been tested on ten diverse benchmarks, and compared against eleven state of the art feature selection methods. Results show that the proposed approach attains the highest performance levels across many different scenarios and difficulties, thereby confirming its strong robustness while setting a new state of the art in feature selection domain.Comment: Accepted at the IEEE International Conference on Computer Vision (ICCV), 2017, Venice. Preprint cop

arXiv.org e-Print Archive

Crossref

Catalogo dei prodotti della ricerca

Enlighten

Archivio della ricerca- Università di Roma La Sapienza

Dynamic feature selection for clustering high dimensional data streams

Author: Fahy Conor
Yang Shengxiang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/07/2019
Field of study

open access articleChange in a data stream can occur at the concept level and at the feature level. Change at the feature level can occur if new, additional features appear in the stream or if the importance and relevance of a feature changes as the stream progresses. This type of change has not received as much attention as concept-level change. Furthermore, a lot of the methods proposed for clustering streams (density-based, graph-based, and grid-based) rely on some form of distance as a similarity metric and this is problematic in high-dimensional data where the curse of dimensionality renders distance measurements and any concept of “density” difficult. To address these two challenges we propose combining them and framing the problem as a feature selection problem, specifically a dynamic feature selection problem. We propose a dynamic feature mask for clustering high dimensional data streams. Redundant features are masked and clustering is performed along unmasked, relevant features. If a feature's perceived importance changes, the mask is updated accordingly; previously unimportant features are unmasked and features which lose relevance become masked. The proposed method is algorithm-independent and can be used with any of the existing density-based clustering algorithms which typically do not have a mechanism for dealing with feature drift and struggle with high-dimensional data. We evaluate the proposed method on four density-based clustering algorithms across four high-dimensional streams; two text streams and two image streams. In each case, the proposed dynamic feature mask improves clustering performance and reduces the processing time required by the underlying algorithm. Furthermore, change at the feature level can be observed and tracked

De Montfort University Open Research Archive