Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has attracted considerable interest in the
past decade. Traditional unsupervised methods select features that faithfully
preserve the intrinsic structures of data, where those structures are estimated
using all input features. However, the estimated structures are unreliable and
inaccurate when redundant and noisy features have not been removed. We
therefore face a dilemma: one needs the true structures of the data to identify
the informative features, and one needs the informative features to accurately
estimate the true structures. To address this, we propose a unified learning
framework that performs structure learning and feature selection
simultaneously. The structures are adaptively learned from the results of
feature selection, and the informative features are reselected to preserve the
refined structures. By leveraging the interaction between these two essential
tasks, we are able to capture accurate structures and select more informative
features. Experimental results on many benchmark data sets demonstrate that the
proposed method outperforms many state-of-the-art unsupervised feature
selection methods.
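The alternation the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it uses a Gaussian similarity graph and the classic Laplacian score as stand-ins for their structure-learning and structure-preservation objectives, and the function names are illustrative.

```python
import numpy as np

def laplacian_scores(X, W):
    """Laplacian score per feature: lower = better preserves the graph structure."""
    d = W.sum(axis=1)
    D = np.diag(d)
    L = D - W
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        f = X[:, j]
        f = f - (f @ d) / d.sum()          # center w.r.t. degree weights
        denom = f @ D @ f
        scores[j] = (f @ L @ f) / denom if denom > 1e-12 else np.inf
    return scores

def adaptive_select(X, k, n_iter=5):
    """Alternate: build a similarity graph from the currently selected
    features, then reselect the k features that best preserve that graph."""
    selected = np.arange(X.shape[1])       # start from all features
    for _ in range(n_iter):
        Xs = X[:, selected]
        # Gaussian similarity graph on the current feature subset
        sq = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(-1)
        W = np.exp(-sq / (sq.mean() + 1e-12))
        scores = laplacian_scores(X, W)
        selected = np.argsort(scores)[:k]  # refine the selection
    return selected
```

Each pass re-estimates the graph from the cleaner feature subset, so the two tasks bootstrap one another, which is the dilemma-breaking idea of the paper.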
DSL: Discriminative Subgraph Learning via Sparse Self-Representation
The goal in network state prediction (NSP) is to classify the global state
(label) associated with features embedded in a graph. This graph structure
encoding feature relationships is the key distinctive aspect of NSP compared to
classical supervised learning. NSP arises in various applications: gene
expression samples embedded in a protein-protein interaction (PPI) network,
temporal snapshots of infrastructure or sensor networks, and fMRI coherence
network samples from multiple subjects, to name a few. Instances from these
domains are typically "wide" (more features than samples), so feature
sub-selection is required for robust and generalizable prediction. How best to
employ the network structure to learn succinct connected subgraphs
encompassing the most discriminative features becomes the central challenge in
NSP. Prior work employs connected subgraph sampling or graph smoothing within
optimization frameworks, resulting in either high variance in subgraph quality
or weak control over the connectivity of the selected subgraphs.
In this work we propose an optimization framework for discriminative subgraph
learning (DSL) which simultaneously enforces (i) sparsity, (ii) connectivity
and (iii) high discriminative power of the resulting subgraphs of features. Our
optimization algorithm is a single-step solution for the NSP and the associated
feature selection problem. It is rooted in the rich literature on
maximal-margin optimization, spectral graph methods and sparse subspace
self-representation. DSL simultaneously ensures solution interpretability and
superior predictive power (up to 16% improvement over baselines on challenging
instances), with execution times of up to an hour for large instances.
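Of the three ingredients listed, the sparse self-representation component is the easiest to illustrate in isolation. The sketch below is only that one piece (the paper's full objective additionally enforces discriminative power and subgraph connectivity): it learns a sparse matrix W with X ≈ XW via plain ISTA, and scores each feature by the norm of its row of W, i.e. by how much it helps reconstruct the other features. All names and parameter values are illustrative.

```python
import numpy as np

def sparse_self_representation(X, lam=0.1, lr=1e-3, n_iter=500):
    """Learn W with X ~= X @ W under an l1 penalty via ISTA.
    Rows of W with large norms mark features that reconstruct the others."""
    n, d = X.shape
    W = np.zeros((d, d))
    for _ in range(n_iter):
        grad = X.T @ (X @ W - X)           # gradient of 0.5 * ||X - XW||_F^2
        W = W - lr * grad                  # gradient step
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)  # soft-threshold
        np.fill_diagonal(W, 0.0)           # forbid trivial self-reconstruction
    return np.linalg.norm(W, axis=1)       # per-feature importance scores
```

A duplicated (perfectly redundant) feature gets a large score here because it reconstructs its twin exactly; DSL's extra terms then bias such scores toward connected, class-discriminative subgraphs rather than raw redundancy.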
A Matlab Toolbox for Feature Importance Ranking
Increasing attention is being paid to feature importance ranking (FIR),
particularly when thousands of features can be extracted for intelligent
diagnosis and personalized medicine. A large number of FIR approaches have
been proposed, but few are integrated for comparison and real-life
applications. In this study, a MATLAB toolbox is presented in which a total of
30 algorithms are collected. The toolbox is evaluated on a database of 163
ultrasound images. For each breast mass lesion, 15 features are extracted. To
identify the optimal feature subset for classification, all combinations of
features are tested, and a linear support vector machine is used for
malignancy prediction of the lesions annotated in the ultrasound images.
Finally, the effectiveness of FIR is analyzed through performance comparison.
The toolbox is available online (https://github.com/NicoYuCN/matFIR). In
future work, more FIR methods, feature selection methods, and machine learning
classifiers will be integrated.
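The toolbox itself is MATLAB, but the core idea of FIR comparison is language-agnostic: score every feature under several criteria and compare the resulting rankings. Below is a minimal Python analogue with two classic criteria, the Fisher score and absolute Pearson correlation with the label; these are standard formulas, though whether they are among the 30 collected algorithms is not stated in the abstract.

```python
import numpy as np

def fisher_score(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mean_all) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / (den + 1e-12)

def abs_correlation(X, y):
    """Absolute Pearson correlation of each feature with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12)

def rank(scores):
    """Feature indices ordered best-first by score."""
    return np.argsort(-scores)
```

Comparing `rank(fisher_score(X, y))` against `rank(abs_correlation(X, y))` is the kind of side-by-side evaluation the toolbox automates across its 30 algorithms.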
Unsupervised Feature Selection Algorithm via Local Structure Learning and Kernel Function
To reduce the dimensionality of high-dimensional data, a series of feature selection algorithms have been proposed. These algorithms have two main shortcomings: (1) they do not fully consider the nonlinear relationships between data features, and (2) they do not consider the similarity between data features. To solve these two problems, we propose an unsupervised feature selection algorithm based on local structure learning and kernel functions. First, through a kernel function, we map each feature of the data into a kernel space, so that the nonlinear relationships among features can be fully exploited. Second, we apply local structure learning to the features of the data, so that the similarity between features is taken into account. We then add a low-rank constraint to capture the global information of the data. Finally, we add sparse learning to perform the feature selection. Experimental results show that the proposed algorithm outperforms the comparison methods.
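The first two steps of this pipeline can be sketched compactly. This is an illustrative simplification, not the paper's optimization problem: an RBF kernel between feature columns stands in for the kernel mapping of features (capturing nonlinear feature-feature similarity), and a greedy representative-but-non-redundant selection stands in for the local structure learning, low-rank, and sparsity terms. All names and the scoring rule are assumptions of this sketch.

```python
import numpy as np

def feature_kernel(X, gamma=None):
    """RBF kernel between feature columns, measuring nonlinear feature similarity."""
    F = X.T                                 # one row per feature
    sq = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    if gamma is None:
        gamma = 1.0 / (sq.mean() + 1e-12)   # median-free heuristic bandwidth
    return np.exp(-gamma * sq)

def select_diverse(K, k):
    """Greedily pick k features that are similar to many others
    (representative) but dissimilar to those already chosen (non-redundant)."""
    relevance = K.mean(axis=1)
    chosen = [int(np.argmax(relevance))]
    while len(chosen) < k:
        redundancy = K[:, chosen].max(axis=1)
        score = relevance - redundancy
        score[chosen] = -np.inf             # never re-pick a chosen feature
        chosen.append(int(np.argmax(score)))
    return chosen
```

Because similarity is computed in the kernel space, nonlinearly related features are recognized as redundant even when their linear correlation is low, which is the motivation given in the abstract.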