61,958 research outputs found
Joint Multi-view Unsupervised Feature Selection and Graph Learning
Despite the recent progress, the existing multi-view unsupervised feature
selection methods mostly suffer from two limitations. First, they generally
utilize either cluster structure or similarity structure to guide the feature
selection, neglecting the possibility of a joint formulation with mutual
benefits. Second, they often learn the similarity structure by either global
structure learning or local structure learning, lacking the capability of graph
learning with both global and local structural awareness. In light of this,
this paper presents a joint multi-view unsupervised feature selection and graph
learning (JMVFG) approach. Particularly, we formulate the multi-view feature
selection with orthogonal decomposition, where each target matrix is decomposed
into a view-specific basis matrix and a view-consistent cluster indicator.
Cross-space locality preservation is incorporated to bridge the cluster
structure learning in the projected space and the similarity learning (i.e.,
graph learning) in the original space. Further, a unified objective function is
presented to enable the simultaneous learning of the cluster structure, the
global and local similarity structures, and the multi-view consistency and
inconsistency, upon which an alternating optimization algorithm is developed
with theoretically proved convergence. Extensive experiments demonstrate the
superiority of our approach for both multi-view feature selection and graph
learning tasks
Online Unsupervised Multi-view Feature Selection
In the era of big data, it is becoming common to have data with multiple
modalities or coming from multiple sources, known as "multi-view data".
Multi-view data are usually unlabeled and come from high-dimensional spaces
(such as language vocabularies), unsupervised multi-view feature selection is
crucial to many applications. However, it is nontrivial due to the following
challenges. First, there are too many instances or the feature dimensionality
is too large. Thus, the data may not fit in memory. How to select useful
features with limited memory space? Second, how to select features from
streaming data and handles the concept drift? Third, how to leverage the
consistent and complementary information from different views to improve the
feature selection in the situation when the data are too big or come in as
streams? To the best of our knowledge, none of the previous works can solve all
the challenges simultaneously. In this paper, we propose an Online unsupervised
Multi-View Feature Selection, OMVFS, which deals with large-scale/streaming
multi-view data in an online fashion. OMVFS embeds unsupervised feature
selection into a clustering algorithm via NMF with sparse learning. It further
incorporates the graph regularization to preserve the local structure
information and help select discriminative features. Instead of storing all the
historical data, OMVFS processes the multi-view data chunk by chunk and
aggregates all the necessary information into several small matrices. By using
the buffering technique, the proposed OMVFS can reduce the computational and
storage cost while taking advantage of the structure information. Furthermore,
OMVFS can capture the concept drifts in the data streams. Extensive experiments
on four real-world datasets show the effectiveness and efficiency of the
proposed OMVFS method. More importantly, OMVFS is about 100 times faster than
the off-line methods
Unsupervised Feature Selection Via Orthogonal Basis Clustering and Local Structure Preserving
Due to the "curse of dimensionality" issue, how to discard redundant features and select informative features in high-dimensional data has become a critical problem, hence there are many research studies dedicated to solving this problem. Unsupervised feature selection technique, which does not require any prior category information to conduct with, has gained a prominent place in preprocessing high-dimensional data among all feature selection techniques, and it has been applied to many neural networks and learning systems related applications, e.g., pattern classification. In this article, we propose an efficient method for unsupervised feature selection via orthogonal basis clustering and reliable local structure preserving, which is referred to as OCLSP briefly. Our OCLSP method consists of an orthogonal basis clustering together with an adaptive graph regularization, which realizes the functionality of simultaneously achieving excellent cluster separation and preserving the local information of data. Besides, we exploit an efficient alternative optimization algorithm to solve the challenging optimization problem of our proposed OCLSP method, and we perform a theoretical analysis of its computational complexity and convergence. Eventually, we conduct comprehensive experiments on nine real-world datasets to test the validity of our proposed OCLSP method, and the experimental results demonstrate that our proposed OCLSP method outperforms many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy and normalized mutual information, which indicates that our proposed OCLSP method has a strong ability in identifying more important features
Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has raised considerable interests in the
past decade. Traditional unsupervised methods select the features which can
faithfully preserve the intrinsic structures of data, where the intrinsic
structures are estimated using all the input features of data. However, the
estimated intrinsic structures are unreliable/inaccurate when the redundant and
noisy features are not removed. Therefore, we face a dilemma here: one need the
true structures of data to identify the informative features, and one need the
informative features to accurately estimate the true structures of data. To
address this, we propose a unified learning framework which performs structure
learning and feature selection simultaneously. The structures are adaptively
learned from the results of feature selection, and the informative features are
reselected to preserve the refined structures of data. By leveraging the
interactions between these two essential tasks, we are able to capture accurate
structures and select more informative features. Experimental results on many
benchmark data sets demonstrate that the proposed method outperforms many state
of the art unsupervised feature selection methods
- …