AutoEncoder Inspired Unsupervised Feature Selection
High-dimensional data in areas such as computer vision and machine
learning bring computational and analytical difficulties. Feature
selection, which selects a subset of the observed features, is a widely used
approach for improving the performance and effectiveness of machine learning models
on high-dimensional data. In this paper, we propose a novel AutoEncoder
Feature Selector (AEFS) for unsupervised feature selection which combines
autoencoder regression and group lasso tasks. Compared to traditional feature
selection methods, AEFS can select the most important features by excavating
both linear and nonlinear information among features, which is more flexible
than the conventional self-representation method for unsupervised feature
selection with only linear assumptions. Experimental results on benchmark
dataset show that the proposed method is superior to the state-of-the-art
method. Comment: accepted by ICASSP 201
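The combination of autoencoder reconstruction and a group lasso penalty described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single hidden layer, the `tanh` activation, the placement of the penalty on the rows of the encoder weights (one group per input feature), and the names `l21_norm`, `aefs_style_objective`, `rank_features`, and `alpha` are all assumptions made for illustration.

```python
import numpy as np

def l21_norm(W):
    """Group lasso penalty: sum of the Euclidean norms of the rows of W."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())

def aefs_style_objective(X, W1, b1, W2, b2, alpha=0.1):
    """Autoencoder reconstruction error plus a group lasso term (assumed form).

    X  : (n_samples, n_features) data matrix
    W1 : (n_features, n_hidden) encoder weights; row j belongs to input feature j
    W2 : (n_hidden, n_features) decoder weights
    """
    H = np.tanh(X @ W1 + b1)      # nonlinear encoding of the inputs
    X_hat = H @ W2 + b2           # linear decoding back to feature space
    recon = 0.5 * np.mean(np.sum((X - X_hat) ** 2, axis=1))
    return recon + alpha * l21_norm(W1)

def rank_features(W1):
    """Order features from most to least important by encoder row norm."""
    scores = np.linalg.norm(W1, axis=1)
    return np.argsort(-scores), scores
```

Under these assumptions, a feature whose encoder row is driven to zero by the penalty no longer influences the reconstruction, so the row norms serve as importance scores after training.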
Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision
Feature selection is essential for effective visual recognition. We propose
an efficient joint classifier learning and feature selection method that
discovers sparse, compact representations of input features from a vast sea of
candidates, with an almost unsupervised formulation. Our method requires only
the following knowledge, which we call the feature sign: whether or not
a particular feature has on average stronger values over positive samples than
over negatives. We show how this can be estimated using as few as a single
labeled training sample per class. Then, using these feature signs, we extend
an initial supervised learning problem into an (almost) unsupervised clustering
formulation that can incorporate new data without requiring ground truth
labels. Our method works both as a feature selection mechanism and as a fully
competitive classifier. It has important properties: low computational cost and
excellent accuracy, especially in difficult cases of very limited training
data. We experiment on large-scale recognition in video and show superior speed
and performance to established feature selection approaches such as AdaBoost,
Lasso, greedy forward-backward selection, and powerful classifiers such as SVM. Comment: arXiv admin note: text overlap with arXiv:1411.771
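The feature sign defined in this abstract can be estimated directly from a handful of labeled samples. The sketch below only illustrates that definition and is not the paper's learning algorithm; the function name `estimate_feature_signs` and the +1/-1 encoding are assumptions.

```python
import numpy as np

def estimate_feature_signs(X, y):
    """Return +1 where a feature is on average stronger over positives, else -1.

    X : (n_samples, n_features) feature matrix
    y : (n_samples,) labels, truthy for positive samples and falsy for negatives
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).astype(bool)
    pos_mean = X[y].mean(axis=0)     # average feature value over positives
    neg_mean = X[~y].mean(axis=0)    # average feature value over negatives
    return np.where(pos_mean > neg_mean, 1, -1)

# Usage with a single labeled sample per class, as the abstract suggests is possible.
signs = estimate_feature_signs([[0.9, 0.1], [0.2, 0.8]], [1, 0])
```

With so few labels the estimate is noisy for features whose class means are close, which is why only the sign, not the magnitude, is taken from the labeled data here.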
Heterogeneous feature space based task selection machine for unsupervised transfer learning
© 2015 IEEE. Transfer learning techniques try to transfer knowledge from previous tasks to a new target task with either fewer training data or less training than traditional machine learning techniques. Since transfer learning cares more about the relatedness between tasks and their domains, it is useful for handling massive unlabeled data and for overcoming distribution and feature space gaps. In this paper, we propose a new task selection algorithm for unsupervised transfer learning, called the Task Selection Machine (TSM). It addresses a key technical problem, namely feature mapping for heterogeneous feature spaces. An extended feature method is applied to the feature mapping algorithm. The TSM training algorithm, which is the main contribution of this paper, relies on this feature mapping. The proposed TSM thus meets the requirements of unsupervised transfer learning and, conversely, solves unsupervised multi-task transfer learning issues.
An Unsupervised Based Stochastic Parallel Gradient Descent For Fcm Learning Algorithm With Feature Selection For Big Data
Large datasets consist of millions of observations and hundreds or thousands of features, which easily brings their size to the terabyte level. Selecting among these hundreds of features for computer visualization and medical imaging applications is addressed by data mining learning algorithms such as clustering, classification, and feature selection methods. Among these data mining algorithms, clustering methods efficiently group similar features, with dissimilar features grouped into a separate cluster. This paper presents a novel unsupervised cluster learning method for feature selection on big dataset samples. The proposed method removes irrelevant and unimportant features through the FCM objective function. The performance of the proposed unsupervised FCM learning algorithm depends strongly on the initial centroid values and the fuzzification parameter (m). Therefore, the selection of the initial cluster centroids is very important for improving feature selection results on big dataset samples. To this end, we propose a novel Stochastic Parallel Gradient Descent (SPGD) method that automatically selects the initial cluster centroids for FCM, speeding up the grouping of similar features and improving cluster quality. The proposed clustering method is therefore named SPFCM clustering; its fuzzification parameter (m) is optimized using a Hybrid Particle Swarm with Genetic (HPSG) algorithm. The algorithm selects features by computing the distance between two feature samples via kernel learning, in an unsupervised manner, for big dataset samples, and is especially easy to apply.
The proposed SPFCM and existing clustering methods are evaluated on larger dataset samples from the UCI machine learning repository. The results show that the proposed SPFCM clustering method produces better feature selection results than existing feature selection clustering algorithms, while being computationally very efficient.
DOI: 10.17762/ijritcc2321-8169.15072
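The dependence of FCM on the initial centroids and on the fuzzification parameter (m) is easier to see from the standard fuzzy c-means updates. The sketch below implements only textbook FCM with squared Euclidean distance; it does not implement the SPGD initialization, the HPSG optimization of m, or the kernel distance mentioned in the abstract. The function names and the `eps` guard are assumptions.

```python
import numpy as np

def fcm_memberships(X, centroids, m=2.0, eps=1e-12):
    """Membership of each sample in each cluster for fuzzifier m > 1."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)   # each row sums to 1

def fcm_centroids(X, U, m=2.0):
    """Centroids as membership-weighted means of the samples."""
    Um = U ** m
    return (Um.T @ X) / Um.sum(axis=0)[:, None]

def fcm_objective(X, U, centroids, m=2.0):
    """FCM objective: sum over samples and clusters of u^m times squared distance."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(((U ** m) * d2).sum())

def fcm(X, init_centroids, m=2.0, n_iter=50):
    """Alternate membership and centroid updates from the given initial centroids."""
    centroids = np.asarray(init_centroids, dtype=float)
    for _ in range(n_iter):
        U = fcm_memberships(X, centroids, m)
        centroids = fcm_centroids(X, U, m)
    return fcm_memberships(X, centroids, m), centroids
```

Because the alternating updates only reach a local minimum of the objective, different `init_centroids` or values of `m` can lead to different clusterings, which is the sensitivity the abstract sets out to address.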
The effect of noise and sample size on an unsupervised feature selection method for manifold learning
The research on unsupervised feature selection is scarce in comparison to that for supervised models, despite the fact that this is an important issue for many clustering problems. An unsupervised feature selection method for general Finite Mixture Models was recently proposed and subsequently extended to Generative Topographic Mapping (GTM), a manifold learning constrained mixture model that provides data visualization. Some of the results of a previous partial assessment of this unsupervised feature selection method for GTM suggested that its performance may be affected by insufficient sample size and by noisy data. In this brief study, we test in some detail such limitations of the method. Postprint (published version
Unsupervised feature selection for noisy data
Feature selection techniques are widely applied in a variety of data analysis tasks in order to reduce dimensionality. According to the type of learning, feature selection algorithms are categorized as supervised or unsupervised. In unsupervised learning scenarios, selecting features is a much harder problem, due to the lack of class labels that would facilitate the search for relevant features. The difficulty of selecting features is amplified when the data are corrupted by different kinds of noise. Almost all traditional unsupervised feature selection methods are not robust against noise in the samples. These approaches have no explicit mechanism for detaching and isolating the noise, and thus they cannot produce an optimal feature subset. In this article, we propose an unsupervised approach for feature selection on noisy data, called Robust Independent Feature Selection (RIFS). Specifically, we choose the feature subset that contains most of the underlying information, using the same criteria as Independent Component Analysis (ICA). Simultaneously, the noise is separated as an independent component. The isolation of representative noise samples is achieved using factor oblique rotation, whereas noise identification is performed using factor pattern loadings. Extensive experimental results over diverse real-life data sets have shown the efficiency and advantage of the proposed algorithm. We thankfully acknowledge the support of the Comisión Interministerial de Ciencia y Tecnología (CICYT) under contract No. TIN2015-65316-P, which has partially funded this work. Peer Reviewed. Postprint (author's final draft