Search CORE

24 research outputs found

Subspace learning-based graph regularized feature selection

Author: Jiao Licheng
Shang Ronghua
Stolkin Rustam
Wang Wenbing
Publication venue: 'Elsevier BV'
Publication date: 01/11/2016
Field of study

Crossref

University of Birmingham Research Portal

Clustering based feature selection using Partitioning Around Medoids (PAM)

Author: Ismi Dewi Pramudi
Murinto Murinto
Publication venue: 'Universitas Ahmad Dahlan, Kampus 3'
Publication date: 19/05/2020
Field of study

High-dimensional data contains a large number of features. With many features, high dimensional data requires immense computational resources, including space and time. Several studies indicate that not all features of high dimensional data are relevant to classification result. Dimensionality reduction is inevitable and is required due to classifier performance improvement. Several dimensionality reduction techniques were carried out, including feature selection techniques and feature extraction techniques. Sequential forward feature selection and backward feature selection are feature selection using the greedy approach. The heuristics approach is also applied in feature selection, using the Genetic Algorithm, PSO, and Forest Optimization Algorithm. PCA is the most well-known feature extraction method. Besides, other methods such as multidimensional scaling and linear discriminant analysis. In this work, a different approach is applied to perform feature selection. Cluster analysis based feature selection using Partitioning Around Medoids (PAM) clustering is carried out. Our experiment results showed that classification accuracy gained when using feature vectors' medoids to represent the original dataset is high, above 80%

Journal of Education and Learning (EduLearn)

UAD Journal Management System

Unsupervised Feature Selection with Adaptive Structure Learning

Author: Alelyani S.
He X.
Hou C.
Krzanowski W.
Li Z.
Liu J.
Liu X.
Nie F.
Nie F.
Qian M.
Takeuchi I.
Yang Y.
Zhao Z.
Publication venue
Publication date: 02/04/2015
Field of study

The problem of feature selection has raised considerable interests in the past decade. Traditional unsupervised methods select the features which can faithfully preserve the intrinsic structures of data, where the intrinsic structures are estimated using all the input features of data. However, the estimated intrinsic structures are unreliable/inaccurate when the redundant and noisy features are not removed. Therefore, we face a dilemma here: one need the true structures of data to identify the informative features, and one need the informative features to accurately estimate the true structures of data. To address this, we propose a unified learning framework which performs structure learning and feature selection simultaneously. The structures are adaptively learned from the results of feature selection, and the informative features are reselected to preserve the refined structures of data. By leveraging the interactions between these two essential tasks, we are able to capture accurate structures and select more informative features. Experimental results on many benchmark data sets demonstrate that the proposed method outperforms many state of the art unsupervised feature selection methods

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hybrid Multi Attribute Relation Method for Document Clustering for Information Mining

Author: ChandraMohan Dr. B.
Tejasree S.
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2022
Field of study

Text clustering has been widely utilized with the aim of partitioning speci?c documents’ collection into different subsets using homogeneity/heterogeneity criteria. It has also become a very complicated area of research, including pattern recognition, information retrieval, and text mining. In the applications of enterprises, information mining faces challenges due to the complex distribution of data by an enormous number of different sources. Most of these information sources are from different domains which create difficulties in identifying the relationships among the information. In this case, a single method for clustering limits related information, while enhancing computational overheadsand processing times. Hence, identifying suitable clustering models for unsupervised learning is a challenge, specifically in the case of MultipleAttributesin data distributions. In recent works attribute relation based solutions are given significant importance to suggest the document clustering. To enhance further, in this paper, Hybrid Multi Attribute Relation Methods (HMARs) are presented for attribute selections and relation analyses of co-clustering of datasets. The proposed HMARs allowanalysis of distributed attributes in documents in the form of probabilistic attribute relations using modified Bayesian mechanisms. It also provides solutionsfor identifying most related attribute model for the multiple attribute documents clustering accurately. An experimental evaluation is performed to evaluate the clustering purity and normalization of the information utilizing UCI Data repository which shows 25% better when compared with the previous techniques

International Journal of Communication Networks and Information Security (IJCNIS)