Modular feature selection using relative importance factors
Feature selection plays an important role in finding relevant or irrelevant features in classification. Genetic algorithms (GAs) have been used as conventional methods for classifiers to adaptively evolve solutions for classification problems. In this paper, we explore the use of feature selection in modular GA-based classification. We propose a new feature selection technique, Relative Importance Factor (RIF), to find irrelevant features in the feature space of each module. By removing these features, we aim to improve classification accuracy and reduce the dimensionality of classification problems. Benchmark classification data sets are used to evaluate the proposed approaches. The experimental results show that RIF can be used to determine irrelevant features and helps achieve higher classification accuracy with the feature-space dimension reduced. The complexity of the resulting rule sets is also reduced, which means that modular classifiers with irrelevant features removed can classify data with a higher throughput.
Feature selection for modular GA-based classification
Genetic algorithms (GAs) have been used as conventional methods for classifiers to adaptively evolve solutions for classification problems. Feature selection plays an important role in finding relevant features in classification. In this paper, feature selection is explored with modular GA-based classification. A new feature selection technique, Relative Importance Factor (RIF), is proposed to find less relevant features in the input domain of each class module. By removing these features, the aim is to reduce the classification error and the dimensionality of classification problems. Benchmark classification data sets are used to evaluate the proposed approach. The experimental results show that RIF can be used to find less relevant features and helps achieve lower classification error with the feature-space dimension reduced.
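The abstracts above do not give RIF's formula, but the general idea they describe, scoring each feature's importance relative to the average across features in a module and pruning those that fall below a threshold, can be sketched as follows. The importance proxy here (absolute correlation with the class label) and the threshold are illustrative assumptions, not the paper's definition, which is computed per GA module.

```python
import numpy as np

def relative_importance_factors(X, y):
    """Score each feature by |correlation with the class label|, then
    normalise by the mean score across features. This is an illustrative
    proxy -- the paper's RIF is defined per module of the GA classifier."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return scores / scores.mean()

def prune_features(X, y, threshold=0.5):
    """Keep only features whose relative importance factor meets the
    (assumed) threshold, reducing the feature-space dimension."""
    rif = relative_importance_factors(X, y)
    keep = np.where(rif >= threshold)[0]
    return X[:, keep], keep

# Toy data: feature 0 tracks the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),   # relevant
                     rng.normal(size=200)])            # irrelevant
X_pruned, kept = prune_features(X, y)
```

The irrelevant noise feature scores far below the module average and is dropped, leaving a lower-dimensional input for the classifier.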
The Five Factor Model of personality and evaluation of drug consumption risk
The problem of evaluating an individual's risk of drug consumption and misuse
is highly important. An online survey methodology was employed to collect data
including Big Five personality traits (NEO-FFI-R), impulsivity (BIS-11),
sensation seeking (ImpSS), and demographic information. The data set contained
information on the consumption of 18 central nervous system psychoactive drugs.
Correlation analysis demonstrated the existence of groups of drugs with strongly correlated consumption patterns. Three correlation pleiades were identified, named after the central drug in each pleiad: the ecstasy, heroin, and benzodiazepines pleiades. An exhaustive search was performed to select the most effective subset of input features and data mining methods to classify users and non-users for each drug and pleiad. A number of classification methods were employed (decision tree, random forest, k-nearest neighbors, linear discriminant analysis, Gaussian mixture, probability density function estimation, logistic regression, and naïve Bayes) and the most effective classifier was selected for each drug. The quality of classification was surprisingly high, with sensitivity and specificity (evaluated by leave-one-out cross-validation) greater than 70% for almost all classification tasks. The best results, with sensitivity and specificity greater than 75%, were achieved for cannabis, crack, ecstasy, legal highs, LSD, and volatile substance abuse (VSA).
Comment: Significantly extended report with 67 pages, 27 tables, 21 figures
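The evaluation protocol this abstract describes, leave-one-out cross-validation scored by sensitivity and specificity, can be sketched as below. The 1-nearest-neighbour classifier and the synthetic data are stand-ins for the study's methods and survey data, not reproductions of them.

```python
import numpy as np

def loo_sensitivity_specificity(X, y, classify):
    """Leave-one-out CV: hold out each sample in turn, fit on the rest,
    and tally true/false positives and negatives over all folds."""
    tp = tn = fp = fn = 0
    n = len(y)
    for i in range(n):
        mask = np.arange(n) != i
        pred = classify(X[mask], y[mask], X[i])
        if y[i] == 1:
            tp += pred == 1
            fn += pred == 0
        else:
            tn += pred == 0
            fp += pred == 1
    return tp / (tp + fn), tn / (tn + fp)

def one_nn(X_train, y_train, x):
    """1-nearest-neighbour classifier (a stand-in for the study's
    decision tree, random forest, k-NN, etc.)."""
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(d)]

# Two well-separated clusters: users (1) vs. non-users (0).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(3, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
sens, spec = loo_sensitivity_specificity(X, y, one_nn)
```

With n samples, leave-one-out runs n train/test folds, which is why it is feasible here only because the survey data set is modest in size.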
Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation
The rapid proliferation of new users and items on the social web has
aggravated the gray-sheep user/long-tail item challenge in recommender systems.
Historically, cross-domain co-clustering methods have successfully leveraged
shared users and items across dense and sparse domains to improve inference
quality. However, they rely on shared rating data and cannot scale to multiple
sparse target domains (i.e., the one-to-many transfer setting). This, combined
with the increasing adoption of neural recommender architectures, motivates us
to develop scalable neural layer-transfer approaches for cross-domain learning.
Our key intuition is to guide neural collaborative filtering with
domain-invariant components shared across the dense and sparse domains,
improving the user and item representations learned in the sparse domains. We
leverage contextual invariances across domains to develop these shared modules,
and demonstrate that with user-item interaction context, we can learn-to-learn
informative representation spaces even with sparse interaction data. We show
the effectiveness and scalability of our approach on two public datasets and a
massive transaction dataset from Visa, a global payments technology company
(19% Item Recall, 3x faster vs. training separate models for each domain). Our approach is applicable to both implicit and explicit feedback settings.
Comment: SIGIR 202
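The layer-transfer idea described above, a domain-invariant context module trained on the dense domain and reused frozen across sparse target domains while each domain keeps its own user/item embeddings, can be sketched structurally as follows. All names, shapes, and the additive scoring rule are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

class ContextModule:
    """Domain-invariant component: maps interaction-context features to a
    shared representation. Fit on the dense domain, then reused as-is."""
    def __init__(self, dim_in, dim_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (dim_in, dim_out))

    def __call__(self, context):
        return np.tanh(context @ self.W)

class DomainRecommender:
    """Domain-specific component: user and item embeddings, combined with
    the transferred context representation when scoring a pair."""
    def __init__(self, n_users, n_items, dim, shared, seed=0):
        rng = np.random.default_rng(seed)
        self.user = rng.normal(0, 0.1, (n_users, dim))
        self.item = rng.normal(0, 0.1, (n_items, dim))
        self.shared = shared  # frozen module, shared across domains

    def score(self, u, i, context):
        # Illustrative scoring: context-adjusted user vector dotted with item.
        return float((self.user[u] + self.shared(context)) @ self.item[i])

# One shared context module serves a dense domain and a sparse one:
# only the small per-domain embedding tables are trained in the target.
shared = ContextModule(dim_in=4, dim_out=8)
dense = DomainRecommender(1000, 500, 8, shared)
sparse = DomainRecommender(50, 20, 8, shared)
```

This is the one-to-many setting in miniature: adding another sparse target domain costs only a new embedding table, since the context module is shared rather than retrained.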