Modular feature selection using relative importance factors
Feature selection plays an important role in finding relevant or irrelevant features in classification. Genetic algorithms (GAs) have been used as conventional methods for classifiers to adaptively evolve solutions for classification problems. In this paper, we explore the use of feature selection in modular GA-based classification. We propose a new feature selection technique, Relative Importance Factor (RIF), to find irrelevant features in the feature space of each module. By removing these features, we aim to improve classification accuracy and reduce the dimensionality of classification problems. Benchmark classification data sets are used to evaluate the proposed approaches. The experimental results show that RIF can be used to determine irrelevant features and helps achieve higher classification accuracy with the feature-space dimension reduced. The complexity of the resulting rule sets is also reduced, which means that modular classifiers with irrelevant features removed can classify data with a higher throughput.
Feature selection for modular GA-based classification
Genetic algorithms (GAs) have been used as conventional methods for classifiers to adaptively evolve solutions for classification problems. Feature selection plays an important role in finding relevant features in classification. In this paper, feature selection is explored with modular GA-based classification. A new feature selection technique, Relative Importance Factor (RIF), is proposed to find less relevant features in the input domain of each class module. By removing these features, the aim is to reduce the classification error and the dimensionality of classification problems. Benchmark classification data sets are used to evaluate the proposed approach. The experimental results show that RIF can be used to find less relevant features and helps achieve lower classification error with the feature-space dimension reduced.
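The abstracts above do not give RIF's formula, but the general idea they describe, scoring each feature's importance relative to the average across features in a module and pruning those that fall below a threshold, can be sketched as follows. The importance proxy here (absolute correlation with the class label) and the threshold are illustrative assumptions, not the paper's definition, which is computed per GA module.

```python
import numpy as np

def relative_importance_factors(X, y):
    """Score each feature by |correlation with the class label|, then
    normalise by the mean score across features. This is an illustrative
    proxy -- the paper's RIF is defined per module of the GA classifier."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                       for j in range(X.shape[1])])
    return scores / scores.mean()

def prune_features(X, y, threshold=0.5):
    """Keep only features whose relative importance factor meets the
    (assumed) threshold, reducing the feature-space dimension."""
    rif = relative_importance_factors(X, y)
    keep = np.where(rif >= threshold)[0]
    return X[:, keep], keep

# Toy data: feature 0 tracks the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),   # relevant
                     rng.normal(size=200)])            # irrelevant
X_pruned, kept = prune_features(X, y)
```

The irrelevant noise feature scores far below the module average and is dropped, leaving a lower-dimensional input for the classifier.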
The Five Factor Model of personality and evaluation of drug consumption risk
The problem of evaluating an individual's risk of drug consumption and misuse
is highly important. An online survey methodology was employed to collect data
including Big Five personality traits (NEO-FFI-R), impulsivity (BIS-11),
sensation seeking (ImpSS), and demographic information. The data set contained
information on the consumption of 18 central nervous system psychoactive drugs.
Correlation analysis demonstrated the existence of groups of drugs with strongly correlated consumption patterns. Three correlation pleiades were identified, named after the central drug in each pleiad: the ecstasy, heroin, and benzodiazepines pleiades. An exhaustive search was performed to select the most effective subset of input features and data mining methods to classify users and non-users for each drug and pleiad. A number of classification methods were employed (decision tree, random forest, k-nearest neighbors, linear discriminant analysis, Gaussian mixture, probability density function estimation, logistic regression, and naïve Bayes) and the most effective classifier was selected for each drug. The quality of classification was surprisingly high, with sensitivity and specificity (evaluated by leave-one-out cross-validation) greater than 70% for almost all classification tasks. The best results, with sensitivity and specificity greater than 75%, were achieved for cannabis, crack, ecstasy, legal highs, LSD, and volatile substance abuse (VSA).
Comment: Significantly extended report with 67 pages, 27 tables, 21 figures
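The evaluation protocol this abstract describes, leave-one-out cross-validation scored by sensitivity and specificity, can be sketched as below. The 1-nearest-neighbour classifier and the synthetic data are stand-ins for the study's methods and survey data, not reproductions of them.

```python
import numpy as np

def loo_sensitivity_specificity(X, y, classify):
    """Leave-one-out CV: hold out each sample in turn, fit on the rest,
    and tally true/false positives and negatives over all folds."""
    tp = tn = fp = fn = 0
    n = len(y)
    for i in range(n):
        mask = np.arange(n) != i
        pred = classify(X[mask], y[mask], X[i])
        if y[i] == 1:
            tp += pred == 1
            fn += pred == 0
        else:
            tn += pred == 0
            fp += pred == 1
    return tp / (tp + fn), tn / (tn + fp)

def one_nn(X_train, y_train, x):
    """1-nearest-neighbour classifier (a stand-in for the study's
    decision tree, random forest, k-NN, etc.)."""
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(d)]

# Two well-separated clusters: users (1) vs. non-users (0).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(3, 0.5, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
sens, spec = loo_sensitivity_specificity(X, y, one_nn)
```

With n samples, leave-one-out runs n train/test folds, which is why it is feasible here only because the survey data set is modest in size.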
Transfer Learning via Contextual Invariants for One-to-Many Cross-Domain Recommendation
The rapid proliferation of new users and items on the social web has
aggravated the gray-sheep user/long-tail item challenge in recommender systems.
Historically, cross-domain co-clustering methods have successfully leveraged
shared users and items across dense and sparse domains to improve inference
quality. However, they rely on shared rating data and cannot scale to multiple
sparse target domains (i.e., the one-to-many transfer setting). This, combined
with the increasing adoption of neural recommender architectures, motivates us
to develop scalable neural layer-transfer approaches for cross-domain learning.
Our key intuition is to guide neural collaborative filtering with
domain-invariant components shared across the dense and sparse domains,
improving the user and item representations learned in the sparse domains. We
leverage contextual invariances across domains to develop these shared modules,
and demonstrate that with user-item interaction context, we can learn-to-learn
informative representation spaces even with sparse interaction data. We show
the effectiveness and scalability of our approach on two public datasets and a
massive transaction dataset from Visa, a global payments technology company
(19% Item Recall, 3x faster vs. training separate models for each domain). Our approach is applicable to both implicit and explicit feedback settings.
Comment: SIGIR 202
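The layer-transfer idea described above, a domain-invariant context module trained on the dense domain and reused frozen across sparse target domains while each domain keeps its own user/item embeddings, can be sketched structurally as follows. All names, shapes, and the additive scoring rule are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

class ContextModule:
    """Domain-invariant component: maps interaction-context features to a
    shared representation. Fit on the dense domain, then reused as-is."""
    def __init__(self, dim_in, dim_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0, 0.1, (dim_in, dim_out))

    def __call__(self, context):
        return np.tanh(context @ self.W)

class DomainRecommender:
    """Domain-specific component: user and item embeddings, combined with
    the transferred context representation when scoring a pair."""
    def __init__(self, n_users, n_items, dim, shared, seed=0):
        rng = np.random.default_rng(seed)
        self.user = rng.normal(0, 0.1, (n_users, dim))
        self.item = rng.normal(0, 0.1, (n_items, dim))
        self.shared = shared  # frozen module, shared across domains

    def score(self, u, i, context):
        # Illustrative scoring: context-adjusted user vector dotted with item.
        return float((self.user[u] + self.shared(context)) @ self.item[i])

# One shared context module serves a dense domain and a sparse one:
# only the small per-domain embedding tables are trained in the target.
shared = ContextModule(dim_in=4, dim_out=8)
dense = DomainRecommender(1000, 500, 8, shared)
sparse = DomainRecommender(50, 20, 8, shared)
```

This is the one-to-many setting in miniature: adding another sparse target domain costs only a new embedding table, since the context module is shared rather than retrained.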