Ranking to Learn: Feature Ranking and Selection via Eigenvector Centrality
In an era where accumulating data is easy and storing it inexpensive, feature
selection plays a central role in helping to reduce the high-dimensionality of
huge amounts of otherwise meaningless data. In this paper, we propose a
graph-based method for feature selection that ranks features by identifying the
most important ones within an arbitrary set of cues. Mapping the problem onto an
affinity graph, where features are the nodes, the solution is given by assessing
the importance of nodes through some indicators of centrality, in particular,
the Eigenvector Centrality (EC). The gist of EC is to estimate the importance
of a feature as a function of the importance of its neighbors. Ranking central
nodes identifies candidate features, which turn out to be effective from a
classification point of view, as shown in a thorough experimental section.
Our approach has been tested on 7 diverse datasets from recent literature
(e.g., biological data and object recognition, among others), and compared
against filter, embedded and wrapper methods. The results are remarkable in
terms of accuracy, stability and low execution time. Comment: Preprint version - Lecture Notes in Computer Science - Springer 201
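The idea of ranking features by the centrality of their nodes in an affinity graph can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the paper's method: the affinity used here (relevance to the label times one minus pairwise redundancy, both measured with Pearson correlation) is an assumption made for the example, and the function names `build_affinity`, `eigenvector_centrality`, and `rank_features` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def build_affinity(columns, labels):
    """ASSUMED affinity (not taken from the paper): an edge is heavy when both
    features correlate with the label and are not redundant with each other."""
    rel = [abs(pearson(col, labels)) for col in columns]
    n = len(columns)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            redundancy = abs(pearson(columns[i], columns[j]))
            A[i][j] = A[j][i] = rel[i] * rel[j] * max(0.0, 1.0 - redundancy)
    return A


def eigenvector_centrality(A, iters=500, tol=1e-12):
    """Leading eigenvector of a symmetric non-negative matrix by power iteration.
    Iterating on A + shift*I leaves the eigenvectors unchanged but makes every
    eigenvalue non-negative, which avoids oscillation on bipartite-like graphs."""
    n = len(A)
    shift = max(sum(row) for row in A)
    v = [1.0 / n] * n
    for _ in range(iters):
        w = [shift * v[i] + sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    return v


def rank_features(columns, labels):
    """Return (feature indices sorted from most to least central, centrality scores)."""
    scores = eigenvector_centrality(build_affinity(columns, labels))
    ranking = sorted(range(len(columns)), key=lambda i: -scores[i])
    return ranking, scores


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    columns = [
        [0.1, 0.2, 0.1, 0.3, 0.9, 1.0, 0.8, 1.1],  # tracks the label closely
        [0.5, 0.1, 0.9, 0.3, 0.4, 0.8, 0.2, 0.6],  # mostly noise
        [1.0, 2.0, 1.0, 2.0, 3.0, 4.0, 3.0, 5.0],  # relevant, but overlaps with feature 0
        [1.0] * 8,                                  # constant, carries no information
    ]
    print(rank_features(columns, labels))
```

Each node's score depends on the scores of its neighbours, which is the property the abstract describes. A feature with no affinity to any other feature, such as the constant column above, ends up with a centrality that is effectively zero.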
FOHC: Firefly Optimizer Enabled Hybrid approach for Cancer Classification
Early detection and prediction of cancer, a group of chronic diseases responsible for a large number of deaths each year and a serious public health hazard, can lead to more effective treatment at an earlier stage in the disease's progression. Machine learning (ML) is now widely used to develop predictive models for incurable diseases such as cancer, heart disease, and diabetes, using both existing and personally collected datasets, and research in this area is ongoing. This study proposes a Firefly Optimizer-enabled Hybrid approach for Cancer classification (FOHC) that combines recursive feature elimination (RFE), principal component analysis (PCA), the Firefly Algorithm (FA), and a support vector machine (SVM) classifier. RFE and PCA serve as the feature selection and dimensionality reduction techniques, and FA is used as the optimization algorithm. In the last stage, the SVM is applied to the pre-processed dataset as the classifier. To evaluate the proposed model, an empirical analysis was carried out on three cancer datasets (Brain, Breast, and Lung cancer) obtained from the UCI-ML warehouse. The experiments were run in Python on the Jupyter platform and assessed with performance measures such as accuracy, error rate, precision, recall, and F-measure. The proposed model, FOHC, surpasses previous methods and the other state-of-the-art works considered, with 98.94% accuracy for Breast cancer, 95.58% accuracy for Lung cancer, and 96.34% accuracy for Brain cancer. These outcomes demonstrate the effectiveness of the proposed work.
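The abstract does not say what quantity the Firefly Algorithm optimizes, so the following is only a generic sketch of the standard FA update rule, applied to a stand-in cost function. The function name `firefly_minimize` and all parameter values are assumptions chosen for illustration. In a pipeline like the one described, the cost could plausibly be a cross-validated SVM error as a function of hyperparameters, but that too is an assumption rather than something the abstract states.

```python
import math
import random


def firefly_minimize(cost, bounds, n_fireflies=15, n_iter=60,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimal Firefly Algorithm for minimization (lower cost = brighter firefly).

    cost   : function mapping a list of floats to a float
    bounds : list of (low, high) pairs, one per dimension
    alpha  : size of the random step, as a fraction of each dimension's range
    beta0  : attractiveness at zero distance
    gamma  : light absorption coefficient (how fast attraction fades with distance)
    """
    rng = random.Random(seed)
    swarm = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    costs = [cost(x) for x in swarm]
    best_i = min(range(n_fireflies), key=costs.__getitem__)
    best_x, best_cost = list(swarm[best_i]), costs[best_i]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if costs[j] < costs[i]:  # firefly j is brighter, so i moves toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    new_x = []
                    for d, (lo, hi) in enumerate(bounds):
                        step = (beta * (swarm[j][d] - swarm[i][d])
                                + alpha * (rng.random() - 0.5) * (hi - lo))
                        # Clamp the move so the firefly stays inside the search box.
                        new_x.append(min(hi, max(lo, swarm[i][d] + step)))
                    swarm[i] = new_x
                    costs[i] = cost(new_x)
                    if costs[i] < best_cost:
                        best_x, best_cost = list(new_x), costs[i]
        alpha *= 0.97  # shrink the random walk gradually as the swarm settles

    return best_x, best_cost


def sphere(x):
    """Stand-in cost function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(firefly_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)]))
```

The function returns the best position seen at any point in the run rather than the final swarm state, so the reported cost can never be worse than that of the best initial firefly.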
A survey of feature selection in Internet traffic characterization
In the last decade, the research community has focused on new classification methods that rely on statistical characteristics of Internet traffic, instead of the previously popular port-number-based or payload-based methods, which are subject to even greater constraints. Some research works based on statistical characteristics generated large feature sets of Internet traffic; however, it is now impractical to handle hundreds of features in big data scenarios, as doing so leads to unacceptable processing time and misleading classification results due to redundant and correlated data. As a consequence, a feature selection procedure is essential in the process of Internet traffic characterization. This paper presents a survey of feature selection methods: feature selection frameworks are introduced, and different categories of methods are briefly explained and compared; several proposals on feature selection in Internet traffic characterization are shown; finally, a future application of feature selection to a concrete project is proposed.
Using Feature Selection Methods to Discover Common Users’ Preferences for Online Recommender Systems
Recommender systems have taken over the user’s task of choosing the items or services they want from online markets, where large amounts of merchandise are traded. Collaborative filtering-based recommender systems use user opinions and preferences. Determining the commonly used attributes that influence preferences, which are then used for prediction and subsequent recommendation of unknown or new items to users, is a significant objective when developing recommender engines. Conventional systems study user behavior to learn users’ likes and dislikes of items. This paper presents feature selection methods to mine such preferences by selecting highly influential attributes of the items. In machine learning, feature selection is used as a data pre-processing method, but this work extends its use to achieve two objectives: removing redundant, uninformative features, and selecting informative, relevant features based on the response variable. The latter objective was intended to identify the frequent and shared features that online marketplace users would mostly prefer as they express their preferences. The experiments used a synthetic dataset and were run in Python using Jupyter Notebook™. Results showed that, among a given number of informative features, some were selected as having a high influence on the response variable. The evidence showed that different feature selection methods produced different feature scores, and the intrinsic method had the best overall results, with 85% model accuracy. The selected features were used as frequently preferred attributes that influence users’ preferences.
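The observation that different feature selection methods assign different scores to the same features can be illustrated with a small sketch. Everything here is hypothetical: the synthetic item attributes (`price_band`, `rating`, `color_code`), the binary response, and the choice of two filter-style scores (absolute Pearson correlation and the Fisher score) are assumptions made for the example and are not the dataset or the methods used in the paper.

```python
import math
import random


def make_synthetic(n=200, seed=42):
    """Hypothetical item attributes with a binary 'preferred' response y."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = {
        "price_band": [2.0 * t + rng.gauss(0, 0.5) for t in y],  # strong signal
        "rating": [1.0 * t + rng.gauss(0, 1.0) for t in y],      # weaker signal
        "color_code": [rng.gauss(0, 1.0) for _ in y],            # no signal
    }
    return X, y


def abs_correlation(col, y):
    """Absolute Pearson correlation between a feature column and the response."""
    n = len(col)
    mx, my = sum(col) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
    vx = sum((a - mx) ** 2 for a in col)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def fisher_score(col, y):
    """Squared gap between the class means divided by the summed class variances."""
    g0 = [v for v, t in zip(col, y) if t == 0]
    g1 = [v for v, t in zip(col, y) if t == 1]
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
    var0 = sum((v - m0) ** 2 for v in g0) / len(g0)
    var1 = sum((v - m1) ** 2 for v in g1) / len(g1)
    total = var0 + var1
    return (m1 - m0) ** 2 / total if total > 0 else 0.0


def score_table(X, y):
    """Score every feature with both methods so the two scales can be compared."""
    return {
        name: {"abs_corr": abs_correlation(col, y), "fisher": fisher_score(col, y)}
        for name, col in X.items()
    }


if __name__ == "__main__":
    X, y = make_synthetic()
    for name, scores in score_table(X, y).items():
        print(name, scores)
```

The two scores live on different scales, so their raw values are not comparable across methods; what can be compared is the ordering of features each method produces.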
REVIEW OF FEATURE SELECTION METHODS USED IN MELANOMA DIAGNOSIS
A large number of feature selection (FS) methods are currently in use, and they are attracting growing interest among researchers. Some methods are, of course, used more frequently than others. The article describes the basics of selection-based algorithms. FS methods fall into three categories: filter, wrapper, and embedded methods. Particular attention was paid to finding examples of applications of the described methods in the diagnosis of skin melanoma.
Multi-agent evolutionary systems for the generation of complex virtual worlds
Modern films, games and virtual reality applications are dependent on
convincing computer graphics. Highly complex models are a requirement for the
successful delivery of many scenes and environments. While workflows such as
rendering, compositing and animation have been streamlined to accommodate
increasing demands, modelling complex models is still a laborious task. This
paper introduces the computational benefits of an Interactive Genetic Algorithm
(IGA) to computer graphics modelling while compensating for the effects of user
fatigue, a common issue with Interactive Evolutionary Computation. An
intelligent agent is used in conjunction with an IGA that offers the potential
to reduce the effects of user fatigue by learning from the choices made by the
human designer and directing the search accordingly. This workflow accelerates
the layout and distribution of basic elements to form complex models. It
captures the designer's intent through interaction, and encourages playful
discovery