Feature Selection with Evolving, Fast and Slow Using Two Parallel Genetic Algorithms
Feature selection is one of the most challenging issues in machine learning,
especially when working with high-dimensional data. In this paper, we address
the problem of feature selection and propose a new approach called Evolving
Fast and Slow. This approach is based on two parallel genetic algorithms with
high and low mutation rates, respectively. Evolving Fast and Slow uses a new
parallel architecture combining an automatic system that evolves fast and an
effortful system that evolves slowly. With this architecture, exploration and
exploitation can be done simultaneously and in unison. Evolving Fast, with its
high mutation rate, can explore new, unknown regions of the search space with
long jumps; Evolving Slow, with its low mutation rate, can exploit previously
known regions of the search space with short movements. Our experiments show
that Evolving Fast and Slow achieves very good results in terms of both
accuracy and feature elimination.
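As a rough sketch of the two-population idea, the toy example below evolves feature masks under a high and a low mutation rate in parallel. The surrogate fitness, the mutation rates, the coupling rule, and all constants are our own illustrative assumptions, not the paper's implementation:

```python
import random

random.seed(0)

N_FEATURES = 20
# Hypothetical ground truth: only the first 5 features are informative.
TARGET = [1] * 5 + [0] * 15

def fitness(mask):
    # Toy surrogate for classifier accuracy: fraction of positions where
    # the mask agrees with the informative-feature pattern.
    return sum(m == t for m, t in zip(mask, TARGET)) / N_FEATURES

def mutate(mask, rate):
    # Flip each bit independently with probability `rate`.
    return [1 - b if random.random() < rate else b for b in mask]

def evolve(pop, rate):
    # One elitist generation: mutate everyone, keep the fitter half.
    offspring = [mutate(ind, rate) for ind in pop]
    return sorted(pop + offspring, key=fitness, reverse=True)[: len(pop)]

def run(generations=100, pop_size=20):
    fast = [[random.randint(0, 1) for _ in range(N_FEATURES)]
            for _ in range(pop_size)]
    slow = [list(ind) for ind in fast]
    for _ in range(generations):
        fast = evolve(fast, rate=0.20)   # exploration: long jumps
        slow = evolve(slow, rate=0.02)   # exploitation: short movements
        # Simple one-way coupling: the slow population adopts the fast
        # population's best find when it beats the slow population's worst.
        best_fast = max(fast, key=fitness)
        if fitness(best_fast) > fitness(slow[-1]):
            slow[-1] = list(best_fast)
    return max(fast + slow, key=fitness)

best = run()
```

The elitist survivor selection keeps the best masks in both populations, while the one-way coupling lets the slowly-evolving population refine whatever the fast one discovers, one plausible way of making exploration and exploitation run "in unison".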
Reduction of the size of datasets by using evolutionary feature selection: the case of noise in a modern city
Smart city initiatives have emerged to mitigate the negative effects of the very fast growth of urban areas. Most of the population in our cities is exposed to high levels of noise that generate discomfort and various health problems. These issues may be mitigated by applying different smart city solutions, some of which require highly accurate noise information to provide the best possible quality of service. In this study, we have designed a machine learning approach based on genetic algorithms to analyze noise data captured on a university campus. This method reduces the amount of data required to classify the noise by addressing a feature selection optimization problem. The experimental results show that our approach improved the accuracy by 20% (achieving an accuracy of 87% with a reduction of up to 85% of the original dataset).
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (http://cirti.es) and TIN2017-88213-R (http://6city.lcc.uma.es).
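A minimal sketch of the kind of objective such a GA might optimize, trading classification accuracy against dataset reduction, could look like the following. The weights, the surrogate accuracy function, and every constant here are assumptions for illustration, not values from the study:

```python
import random

random.seed(1)

N_FEATURES = 30

def accuracy(mask):
    # Stand-in for a classifier's accuracy on the kept features.
    # Hypothetical: the first 6 features carry signal, the rest add noise.
    informative = sum(mask[:6]) / 6
    noise_penalty = 0.02 * sum(mask[6:])
    return max(0.0, min(1.0, 0.5 + 0.5 * informative - noise_penalty))

def fitness(mask):
    # Combine accuracy with dataset-size reduction (weights are assumed).
    reduction = 1 - sum(mask) / N_FEATURES
    return 0.8 * accuracy(mask) + 0.2 * reduction

def step(pop, rate=0.05):
    # One elitist generation: bit-flip mutation, keep the fitter half.
    children = [[1 - b if random.random() < rate else b for b in ind]
                for ind in pop]
    return sorted(pop + children, key=fitness, reverse=True)[: len(pop)]

pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(24)]
for _ in range(150):
    pop = step(pop)
best = pop[0]
```

With this shape of objective, the GA is rewarded both for keeping the informative features (accuracy) and for switching features off (reduction), which mirrors the paper's goal of classifying noise with a much smaller dataset.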
Employee turnover prediction and retention policies design: a case study
This paper illustrates the similarities between the problems of customer
churn and employee turnover. An example employee turnover prediction model
leveraging classical machine learning techniques is developed. Model outputs
are then discussed to design and test employee retention policies. This type
of retention discussion is, to our knowledge, innovative and constitutes the
main value of this paper.
DSL: Discriminative Subgraph Learning via Sparse Self-Representation
The goal in network state prediction (NSP) is to classify the global state
(label) associated with features embedded in a graph. This graph structure
encoding feature relationships is the key distinctive aspect of NSP compared to
classical supervised learning. NSP arises in various applications: gene
expression samples embedded in a protein-protein interaction (PPI) network,
temporal snapshots of infrastructure or sensor networks, and fMRI coherence
network samples from multiple subjects to name a few. Instances from these
domains are typically "wide" (more features than samples), and thus feature
sub-selection is required for robust and generalizable prediction. How to best
employ the network structure in order to learn succinct connected subgraphs
encompassing the most discriminative features becomes a central challenge in
NSP. Prior work employs connected subgraph sampling or graph smoothing within
optimization frameworks, resulting in either large variance of quality or weak
control over the connectivity of selected subgraphs.
In this work we propose an optimization framework for discriminative subgraph
learning (DSL) which simultaneously enforces (i) sparsity, (ii) connectivity
and (iii) high discriminative power of the resulting subgraphs of features. Our
optimization algorithm is a single-step solution for the NSP and the associated
feature selection problem. It is rooted in the rich literature on
maximal-margin optimization, spectral graph methods and sparse subspace
self-representation. DSL simultaneously ensures solution interpretability and
superior predictive power (up to 16% improvement in challenging instances
compared to baselines), with execution times up to an hour for large instances.
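The paper's single-step optimization is beyond a short snippet, but the selection target it describes, a connected, sparse, discriminative subgraph of features, can be illustrated with a much simpler greedy stand-in. The graph, the per-feature discriminative scores, and the greedy rule below are all our own illustrative assumptions, not the DSL method:

```python
# Greedy stand-in for discriminative connected-subgraph selection:
# grow a connected set of features outward from the best-scoring seed.
# Adjacency and scores are hypothetical.
ADJ = {
    0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4],
    4: [3, 5], 5: [4], 6: [7], 7: [6],
}
SCORE = {0: 0.9, 1: 0.2, 2: 0.8, 3: 0.7, 4: 0.1, 5: 0.3, 6: 0.6, 7: 0.05}

def greedy_subgraph(k):
    # Seed with the single most discriminative feature.
    seed = max(SCORE, key=SCORE.get)
    chosen = {seed}
    while len(chosen) < k:
        # Only neighbors of the current set are eligible, which keeps
        # the selected subgraph connected by construction.
        frontier = {n for v in chosen for n in ADJ[v]} - chosen
        if not frontier:
            break  # component exhausted; stay connected but smaller
        chosen.add(max(frontier, key=SCORE.get))
    return sorted(chosen)

subgraph = greedy_subgraph(3)
```

Unlike DSL's joint optimization, this greedy rule can miss high-scoring features that are not adjacent to the current set; it only conveys why enforcing connectivity constrains which feature subsets are reachable.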