An Ensemble Framework Coping with Instability in the Gene Selection Process
This paper proposes an ensemble framework for gene selection that addresses the instability problems arising in the gene filtering task. Selecting genes from gene expression data is a complex process in which different filter methods yield different informative gene subsets, which makes it difficult for experts to identify the significant genes. Instability can stem from the filter methods, the gene classifier methods, different datasets of the same disease, and the existence of multiple valid groups of biomarkers. Despite the large number of proposals, the problem remains a challenge. This work proposes a framework involving five stages of gene filtering to discover biomarkers for diagnosis and classification tasks. The framework performs stable feature selection that addresses the problems above and thus provides a more suitable and reliable solution for clinical and research purposes. Our proposal is a multistage gene filtering process that incorporates several ensemble strategies for gene selection, so that different classifiers simultaneously assess gene subsets to counter instability. First, we apply an ensemble of recent gene selection methods to obtain diversity in the genes found (stability with respect to filter methods). Next, we apply an ensemble of known classifiers to keep the genes that are relevant to all classifiers at once (stability with respect to classification methods). The results were evaluated on two different datasets of the same disease (pancreatic ductal adenocarcinoma), in search of stability with respect to the disease, and were promising.
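The first ensemble stage described above (combining several filter methods so that the selected genes do not depend on a single filter) might be sketched as a simple voting scheme. This is an illustrative assumption, not the paper's procedure: the function names, the top-k voting rule, and the gene scores below are all hypothetical.

```python
from collections import Counter


def top_k(scores, k):
    """Return the names of the k highest-scoring genes for one filter."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def consensus_genes(score_tables, k, min_votes):
    """Keep genes that at least `min_votes` filters rank in their top k."""
    votes = Counter()
    for scores in score_tables:
        votes.update(top_k(scores, k))
    return sorted(gene for gene, count in votes.items() if count >= min_votes)


# Hypothetical relevance scores from three filter methods (higher = better).
filter_a = {"g1": 0.9, "g2": 0.8, "g3": 0.1, "g4": 0.3}
filter_b = {"g1": 0.7, "g2": 0.2, "g3": 0.6, "g4": 0.1}
filter_c = {"g1": 0.5, "g2": 0.9, "g3": 0.4, "g4": 0.2}

stable = consensus_genes([filter_a, filter_b, filter_c], k=2, min_votes=2)
print(stable)
```

A gene survives only if several filters agree on it, which is one plausible reading of "stability with respect to filter methods"; the abstract does not state which aggregation rule the authors use.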
Visual Integration of Data and Model Space in Ensemble Learning
Ensembles of classifier models typically deliver superior performance and can
outperform single classifier models given a dataset and classification task at
hand. However, the gain in performance comes with a lack of
comprehensibility, which makes it hard to understand how each model affects the
classification outputs and where the errors come from. We propose a tight
visual integration of the data and the model space for exploring and combining
classifier models. We introduce a workflow that builds upon the visual
integration and enables the effective exploration of classification outputs and
models. We then present a use case in which we start with an ensemble
automatically selected by a standard ensemble selection algorithm, and show how
we can manipulate models and alternative combinations.
Comment: 8 pages, 7 pictures
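The abstract does not name the "standard ensemble selection algorithm" it starts from. As a rough illustration, assuming a greedy forward selection with replacement over a library of models, combined by majority vote, the idea could look like the sketch below. The model names and predictions are invented for the example.

```python
from collections import Counter


def vote(pred_lists):
    """Majority vote per sample across the selected models' predictions."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*pred_lists)]


def accuracy(pred, truth):
    """Fraction of samples predicted correctly."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def greedy_selection(preds, truth, rounds):
    """Each round, add the model (repeats allowed) that maximises the
    voted accuracy of the ensemble built so far plus that model."""
    chosen = []
    for _ in range(rounds):
        best = max(
            sorted(preds),
            key=lambda m: accuracy(vote([preds[c] for c in chosen + [m]]), truth),
        )
        chosen.append(best)
    return chosen


# Hypothetical validation labels and per-model predictions.
truth = [1, 0, 1, 1, 0, 1]
preds = {
    "m1": [1, 0, 1, 0, 0, 0],
    "m2": [1, 1, 1, 1, 0, 1],
    "m3": [0, 0, 0, 1, 0, 1],
}

chosen = greedy_selection(preds, truth, rounds=3)
ensemble_accuracy = accuracy(vote([preds[m] for m in chosen]), truth)
print(chosen, ensemble_accuracy)
```

In this toy setting no single model is perfect, yet the selected combination is, which is the kind of effect whose origin is hard to see without inspecting the individual models and their errors.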
CIXL2: A Crossover Operator for Evolutionary Algorithms Based on Population Features
In this paper we propose a crossover operator for evolutionary algorithms
with real values that is based on the statistical theory of population
distributions. The operator is based on the theoretical distribution of the
values of the genes of the best individuals in the population. The proposed
operator takes into account the localization and dispersion features of the
best individuals of the population with the objective that these features would
be inherited by the offspring. Our aim is the optimization of the balance
between exploration and exploitation in the search process. In order to test
the efficiency and robustness of this crossover, we have used a set of
functions to be optimized with regard to different criteria, such as
multimodality, separability, regularity and epistasis. With this set of
functions we can draw conclusions that depend on the problem at hand. We
analyze the results using ANOVA and multiple comparison statistical tests. As
an example of how our crossover can be used to solve artificial intelligence
problems, we have applied the proposed model to the problem of obtaining the
weight of each network in an ensemble of neural networks. The results obtained
exceed the performance of standard methods.
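To make the idea of inheriting the localization and dispersion of the best individuals concrete, here is a deliberately simplified sketch. It is not the CIXL2 operator as published: the interval construction (mean plus or minus `k` standard errors), the uniform sampling, and all names are assumptions chosen for illustration.

```python
import random
import statistics


def interval_crossover(population, fitness, n_best, k=1.0, rng=random):
    """Create one offspring whose genes are drawn from per-gene intervals
    centred on the mean of the n_best individuals (minimisation)."""
    best = sorted(population, key=fitness)[:n_best]
    child = []
    for values in zip(*best):
        centre = statistics.mean(values)        # localization of the best
        spread = statistics.stdev(values)       # dispersion of the best
        half = k * spread / len(values) ** 0.5  # half-width of the interval
        child.append(rng.uniform(centre - half, centre + half))
    return child


def sphere(x):
    """Toy objective to minimise: sum of squared genes."""
    return sum(v * v for v in x)


# Hypothetical real-valued population with two genes per individual.
population = [[0.0, 1.0], [0.2, 0.8], [5.0, 5.0], [0.1, 0.9]]
child = interval_crossover(population, sphere, n_best=3, rng=random.Random(0))
print(child)
```

A narrow interval (small `k`, or little dispersion among the best individuals) favours exploitation, while a wide one favours exploration, which is the balance the abstract says the operator aims to tune.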