An Interval-based Multiobjective Approach to Feature Subset Selection Using Joint Modeling of Objectives and Variables

Author: Bielza Concha
Karshenas Hossein
Larrañaga Múgica Pedro
Zhang Qingfu
Publication venue: Facultad de Informática (UPM)
Publication date: 01/12/2012

This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluating feature subsets and as the objectives of the problem. A characteristic of these objective functions is that their values are noisy, and this noise must be handled appropriately during optimization. Our proposed algorithm consists of two major techniques designed specifically for the feature subset selection problem. The first is a solution ranking method based on interval values that handles the noise in the objectives. The second is a model estimation method for learning a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, l1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are investigated experimentally. The experimental results show that the proposed algorithm obtains comparable or better performance on the tested datasets.
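The idea of ranking solutions whose objective values are intervals rather than points can be illustrated with a small sketch. The dominance relation below is only one plausible choice and is not taken from the paper: it assumes all objectives are maximized, treats one interval as "certainly better" when its lower bound exceeds the other's upper bound, and uses hypothetical function names throughout.

```python
from typing import List, Sequence, Tuple

# (lower, upper) bound of a noisy objective value; an assumed representation.
Interval = Tuple[float, float]


def certainly_better(a: Interval, b: Interval) -> bool:
    """True when interval a lies entirely above interval b (maximization)."""
    return a[0] > b[1]


def interval_dominates(a: Sequence[Interval], b: Sequence[Interval]) -> bool:
    """Illustrative relation: a dominates b if b is certainly better in no
    objective and a is certainly better in at least one objective."""
    not_worse = all(not certainly_better(bj, aj) for aj, bj in zip(a, b))
    better_somewhere = any(certainly_better(aj, bj) for aj, bj in zip(a, b))
    return not_worse and better_somewhere


def nondominated(solutions: Sequence[Sequence[Interval]]) -> List[int]:
    """Indices of solutions that no other solution dominates."""
    return [
        i
        for i, s in enumerate(solutions)
        if not any(
            interval_dominates(t, s)
            for j, t in enumerate(solutions)
            if j != i
        )
    ]


# Three feature subsets scored on two objectives (made-up intervals).
a = [(0.80, 0.90), (0.70, 0.80)]
b = [(0.60, 0.70), (0.65, 0.75)]
c = [(0.75, 0.95), (0.60, 0.90)]
front = nondominated([a, b, c])
```

Under this relation, overlapping intervals are treated as incomparable, so noisy differences alone do not push a solution out of the front.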

Archivo Digital UPM

A survey on computational intelligence approaches for predictive modeling in prostate cancer

Author: A. Graham Pockley
David Brown
Georgina Cosma
Masood Khan
Matthew Archer
Publication venue: 'Elsevier BV'
Publication date: 09/11/2016

Predictive modeling in medicine involves the development of computational models which are capable of analysing large amounts of data in order to predict healthcare outcomes for individual patients. Computational intelligence approaches are suitable when the data to be modelled are too complex for conventional statistical techniques to process quickly and efficiently. These advanced approaches are based on mathematical models that have been especially developed for dealing with the uncertainty and imprecision which are typically found in clinical and biological datasets. This paper provides a survey of recent work on computational intelligence approaches that have been applied to prostate cancer predictive modeling, and considers the challenges which need to be addressed. In particular, the paper considers a broad definition of computational intelligence which includes evolutionary algorithms (also known as metaheuristic optimisation or nature-inspired optimisation algorithms), Artificial Neural Networks, Deep Learning, Fuzzy based approaches, and hybrids of these, as well as Bayesian based approaches and Markov models. Metaheuristic optimisation approaches, such as Ant Colony Optimisation, Particle Swarm Optimisation, and Artificial Immune Network, have been utilised for optimising the performance of prostate cancer predictive models, and the suitability of these approaches is discussed.

Crossref

Nottingham Trent Institutional Repository (IRep)

The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures

Author: Anne-Claire Haury
Pierre Gestraud
Jean-Philippe Vert
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 23/06/2011

Motivation: Biomarker discovery from high-dimensional data is a crucial problem with enormous applications in biology and medicine. It is also extremely challenging from a statistical viewpoint, but surprisingly few studies have investigated the relative strengths and weaknesses of the plethora of existing feature selection methods. Methods: We compare 32 feature selection methods on 4 public gene expression datasets for breast cancer prognosis, in terms of predictive performance, stability and functional interpretability of the signatures they produce. Results: We observe that the feature selection method has a significant influence on the accuracy, stability and interpretability of signatures. Simple filter methods generally outperform more complex embedded or wrapper methods, and ensemble feature selection has generally no positive effect. Overall a simple Student's t-test seems to provide the best results. Availability: Code and data are publicly available at http://cbio.ensmp.fr/~ahaury/
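A t-test filter of the kind the abstract refers to can be sketched in a few lines. This is not the authors' code (which is at the URL above); it is a minimal illustration assuming two classes labelled 0 and 1, a pooled-variance Student's t statistic, and a hypothetical `top_k_features` helper that ranks features by absolute t value.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def t_statistic(x, y):
    """Two-sample Student's t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    diff = mean(x) - mean(y)
    if pooled == 0:
        # Degenerate case: no within-class spread at all.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / sqrt(pooled * (1 / n1 + 1 / n2))


def top_k_features(X, labels, k):
    """Rank feature columns by |t| between the two classes; keep the top k."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, lab in zip(X, labels) if lab == 1]
        neg = [row[j] for row, lab in zip(X, labels) if lab == 0]
        scores.append(abs(t_statistic(pos, neg)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Made-up expression matrix: rows are samples, columns are genes.
X = [
    [1.00, 5.0, 0.3],
    [1.20, 5.1, 0.9],
    [0.90, 4.9, 0.5],
    [1.10, 9.0, 0.4],
    [1.00, 9.2, 0.8],
    [1.05, 8.9, 0.6],
]
labels = [0, 0, 0, 1, 1, 1]
selected = top_k_features(X, labels, k=1)
```

Because each feature is scored independently of the others, a filter like this needs no classifier in the loop, which is what separates it from the wrapper and embedded methods the study compares.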

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

HAL Descartes

HAL-MINES ParisTech

EapGAFS: Microarray Dataset for Ensemble Classification for Diseases Prediction

Author: Krishna Peddarapu Rama
Rajarajeswari Pothuraju
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/08/2022

Microarray data store the measured expression levels of thousands of genes simultaneously, which helps researchers gain insight into biological and prognostic information. Cancer is a deadly disease that develops over time and involves the uncontrolled division of body cells. In cancer, many genes are responsible for cell growth and division, but different kinds of cancer are caused by different sets of genes. To better understand, diagnose and treat cancer, it is therefore essential to know which genes in the cancer cells are working abnormally. Advances in data mining, machine learning, soft computing, and pattern recognition have addressed the challenges researchers face in developing computationally effective models to identify new classes of disease and develop diagnostic or therapeutic targets. This paper proposes an Ensemble Aprior Genetic Algorithm Feature Selection (EapGAFS) method for microarray dataset classification. The proposed algorithm comprises a genetic algorithm implemented with aprior learning for classifying microarray attributes. EapGAFS uses rule set mining within the genetic algorithm to process the microarray dataset. Using the framed rule set, the proposed model extracts the attribute features in the dataset. Finally, the microarray dataset is classified with an ensemble classifier model. The performance of the proposed EapGAFS is compared with conventional classifiers on the collected microarray datasets for breast cancer, hepatitis, diabetes, and BUPA. The comparative analysis shows that the proposed EapGAFS exhibits improved performance in microarray dataset classification, approximately 4 to 6% better than conventional classifiers such as AdaBoost and ensemble.

International Journal on Recent and Innovation Trends in Computing and Communication

Novel Machine Learning Techniques for Micro-Array Data Classification

Author: Eman Ahmed
Iman El Azab
Neamat El Gayar
Publication venue: 'IntechOpen'
Publication date: 02/11/2011

IntechOpen

An Ensemble Framework Coping with Instability in the Gene Selection Process

Author: Castellanos Garzón José Antonio
Corchado Rodríguez Juan Manuel
López Sánchez Daniel
Paz Santana Juan Francisco de
Ramos González Juan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/01/2018

This paper proposes an ensemble framework for gene selection, aimed at addressing the instability problems that arise in the gene filtering task. The complex process of gene selection from gene expression data faces different instability problems, because different filter methods find different informative gene subsets. This makes it difficult for experts to identify significant genes. The instability of results can come from filter methods, gene classifier methods, different datasets of the same disease and multiple valid groups of biomarkers. Even though there is a large number of proposals, the complexity imposed by this problem remains a challenge today. This work proposes a framework involving five stages of gene filtering to discover biomarkers for diagnosis and classification tasks. This framework performs a process of stable feature selection, facing the problems above and thus providing a more suitable and reliable solution for clinical and research purposes. Our proposal involves a process of multistage gene filtering, in which several ensemble strategies for gene selection were added in such a way that different classifiers simultaneously assess gene subsets to face instability. First, we apply an ensemble of recent gene selection methods to obtain diversity in the genes found (stability according to filter methods). Next, we apply an ensemble of known classifiers to filter genes relevant to all classifiers at a time (stability according to classification methods). The results were evaluated on two different datasets of the same disease (pancreatic ductal adenocarcinoma), in search of stability according to the disease, and promising results were achieved.
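Two ingredients of an ensemble approach to gene selection, combining the rankings of several filter methods and measuring how stable the selected subsets are, can be sketched as follows. This is not the five-stage framework described above. It assumes mean-rank aggregation and the Jaccard index as stand-ins, and the function and gene names are hypothetical.

```python
def aggregate_rankings(rankings, k):
    """Combine several best-first gene rankings by mean rank position and
    return the k genes with the lowest mean rank (ties broken by name)."""
    genes = set(rankings[0])
    mean_rank = {
        g: sum(r.index(g) for r in rankings) / len(rankings) for g in genes
    }
    return sorted(genes, key=lambda g: (mean_rank[g], g))[:k]


def jaccard(a, b):
    """Overlap between two selected subsets: 1.0 means identical sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Made-up rankings of the same four genes from three filter methods.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g3"],
    ["g1", "g3", "g2", "g4"],
]
consensus = aggregate_rankings(rankings, k=2)

# Stability between the subsets two individual methods would pick.
stability = jaccard(rankings[0][:2], rankings[2][:2])
```

A low Jaccard value between the subsets picked by individual methods is the kind of instability the abstract describes; aggregating the rankings is one simple way to reduce the dependence on any single method.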

Gestion del Repositorio Documental de la Universidad de Salamanca