92 research outputs found

Stable Feature Selection for Biomarker Discovery

Author: He Zengyou
Yu Weichuan
Publication date: 01/01/2010

Feature selection techniques have long been the workhorse of biomarker discovery applications. Surprisingly, the stability of feature selection with respect to sampling variations has long been under-considered. Only recently has this issue received increasing attention. In this article, we review existing stable feature selection methods for biomarker discovery using a generic hierarchical framework. We have two objectives: (1) to provide an overview of this new yet fast-growing topic as a convenient reference; and (2) to categorize existing methods under an expandable framework for future research and development.
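One common way to quantify "stability with respect to sampling variations" is to repeat feature selection on bootstrap resamples and average the pairwise Jaccard overlap of the selected feature sets. The sketch below illustrates that idea only; it is not a method taken from the reviewed article. The ranking criterion (absolute difference of class means) and the names `top_k_features`, `jaccard`, and `selection_stability` are assumptions made for illustration.

```python
import random
from itertools import combinations


def top_k_features(samples, labels, k):
    """Rank features by |mean(class 1) - mean(class 0)| and return the top-k indices.

    The ranking criterion is a deliberately simple stand-in for any selector.
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        pos = [x[j] for x, y in zip(samples, labels) if y == 1]
        neg = [x[j] for x, y in zip(samples, labels) if y == 0]
        if not pos or not neg:
            # A resample may miss one class entirely; give the feature no score.
            scores.append(0.0)
            continue
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return frozenset(ranked[:k])


def jaccard(a, b):
    """Jaccard index of two sets (defined as 1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(samples, labels, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard index of top-k feature sets over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(samples)
    subsets = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        subsets.append(
            top_k_features([samples[i] for i in idx], [labels[i] for i in idx], k)
        )
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value near 1 means the same features are chosen regardless of which samples are drawn; a value near 0 means the selection is driven mostly by sampling noise.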

arXiv.org e-Print Archive

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

Examining the Classification Accuracy of TSVMs with Feature Selection in Comparison with the GLAD Algorithm

Author: A Gommerman
A S M Yong
A Zien
C Harris
F Valafar
Hala Helmi
I Guyon
J Han
Jonathan M. Garibaldi
K Bennett
M A Shipp
M P Brown
R Collobert
R Zhang
S Abney
T Jaakkola
T Joachims
T R Golub
Uwe Aickelin
X Zhu
Publication venue: Elsevier BV
Publication date: 01/01/2011

Crossref

Personalised information modelling technologies for personalised medicine

Author: Hu Y
Kasabov N
Liang W
Publication venue: Springer
Publication date: 21/03/2014

Personalised modelling offers a new and effective approach to pattern recognition and knowledge discovery, especially for biomedical applications. The created models are more useful and informative for analysing and evaluating an individual data object for a given problem. Such models are also expected to achieve higher accuracy in outcome prediction or classification than conventional systems and methodologies. Motivated by the concept of personalised medicine and utilising transductive reasoning, personalised modelling was recently proposed as a new method for knowledge discovery in biomedical applications. Personalised modelling aims to create a unique computational diagnostic or prognostic model for an individual. Here we introduce an integrated method for personalised modelling that applies global optimisation of variables (features) and an appropriate size of neighbourhood to create an accurate personalised model for an individual. This method creates an integrated computational system that combines different information processing techniques, applied at different stages of data analysis, e.g. feature selection, classification, discovering the interaction of genes, outcome prediction, personalised profiling and visualisation. It allows for adaptation, monitoring and improvement of an individual’s model and leads to improved accuracy and unique personalised profiling that could be used for personalised treatment and personalised drug design.

AUT Scholarly Commons

Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

Author: Cai W. W.
Lin C. D.
Mi H.
Wang H. Z.
Yang F.
林成德
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, like medical diagnosis. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier which is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme, using Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance level can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor which takes into account different misclassification costs for different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well calibrated and able to control the specific risk of each class. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem that relies on label-wise predefined confidence levels. The target of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
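The label-conditional idea in this abstract can be illustrated with a small split (inductive) conformal predictor. This is a sketch under stated assumptions, not the authors' implementation: the nonconformity score here is the Euclidean distance to a class centroid, swapped in for the random forest outlier measure the paper describes, and the names `class_means`, `calibrate`, `p_value`, and `predict_set` are hypothetical.

```python
import math


def class_means(X, y):
    """Centroid of each class, computed on the proper training set."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }


def nonconformity(x, label, means):
    """Assumed score: distance to the class centroid (stand-in for the RF outlier measure)."""
    return math.dist(x, means[label])


def calibrate(X_cal, y_cal, means):
    """Collect calibration scores separately for each label (label-conditional)."""
    scores = {label: [] for label in means}
    for x, label in zip(X_cal, y_cal):
        scores[label].append(nonconformity(x, label, means))
    return scores


def p_value(x, label, means, cal_scores):
    """Fraction of same-label calibration scores at least as nonconforming as x."""
    a = nonconformity(x, label, means)
    same = cal_scores[label]
    return (sum(1 for s in same if s >= a) + 1) / (len(same) + 1)


def predict_set(x, means, cal_scores, significance):
    """Keep every label whose p-value exceeds that label's own significance level."""
    return {
        label
        for label in means
        if p_value(x, label, means, cal_scores) > significance[label]
    }
```

Here `significance` maps each label to its own error level (one minus the confidence level), so a label whose misclassification is costly can be given a smaller value and will be excluded from the prediction set less readily.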

Crossref

Springer - Publisher Connector

PubMed Central

Xiamen University Institutional Repository

Statistical Learning and Kernel Methods in Bioinformatics

Author: Guyon I.
Schölkopf B.
Weston J.
Publication date: 01/01/2003

MPG.PuRe

Bioinformatics: a knowledge engineering approach

Author: Kasabov N
Publication venue: IEEE
Publication date: 27/05/2009

The paper introduces the knowledge engineering (KE) approach to modeling and discovering new knowledge in bioinformatics. This approach extends the machine learning approach with various rule extraction and other knowledge representation procedures. Examples are given of applying the KE approach, and especially one of its recently developed techniques, evolving connectionist systems (ECOS), to challenging problems in bioinformatics, including DNA sequence analysis, microarray gene expression profiling, protein structure prediction, finding gene regulatory networks, medical prognostic systems, and computational neurogenetic modeling.

AUT Scholarly Commons

Network modeling of patients' biomolecular profiles for clinical phenotype/outcome prediction

Author: A. Paccanaro
A. Petrini
E. Casiraghi
E. Vergani
G. Grossi
G. Valentini
J. Gliozzo
M. Frasca
M. Mesiti
M. Re
P. Perlasca
V. Vallacchi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Methods for phenotype and outcome prediction are largely based on inductive supervised models that use selected biomarkers to make predictions, without explicitly considering the functional relationships between individuals. We introduce a novel network-based approach named Patient-Net (P-Net) in which biomolecular profiles of patients are modeled in a graph-structured space that represents gene expression relationships between patients. Then a kernel-based semi-supervised transductive algorithm is applied to the graph to explore the overall topology of the graph and to predict the phenotype/clinical outcome of patients. Experimental tests involving several publicly available datasets of patients afflicted with pancreatic, breast, colon and colorectal cancer show that our proposed method is competitive with state-of-the-art supervised and semi-supervised predictive systems. Importantly, P-Net also provides interpretable models that can be easily visualized to gain clues about the relationships between patients, and to formulate hypotheses about their stratification.
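The general pattern described here (build a patient similarity graph from expression profiles, then score unlabelled patients transductively from the graph) can be sketched as follows. This is not the P-Net algorithm: plain iterative label propagation is swapped in for the kernel-based scoring the abstract mentions, Pearson correlation is an assumed similarity measure, and the names `pearson`, `similarity_graph`, and `propagate` are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def similarity_graph(profiles, threshold=0.0):
    """Weighted adjacency matrix: keep an edge only if the correlation exceeds threshold."""
    n = len(profiles)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = pearson(profiles[i], profiles[j])
            if w > threshold:
                W[i][j] = W[j][i] = w
    return W


def propagate(W, labels, n_iter=50):
    """Spread known 0/1 labels over the graph; return a score in [0, 1] per patient.

    labels maps patient index -> 0 or 1 for the labelled patients only.
    """
    n = len(W)
    scores = [float(labels.get(i, 0.5)) for i in range(n)]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if i in labels:
                updated.append(float(labels[i]))  # labelled patients stay clamped
                continue
            total = sum(W[i])
            if total:
                # Weighted average of the neighbours' current scores.
                updated.append(sum(W[i][j] * scores[j] for j in range(n)) / total)
            else:
                updated.append(scores[i])  # isolated patient keeps its prior score
        scores = updated
    return scores
```

Because the scores of unlabelled patients come from their neighbours in the graph, the edges that drive a given prediction can be inspected directly, which is the kind of interpretability the abstract points to.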

AIR Universita degli studi di Milano