    Integrating gene expression profiling and clinical data

    Abstract: We propose a combination of machine learning techniques to integrate predictive profiling from gene expression with clinical and epidemiological data. Starting from BioDCV, a complete software setup for predictive classification and feature ranking without selection bias, we apply semisupervised profiling to detect outliers and derive informative subtypes of patients. During the profiling process, sample-tracking curves are extracted and then clustered according to a distance derived from dynamic time warping. Sample tracking also allows the identification of outlier cases, whose removal is shown to improve predictive accuracy and the stability of the derived gene profiles. Here we propose to employ clinical features to validate the semisupervised procedure. The procedure is demonstrated in the analysis of a liver cancer dataset of 213 samples described by 1993 genes and by pathological features.
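The distance underlying the curve clustering can be illustrated with a minimal sketch of classic dynamic time warping. This is not BioDCV code: the function name `dtw_distance`, the absolute-difference local cost, and the unconstrained warping window are all assumptions made for illustration, and the abstract does not say which DTW variant the authors used.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative assumptions: absolute difference as the local cost and
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A pairwise matrix of such distances between sample-tracking curves could then be passed to any clustering method that accepts precomputed distances, for example hierarchical clustering.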

    Multivariate classification of gene expression microarray data

    Gene expression data obtained from microarray analysis are often used to classify cells. In this thesis, a probabilistic version of the Discriminant Partial Least Squares method (p-DPLS) is used to classify samples on the basis of their gene expression. p-DPLS is based on Bayes' rule for the posterior probability. Classifiers of this kind are forced to always assign a class. To overcome this limitation, a reject option has been implemented. This option makes it possible to reject samples with a high risk of misclassification (that is, ambiguous samples and outliers). The reject option combines criteria based on the x-residuals, the leverage and the predicted values. In addition, a variable selection method is developed to choose the most relevant genes, since most of the genes analysed with a microarray are irrelevant to the particular classification purpose and can confuse the classifier. Finally, DPLS is extended to multi-class classification by combining PLS with linear discriminant analysis.
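The idea of a Bayes-rule classifier with a reject option can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes a single predicted value per sample with Gaussian class-conditional densities, and it rejects only on the posterior probability. The x-residual and leverage criteria mentioned in the abstract are omitted, and the names `classify_with_reject` and `min_posterior` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify_with_reject(y_hat, class_params, priors, min_posterior=0.9):
    """Assign a class by Bayes' rule, or return None to reject the sample.

    y_hat         -- predicted value for the sample (e.g. from a PLS model)
    class_params  -- {label: (mean, std)} of predicted values per class (assumed Gaussian)
    priors        -- {label: prior probability}
    min_posterior -- hypothetical threshold below which the sample is rejected
    """
    joint = {
        label: priors[label] * gaussian_pdf(y_hat, mean, std)
        for label, (mean, std) in class_params.items()
    }
    total = sum(joint.values())
    if total == 0.0:
        # Every density underflowed: the sample is far from all classes.
        return None, {}
    posterior = {label: value / total for label, value in joint.items()}
    best = max(posterior, key=posterior.get)
    if posterior[best] < min_posterior:
        # Ambiguous sample: no class is probable enough.
        return None, posterior
    return best, posterior


# Usage with made-up class parameters.
params = {"A": (0.0, 0.25), "B": (1.0, 0.25)}
priors = {"A": 0.5, "B": 0.5}
label_clear, _ = classify_with_reject(0.0, params, priors)
label_ambiguous, _ = classify_with_reject(0.5, params, priors)
```

With these made-up parameters, a predicted value of 0.0 is assigned to class A, while 0.5 lies midway between the two classes and is rejected as ambiguous.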