1 research outputs found

    Efficient selection of discriminative genes from microarray gene expression data for cancer diagnosis

    Full text link
    A new mutual information (MI)-based feature-selection method to solve the so-called large p and small n problem experienced in a microarray gene expression-based data is presented. First, a grid-based feature clustering algorithm is introduced to eliminate redundant features. A huge gene set is then greatly reduced in a very efficient way. As a result, the computational efficiency of the whole feature-selection process is substantially enhanced. Second, MI is directly estimated using quadratic MI together with Parzen window density estimators. This approach is able to deliver reliable results even when only a small pattern set is available. Also, a new MI-based criterion is proposed to avoid the highly redundant selection results in a systematic way. At last, attributed to the direct estimation of MI, the appropriate selected feature subsets can be reasonably determined. © 2005 IEEE
    corecore