3 research outputs found

    Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

    Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis because they are easy to interpret. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data. With fuzzy granulation, information loss in the process of gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. Hence we expect that the selected genes can be more helpful for further biological studies.
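
To make the idea of an interpretable fuzzy association rule concrete, the following is a minimal sketch of how the support and confidence of one rule could be scored. It is not FARM-DS itself: the triangular membership function, the min operator for combining degrees, and the names `triangular` and `rule_stats` are all assumptions chosen for illustration, following one common textbook formulation.

```python
def triangular(x, a, b, c):
    """Degree of membership in a triangular fuzzy set (requires a < b < c).

    Membership rises from 0 at a to 1 at the peak b, then falls to 0 at c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def rule_stats(antecedent, consequent):
    """Fuzzy support and confidence of a rule "antecedent -> consequent".

    Both arguments are per-record membership degrees in [0, 1].
    Assumption: degrees are combined with min, and support is the mean
    combined degree over all records.
    """
    n = len(antecedent)
    joint = sum(min(a, c) for a, c in zip(antecedent, consequent)) / n
    ante = sum(antecedent) / n
    confidence = joint / ante if ante > 0 else 0.0
    return joint, confidence


# Hypothetical data: one expression value per patient and a binary diagnosis.
expression = [2.0, 5.0, 8.0, 5.0]
diseased = [0.0, 0.0, 1.0, 1.0]

# Degree to which each value is "high" under an assumed fuzzy set (4, 8, 12).
high = [triangular(v, 4.0, 8.0, 12.0) for v in expression]
support, confidence = rule_stats(high, diseased)
```

A rule scored this way reads as "IF expression is high THEN diseased", which is the kind of statement a clinician can inspect directly.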

    Spectral Textile Detection in the VNIR/SWIR Band

    Dismount detection, the detection of persons on the ground and outside of a vehicle, has applications in search and rescue, security, and surveillance. Spatial dismount detection methods lose effectiveness at long ranges, and spectral dismount detection currently relies on detecting skin pixels. In scenarios where skin is not exposed, spectral textile detection is a more effective means of detecting dismounts. This thesis demonstrates the effectiveness of spectral textile detectors on both real and simulated hyperspectral remotely sensed data. Feature selection methods determine sets of wavebands relevant to spectral textile detection. Classifiers are trained on hyperspectral contact data with the selected wavebands, and classifier parameters are optimized to improve performance on a training set. Classifiers with optimized parameters are used to classify contact data with artificially added noise and remotely sensed hyperspectral data. The performance of optimized classifiers on hyperspectral data is measured with the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve. The best performances on the contact data are AUC = 0.892 and AUC = 0.872 for Multilayer Perceptrons (MLPs) and Support Vector Machines (SVMs), respectively. The best performances on the remotely sensed data are AUC = 0.947 and AUC = 0.970 for MLPs and SVMs, respectively. The difference in classifier performance between the contact and remotely sensed data is due to the greater variety of textiles represented in the contact data. Spectral textile detection is more reliable in scenarios with a small variety of textiles.
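
For readers unfamiliar with the metric, the ROC AUC reported above can be computed directly from classifier scores. The sketch below uses the pairwise (Mann-Whitney) interpretation of AUC; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the thesis.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    (label 1, e.g. a textile pixel) receives a higher score than a
    randomly chosen negative example (label 0); ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: one positive is ranked below one negative.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

An AUC of 1.0 means every positive outranks every negative, while 0.5 corresponds to random ranking. The quadratic loop is fine for small sets; a sort-based version would be preferable for full hyperspectral images.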

    Granular Support Vector Machines Based on Granular Computing, Soft Computing and Statistical Learning

    With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challenging predictive data modeling problems effectively and/or efficiently, with a specific focus on binary classification problems. In general, GSVM works in three steps. Step 1 is granulation, which builds a sequence of information granules from the original dataset or from the original feature space. Step 2 is modeling Support Vector Machines (SVMs) in some of these information granules when necessary. Finally, step 3 is aggregation, which consolidates the information in these granules at a suitable level of abstraction. A good granulation method to find suitable granules is crucial for modeling a good GSVM. Under this framework, many different granulation algorithms, including the GSVM-CMW (cumulative margin width) algorithm, the GSVM-AR (association rule mining) algorithm, a family of GSVM-RFE (recursive feature elimination) algorithms, the GSVM-DC (data cleaning) algorithm and the GSVM-RU (repetitive undersampling) algorithm, are designed for binary classification problems with different characteristics. Empirical studies in the biomedical domain and many other application domains demonstrate that the framework is promising. As a preliminary step, this dissertation work will be extended in the future to build a Granular Computing based Predictive Data Modeling framework (GrC-PDM), with which we can create hybrid adaptive intelligent data mining systems for high-quality prediction.
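
The three-step structure (granulate, model where necessary, aggregate) can be sketched in a few lines. This is only a toy illustration under loud assumptions: granules are formed by thresholding a single feature, a nearest-centroid classifier stands in for the SVM so that the example needs no external library, and aggregation simply routes a query to its granule. None of the names below (`granulate`, `fit_centroids`, `fit_granular_sketch`) come from the dissertation, and none of its granulation algorithms is reproduced here.

```python
from statistics import mean


def granule_key(x, feature, threshold):
    """Assign a sample to one of two granules by thresholding one feature."""
    return "low" if x[feature] < threshold else "high"


def granulate(X, y, feature, threshold):
    """Step 1: split the dataset into information granules (hypothetical rule)."""
    granules = {"low": ([], []), "high": ([], [])}
    for x, label in zip(X, y):
        gX, gy = granules[granule_key(x, feature, threshold)]
        gX.append(x)
        gy.append(label)
    return granules


def fit_centroids(X, y):
    """Stand-in for an SVM: predict the class with the nearest centroid."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
        return min(centroids, key=sq_dist)

    return predict


def fit_granular_sketch(X, y, feature=0, threshold=0.0):
    fallback = fit_centroids(X, y)  # used only if a granule has no samples
    models = {}
    for key, (gX, gy) in granulate(X, y, feature, threshold).items():
        if not gy:
            continue
        if len(set(gy)) == 1:
            # Pure granule: no model is necessary, return its single label.
            models[key] = lambda x, label=gy[0]: label
        else:
            # Step 2: fit a model only in granules that mix both classes.
            models[key] = fit_centroids(gX, gy)

    def predict(x):
        # Step 3: aggregate by routing each query to its granule's model.
        return models.get(granule_key(x, feature, threshold), fallback)(x)

    return predict


# Toy data: the "low" granule is pure, the "high" granule mixes both classes.
X = [[-2, 0], [-1, 1], [1, 0], [1, 3], [2, 0.5], [2, 3.5]]
y = [0, 0, 0, 1, 0, 1]
predict = fit_granular_sketch(X, y)
```

The point of the sketch is the division of labor: a model is trained only in the mixed granule, while the pure granule is resolved by its label alone, which mirrors the "when necessary" wording of step 2.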