228 research outputs found

    Computational Intelligence Techniques in Visual Pattern Recognition

    Ph.D., Doctor of Philosophy

    Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

    Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence, a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis because they are easy to interpret. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from large microarray gene expression datasets. With fuzzy granulation, information loss during gene selection is decreased. As a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification. Hence, we expect that the selected genes can be more helpful for further biological studies.

    ACO-based feature selection algorithm for classification

    A dataset with a small number of records but a large number of attributes exhibits a phenomenon called the “curse of dimensionality”. Classifying this type of dataset requires Feature Selection (FS) methods to extract useful information. The modified graph clustering ant colony optimisation (MGCACO) algorithm is an effective FS method that was developed based on grouping highly correlated features. However, the MGCACO algorithm has three main drawbacks in producing a feature subset, arising from its clustering method, its parameter sensitivity, and its final subset determination. An enhanced graph clustering ant colony optimisation (EGCACO) algorithm is proposed to address these three problems. The proposed improvements are: (i) an ACO feature clustering method to obtain clusters of highly correlated features; (ii) an adaptive selection technique for subset construction from the clusters of features; and (iii) a genetic-based method for producing the final subset of features. The ACO feature clustering method uses mechanisms such as intensification and diversification for local and global optimisation to provide highly correlated features. The adaptive technique for ant selection enables the parameter to change adaptively based on feedback from the search space. The genetic method determines the final subset automatically, based on crossover and subset quality calculation. The performance of the proposed algorithm was evaluated on 18 benchmark datasets from the University of California, Irvine (UCI) repository and nine (9) deoxyribonucleic acid (DNA) microarray datasets against 15 benchmark metaheuristic algorithms. On the UCI datasets, the EGCACO algorithm is superior to the other benchmark optimisation algorithms in terms of the number of selected features for 16 out of the 18 datasets (88.89%) and achieves the best classification accuracy on eight (8) of them (44.44%). Further, experiments on the nine (9) DNA microarray datasets showed that the EGCACO algorithm is superior to the benchmark algorithms in terms of classification accuracy (first rank) for seven (7) datasets (77.78%) and yields the lowest number of selected features in six (6) datasets (66.67%). The proposed EGCACO algorithm can be utilised for FS in DNA microarray classification tasks that involve large datasets in various application domains.

    Filter-GA Based Approach to Feature Selection for Classification

    This paper presents a new approach to selecting a reduced number of features in databases. Every database has a given number of features, but some of these features can be redundant or even harmful and can confuse the classification process. The proposed method applies a filter attribute measure and a binary-coded Genetic Algorithm to select a small subset of features. The importance of these features is judged by applying the K-nearest neighbor (KNN) classification method. The best reduced subset of features, which has high classification accuracy on the given databases, is adopted. The classification accuracy obtained by the proposed method is compared with that reported in recent publications on twenty-eight databases. The proposed method performs satisfactorily on these databases and achieves higher classification accuracy with a smaller number of features.
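The wrapper step described in this abstract, scoring a binary-coded feature subset with KNN, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function name `knn_loo_accuracy`, the leave-one-out protocol, the Euclidean distance, and the toy data are all assumptions introduced here.

```python
import math

def knn_loo_accuracy(rows, labels, mask, k=1):
    """Leave-one-out k-NN accuracy using only features where mask[j] == 1.

    Hypothetical fitness function for a binary-coded chromosome `mask`.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, row in enumerate(rows):
        # Distances from sample i to every other sample, on selected features only.
        neighbours = sorted(
            (math.dist([row[j] for j in selected],
                       [other[j] for j in selected]), labels[m])
            for m, other in enumerate(rows) if m != i
        )
        votes = [label for _, label in neighbours[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        if predicted == labels[i]:
            correct += 1
    return correct / len(rows)

# Toy data (assumed): feature 0 separates the classes, feature 1 is misleading.
rows = [[0.0, 5.0], [0.1, -3.0], [1.0, 4.9], [1.1, -3.1]]
labels = ["a", "a", "b", "b"]

print(knn_loo_accuracy(rows, labels, [1, 0]))  # keep only feature 0
print(knn_loo_accuracy(rows, labels, [1, 1]))  # keep both features
```

On this toy data the single-feature mask scores higher than the full feature set, which is the situation in which a genetic search over masks would prefer the smaller subset.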

    Reduced HyperBF networks: practical optimization, regularization, and applications in bioinformatics

    A hyper basis function network (HyperBF) is a generalized radial basis function network (RBF) where the activation function is a radial function of a weighted distance. The local weighting of the distance accounts for the variation in local scaling and discriminative power along each feature. This generalization makes HyperBF networks capable of interpolating decision functions with high accuracy. However, the added complexity makes HyperBF networks susceptible to overfitting. Moreover, training a HyperBF network requires weights, centers, and local scaling factors to be optimized simultaneously. For a relatively large dataset with a large network structure, such optimization becomes computationally challenging. In this work, a new regularization method that performs soft local dimension reduction and weight decay is presented. The regularized HyperBF (Reduced HyperBF) network is shown to provide classification accuracy comparable to a Support Vector Machine (SVM) while requiring a significantly smaller network structure. Furthermore, the soft local dimension reduction is shown to be informative for ranking features based on their localized discriminative power. In addition, a practical training approach for constructing HyperBF networks is presented. This approach uses hierarchical clustering to initialize neurons, followed by gradient optimization using a scaled Rprop algorithm with a localized partial backtracking step (iSRprop). Experimental results on a number of datasets show faster and smoother convergence than the regular Rprop algorithm. The proposed Reduced HyperBF network is applied to two problems in bioinformatics. The first is the detection of transcription start sites (TSS) in human DNA. A novel method for improving the accuracy of TSS recognition for recently published methods is proposed. This method incorporates a new metric feature based on oligonucleotide positional frequencies. The second application is the accurate classification of microarray samples. A new feature selection algorithm based on a Reduced HyperBF network is proposed. The method is applied to two microarray datasets and is shown to select a minimal subset of features with high discriminative information. The algorithm is compared to two widely used methods and is shown to provide competitive results. In both applications, the final Reduced HyperBF network is used for higher-level analysis. Significant neurons can indicate subpopulations, while locally active features provide insight into the characteristics of the subpopulation in particular and the whole class in general.
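The opening definition, an activation that is a radial function of a weighted distance, can be illustrated with a short sketch. It assumes a Gaussian radial function and one scaling factor per feature per neuron (a diagonal weighting); the abstract does not state either choice, and the names `hyperbf_activation` and `network_output` are hypothetical.

```python
import math

def hyperbf_activation(x, center, scales):
    """Gaussian of a per-feature weighted squared distance to the center.

    Assumed form: exp(-sum_j (s_j * (x_j - c_j))**2).
    """
    weighted_sq_dist = sum(
        (s * (xi - ci)) ** 2 for xi, ci, s in zip(x, center, scales)
    )
    return math.exp(-weighted_sq_dist)

def network_output(x, units):
    """Weighted sum of unit activations; units is a list of (weight, center, scales)."""
    return sum(w * hyperbf_activation(x, c, s) for w, c, s in units)

# A scale of 0.0 switches a feature off for this neuron only.
print(hyperbf_activation([1.0, 100.0], [1.0, 0.0], [1.0, 0.0]))
```

Under this assumed form, driving a neuron's scaling factor for a feature toward zero makes that neuron ignore the feature, which is one way to read the "soft local dimension reduction" the abstract mentions.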

    Binary Multi-Verse Optimization (BMVO) Approaches for Feature Selection

    Multi-Verse Optimization (MVO) is one of the newest meta-heuristic optimization algorithms; it imitates the multi-verse theory in physics and models the interaction among various universes. In problem domains like feature selection, the solutions are often constrained to the binary values 0 and 1. In this paper, binary versions of the MVO algorithm are proposed with two prime aims: firstly, to remove redundant and irrelevant features from the dataset, and secondly, to achieve better classification accuracy. The proposed binary versions use transformation functions to map the continuous version of the MVO algorithm to its binary versions. In the experiments, 21 diverse datasets are used to compare the Binary MVO (BMVO) with binary versions of existing metaheuristic algorithms. The proposed BMVO approaches outperformed the compared algorithms in terms of the number of features selected and classification accuracy.
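The mapping from a continuous position to a binary feature mask can be sketched as below. The abstract does not say which transformation functions the paper uses; this example assumes an S-shaped (sigmoid) transfer function with stochastic thresholding, a common choice in binary metaheuristics, and the names `s_shaped` and `binarize` are hypothetical.

```python
import math
import random

def s_shaped(v):
    """Sigmoid transfer function: maps a real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def binarize(position, rng):
    """Set bit j to 1 with probability s_shaped(position[j]) (assumed rule)."""
    return [1 if rng.random() < s_shaped(v) else 0 for v in position]

rng = random.Random(0)  # seeded for reproducibility
mask = binarize([50.0, -50.0, 0.0], rng)
print(mask)  # bit j == 1 means feature j is kept
```

Large positive components almost always select the feature, large negative ones almost never do, and components near zero behave like a coin flip.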

    Mutable composite firefly algorithm for gene selection in microarray based cancer classification

    Cancer classification is critical due to the strenuous effort required in cancer treatment and the rising cancer mortality rate. Recent trends in high-throughput technologies have led to the discovery of biomarkers that have contributed to addressing cancer-related issues. Computational approaches to gene selection based on microarray data analysis have been applied in many cancer classification problems. However, the existing hybrid approaches with metaheuristic optimization algorithms in feature selection (specifically in gene selection) are not general enough to efficiently classify most cancer microarray data while maintaining a small set of genes. This leads to the problem of balancing classification accuracy against gene subset size. Hence, this study proposed to modify the Firefly Algorithm (FA) and combine it with the Correlation-based Feature Selection (CFS) filter for the gene selection task. An improved FA was proposed to overcome the slow convergence of FA by generating mutable-size solutions for the firefly population. In addition, a composite position update strategy was designed for the mutable-size solutions. The proposed strategy balances FA exploration and exploitation in order to address the local optima problem. The proposed hybrid algorithm, known as the CFS-Mutable Composite Firefly Algorithm (CFS-MCFA), was evaluated on cancer microarray data for biomarker selection, with a Support Vector Machine (SVM) as the classifier. Evaluation was performed based on two metrics: classification accuracy and size of the feature set. The results showed that the CFS-MCFA-SVM algorithm outperforms benchmark methods in terms of classification accuracy and gene subset size. In particular, 100 percent accuracy was achieved on all four datasets with only a few biomarkers (between one and four). This result indicates that the proposed algorithm is a competitive alternative for feature selection, which in turn contributes to the analysis of microarray data.

    Interpretability-oriented data-driven modelling of bladder cancer via computational intelligence
