    Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection.

    Feature selection is essential in medicine; however, the process is complicated by censoring, the defining characteristic of survival data. Most survival feature selection methods are based on Cox's proportional hazards model, although machine learning classifiers are preferred. Such classifiers are rarely employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that employ machine learning classifiers, the partial logistic artificial neural network with automatic relevance determination is a well-known method that deals with censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, this article proposes a new hybrid feature selection method that offers a solution to high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce the features. Two endovascular aortic repair datasets containing 91% censored patients, collected from two centers, were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed that the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients' future follow-up plans.
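The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the survival-metric weights are computed, so the weights below are hypothetical placeholders (for example, each classifier's validation concordance index), and the function name `weighted_majority_vote` is an assumption for illustration only, not the authors' implementation.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical values;
             the paper's survival-metric weighting is not reproduced here).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the label that was seen first wins.
    return max(totals, key=totals.get)


# Simple majority voting is the special case where every weight is 1.
votes = ["high_risk", "low_risk", "low_risk"]   # e.g. SVM, NN, KNN outputs
simple = weighted_majority_vote(votes, [1, 1, 1])
weighted = weighted_majority_vote(votes, [0.9, 0.4, 0.4])
```

With equal weights the two "low_risk" votes win; with the hypothetical weights above, the single heavily weighted classifier overrides them.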

    Improving Feature Selection Techniques for Machine Learning

    As a commonly used technique in data preprocessing for machine learning, feature selection identifies important features and removes irrelevant, redundant, or noisy features to reduce the dimensionality of the feature space. It improves the efficiency, accuracy, and comprehensibility of the models built by learning algorithms. Feature selection techniques have been widely employed in a variety of applications, such as genomic analysis, information retrieval, and text categorization. Researchers have introduced many feature selection algorithms with different selection criteria. However, it has been discovered that no single criterion is best for all applications. We proposed a hybrid feature selection framework based on genetic algorithms (GAs) that employs a target learning algorithm to evaluate features, that is, a wrapper method. We call it the hybrid genetic feature selection (HGFS) framework. The advantages of this approach include the ability to accommodate multiple feature selection criteria and to find small subsets of features that perform well for the target algorithm. The experiments on genomic data demonstrate that ours is a robust and effective approach that can find subsets of features with higher classification accuracy and/or smaller size compared to each individual feature selection algorithm. A common characteristic of text categorization tasks is multi-label classification with a great number of features, which makes wrapper methods time-consuming and impractical. We proposed a simple filter (non-wrapper) approach called the Relation Strength and Frequency Variance (RSFV) measure. The basic idea is that informative features are those that are highly correlated with the class and distributed most differently among all classes. The approach is compared with two well-known feature selection methods in experiments on two standard text corpora. The experiments show that RSFV generates equal or better performance than the others in many cases.
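A GA-based wrapper of the general kind described here can be sketched as follows. This is a generic elitist genetic algorithm over binary feature masks, not the HGFS framework itself: the function name `ga_select`, its parameters, and the toy fitness function are all assumptions. In a real wrapper, `fitness` would train and score the target learning algorithm on the features selected by the mask.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximises `fitness`.

    Requires n_features >= 2 and pop_size >= 4 (single-point crossover
    between two distinct elite parents).
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the better half survives unchanged into the next generation.
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)      # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability `mutation_rate`.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy: features 0 and 3 are
    useful, every other selected feature costs one point."""
    return mask[0] + mask[3] - sum(mask[i] for i in (1, 2, 4, 5))


best_mask = ga_select(6, toy_fitness)
```

Because the elite half is copied forward each generation, the best fitness in the population never decreases; the fixed seed makes the run repeatable.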

    Hybrid Genetic Algorithm for Medical Image Feature Extraction and Selection

    For a hybrid medical image retrieval system, a genetic algorithm (GA) approach is presented for selecting a dimensionality-reduced set of features. The system was developed in three phases. In the first phase, three distinct algorithms are used to extract the vital features from the images: a Texton based contour gradient extraction algorithm, an Intrinsic pattern extraction algorithm, and a modified shift invariant feature transformation algorithm. In the second phase, GA-based feature selection is performed to identify the potential feature vector, using a hybrid approach of the “Branch and Bound Algorithm” and the “Artificial Bee Colony Algorithm” on breast cancer, brain tumour, and thyroid images. The Chi Square distance measurement is used to assess the similarity between query images and database images. A fitness function based on the Minimum description length principle was used as the initial requirement for the genetic algorithm. In the third phase, a diverse density based relevance feedback method is used to improve the performance of the hybrid content based medical image retrieval system. The term hybrid is used because the system can retrieve any kind of medical image, such as breast cancer, brain tumour, lung cancer, and thyroid cancer images. This machine learning based feature selection method is used to reduce the dimensionality problem of the existing system. The experimental results show that the GA driven image retrieval system selects an optimal subset of features to identify the right set of images.
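The chi-square distance mentioned in this abstract can be sketched as below. The abstract does not give the exact formula, so this uses one common form for comparing histograms, 0.5 * sum((p_i - q_i)^2 / (p_i + q_i)); the function name and the toy feature vectors are assumptions for illustration.

```python
def chi_square_distance(p, q):
    """Chi-square distance between two non-negative feature histograms.

    Uses the common symmetric form 0.5 * sum((p_i - q_i)^2 / (p_i + q_i)),
    skipping bins where both histograms are zero.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(p, q):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


# Rank hypothetical database images by distance to a query (smaller = closer).
query = [0.5, 0.5, 0.0]
database = {"img_a": [0.5, 0.5, 0.0], "img_b": [0.0, 0.0, 1.0]}
ranked = sorted(database, key=lambda name: chi_square_distance(query, database[name]))
```

Identical histograms have distance 0, so `img_a` is ranked ahead of `img_b` for this query.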

    AutoSVD++: An Efficient Hybrid Collaborative Filtering Model via Contractive Auto-encoders

    Collaborative filtering (CF) has been successfully used to provide users with personalized products and services. However, dealing with the increasing sparseness of the user-item matrix remains a challenge. To tackle this issue, hybrid CF, such as combining CF with content based filtering and leveraging side information about users and items, has been extensively studied to enhance performance. However, most of these approaches depend on hand-crafted feature engineering, which is usually noise-prone and biased by different feature extraction and selection schemes. In this paper, we propose a new hybrid model that generalizes the contractive auto-encoder paradigm into a matrix factorization framework with good scalability and computational efficiency. The model jointly represents content information effectively and compactly, and leverages implicit user feedback to make accurate recommendations. Extensive experiments conducted over three large scale real datasets indicate that the proposed approach outperforms the compared methods for item recommendation. Comment: 4 pages, 3 figures

    Adaptive Data Mining Approach for Pcb Defect Detection and Classification

    Objective: To develop a model for PCB defect detection and classification with the help of a soft computing technique. Methodology: To improve the performance of prediction and classification, we propose a hybrid approach for feature reduction and classification. The proposed approach is divided into three main stages: (i) data pre-processing, (ii) feature selection and reduction, and (iii) classification. In this approach, pre-processing, feature selection, and reduction are carried out by measuring confidence with an adaptive genetic algorithm. Prediction and classification are carried out using a neural network classifier. A genetic algorithm is used for data preprocessing to achieve the feature reduction and confidence measurement. Findings: The system is implemented using MATLAB 2013b. The resulting analysis shows that the proposed approach is capable of detecting and classifying defects in PCBs.

    An embedded two-layer feature selection approach for microarray data analysis

    Feature selection is an important technique for application problems with a large number of variables and limited training samples, such as image processing, combinatorial chemistry, and microarray analysis. Commonly employed feature selection strategies can be divided into filter and wrapper approaches. In this study, we propose an embedded two-layer feature selection approach that combines the advantages of filter and wrapper algorithms while avoiding their drawbacks. The hybrid algorithm, called GAEF (Genetic Algorithm with embedded filter), divides the feature selection process into two stages. In the first stage, a Genetic Algorithm (GA) is employed to pre-select features, while in the second stage a filter selector is used to further identify a small feature subset for accurate sample classification. Three benchmark microarray datasets are used to evaluate the proposed algorithm. The experimental results suggest that this embedded two-layer feature selection strategy is able to improve the stability of the selection results as well as the sample classification accuracy.

    Breast cancer diagnosis using a hybrid genetic algorithm for feature selection based on mutual information

    Feature selection is the process of selecting a subset of relevant features (i.e. predictors) for use in the construction of predictive models. This paper proposes a hybrid feature selection approach to breast cancer diagnosis which combines a Genetic Algorithm (GA) with Mutual Information (MI) to select the best combination of cancer predictors, with maximal discriminative capability. The selected features are then input into a classifier to predict whether a patient has breast cancer. Using a publicly available breast cancer dataset, experiments were performed to evaluate the performance of the Genetic Algorithm based on the Mutual Information approach with two different machine learning classifiers, namely the k-Nearest Neighbor (KNN) and the Support Vector Machine (SVM), each tuned using different distance measures and kernel functions, respectively. The results revealed that the proposed hybrid approach is highly accurate for predicting breast cancer, and that it is very promising for predicting other cancers using clinical data.
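The mutual information criterion named in this abstract can be illustrated for the simplest case of a discrete feature and a discrete class label. This is the textbook empirical estimate, I(X;Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y))); how the paper combines it with the GA is not stated in the abstract, and the function name and toy data are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts.
    return sum((c / n) * log2(c * n / (count_x[x] * count_y[y]))
               for (x, y), c in count_xy.items())


labels = [0, 0, 1, 1]                      # toy class labels
informative = [0, 0, 1, 1]                 # feature that mirrors the class
uninformative = [0, 1, 0, 1]               # feature independent of the class
mi_good = mutual_information(informative, labels)
mi_bad = mutual_information(uninformative, labels)
```

A feature that determines a balanced binary class carries 1 bit of information about it, while an independent feature carries 0 bits, so ranking by this score would keep the first feature and drop the second.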

    A Novel Hybrid Feature Selection Method for Day-Ahead Electricity Price Forecasting

    This paper proposes a novel hybrid feature selection (FS) method for day-ahead electricity price forecasting, aimed at obtaining a feature set that yields the best forecast accuracy. The hybrid FS method is based on the elitist genetic algorithm (GA) and a tree-based method. The performance of the proposed forecaster is compared with forecasters based on classification trees and regression trees. Using the selected features, a performance test of the forecaster was carried out to establish the usefulness of the proposed approach. The approach is evaluated by analyzing and forecasting day-ahead electricity prices in the Australian electricity markets, and the results establish that, with the selected features, the proposed forecaster consistently outperforms the forecaster with a larger feature set. The proposed method is simulated in MATLAB and WEKA software.

    A hybrid swarm intelligence feature selection approach based on time-varying transition parameter

    Feature selection aims to reduce the dimensionality of a dataset by removing superfluous attributes. This paper proposes a hybrid approach to the feature selection problem that combines particle swarm optimization (PSO), grey wolf optimization (GWO), and a tournament selection (TS) mechanism. Particle swarm enhances diversification at the beginning of the search, grey wolf enhances intensification at the end of the search, and tournament selection maintains diversification at both the beginning and the end of the search process to avoid local optima. A time-varying transition parameter and a random variable are used to select either the particle swarm, grey wolf, or tournament selection technique during the search process. This paper proposes different variants of this approach based on S-shaped and V-shaped transfer functions (TFs), which convert continuous solutions to binary ones. These variants are named hybrid tournament grey wolf particle swarm (HTGWPS), followed by the letter S or V to indicate the TF type, and then by the TF’s number. The variants were evaluated using nine high-dimensional datasets. The results revealed that HTGWPS-V1 outperformed the other V-shaped variants, PSO, and GWO on 78% of the datasets based on the maximum classification accuracy obtained by a minimal feature subset. HTGWPS-V1 also outperformed six well-known metaheuristics on 67% of the datasets.
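The role of S-shaped and V-shaped transfer functions can be sketched with one commonly used function of each family. The abstract does not define its numbered variants (such as V1), so the sigmoid and |tanh| forms below, the update rules, and all function names are assumptions drawn from common practice in binary metaheuristics, not the paper's definitions.

```python
import math
import random


def s_shaped(x):
    """A common S-shaped transfer function (the logistic sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """A common V-shaped transfer function, |tanh(x)|."""
    return abs(math.tanh(x))


def binarize_s(x, rng):
    """S-shaped rule: set the bit to 1 with probability S(x)."""
    return 1 if rng.random() < s_shaped(x) else 0


def binarize_v(x, current_bit, rng):
    """V-shaped rule: flip the current bit with probability V(x)."""
    return 1 - current_bit if rng.random() < v_shaped(x) else current_bit


rng = random.Random(0)
continuous_position = [2.5, -0.3, 0.0, -4.0]   # hypothetical optimiser output
mask_s = [binarize_s(x, rng) for x in continuous_position]
mask_v = [binarize_v(x, 1, rng) for x in continuous_position]
```

The design difference is that an S-shaped function maps the continuous value directly to a selection probability, whereas a V-shaped function depends only on its magnitude and decides whether to flip the bit already held, so a value of zero always leaves the current bit unchanged.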