
    Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

    With the rise of technology, the potential for fraud in areas such as banking has increased. Credit card fraud is a critical problem in banking, and the danger it poses is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After the most effective features are selected using an extended wrapper method, ensemble classification is performed. The extended feature selection approach consists of a prior feature filtering step followed by a wrapper approach using the C4.5 decision tree. Ensemble classification using cost-sensitive decision trees is performed in a decision forest framework. A locally gathered fraud detection dataset is used to evaluate the proposed method. The method is assessed using accuracy, recall, and F-measure as evaluation metrics and is compared with basic classification algorithms including ID3, J48, Naïve Bayes, Bayesian Network, and NB tree. Experiments show that, with the F-measure as the evaluation metric, the proposed approach yields a performance improvement of 1.8 to 2.4 percent over the other classifiers.
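
The evaluation metrics named in this abstract (accuracy, recall, F-measure) can be illustrated with a minimal sketch. It assumes a binary fraud/non-fraud setting and the standard confusion-matrix definitions; the function name `evaluation_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure from confusion counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives (fraud is the positive class).
    Zero denominators yield 0.0 rather than raising an error.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only.
metrics = evaluation_metrics(tp=8, fp=2, fn=2, tn=88)
```

On heavily imbalanced data such as fraud records, accuracy can stay high even when most fraud cases are missed, which is one common reason for reporting recall and F-measure alongside it.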

    CSNL: A cost-sensitive non-linear decision tree algorithm

    This article presents a new decision tree learning algorithm called CSNL that induces Cost-Sensitive Non-Linear decision trees. The algorithm is based on the hypothesis that nonlinear decision nodes provide a better basis than axis-parallel decision nodes, and it uses discriminant analysis to construct nonlinear decision trees that take account of the costs of misclassification. The performance of the algorithm is evaluated by applying it to seventeen datasets, and the results are compared with those obtained by two well-known cost-sensitive algorithms, ICET and MetaCost, which generate multiple trees to obtain some of the best results to date. The results show that CSNL performs at least as well as, if not better than, these algorithms on more than twelve of the datasets and is considerably faster. The use of bagging with CSNL further enhances its performance, showing the significant benefits of using nonlinear decision nodes.
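
The idea of taking misclassification costs into account can be sketched with the generic minimum-expected-cost decision rule. This is not the CSNL algorithm itself, which the abstract says is built on discriminant analysis; the function name `min_expected_cost_class`, the cost matrices, and the probabilities below are all hypothetical.

```python
def min_expected_cost_class(probs, cost):
    """Return the class index with the lowest expected misclassification cost.

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j
    """
    expected = [
        sum(p * c for p, c in zip(probs, row))  # expected cost of predicting i
        for row in cost
    ]
    # Ties are resolved in favour of the lowest class index.
    return min(range(len(expected)), key=expected.__getitem__)


# Hypothetical two-class example: class 1 is the rare, costly-to-miss class.
probs = [0.9, 0.1]
uniform_cost = [[0, 1], [1, 0]]    # all errors cost the same
skewed_cost = [[0, 10], [1, 0]]    # missing class 1 costs ten times more

plain_choice = min_expected_cost_class(probs, uniform_cost)
cost_aware_choice = min_expected_cost_class(probs, skewed_cost)
```

With uniform costs the rule reduces to picking the most probable class; with the skewed matrix the same probabilities lead to the less probable but more expensive-to-miss class, which is the behaviour a cost-sensitive learner aims for.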