92,091 research outputs found

    Design an Optimal Decision Tree based Algorithm to Improve Model Prediction Performance

    The performance of a decision tree is assessed by its prediction accuracy on unseen instances. This study pre-processes the data in order to generate optimised decision trees that are both smaller and more accurate at classification. Several components of the decision tree are addressed and enhanced so that the algorithm produces precise, near-ideal trees and thereby improves prediction performance. A further aim is a decision tree algorithm with a small overall footprint and excellent predictive accuracy. The typical decision tree-based technique was created for classification and is applied to various kinds of uncertain information. Before classification, the uncertain dataset was first processed through missing-data treatment and other uncertainty-handling procedures to produce a balanced dataset. The proposed algorithm was tested on three different real-time datasets: the Titanic dataset, the PIMA Indian Diabetes dataset, and data relating to heart disease. Its performance was assessed in terms of precision, recall, F-measure, and accuracy, and its results were compared with those of the standard decision tree. On all three datasets, the decision tree with Gini impurity optimization was found to perform remarkably well.
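
The abstract above names Gini impurity as the split criterion but gives no implementation details. As a rough illustration only, the following Python sketch computes Gini impurity and searches one numeric feature for the threshold with the lowest size-weighted impurity. The function names and the exhaustive midpoint search are assumptions made for this example, not the paper's algorithm.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a binary split, weighted by the size of each side."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Return (threshold, impurity) with the lowest weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_impurity) when no split improves on the parent node.
    """
    pairs = sorted(zip(values, labels), key=lambda pair: pair[0])
    best = (None, gini_impurity(labels))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best
```

For example, `best_threshold([1, 2, 3, 4], [0, 0, 1, 1])` picks the midpoint 2.5, which separates the two classes completely and drives the weighted impurity to zero.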

    Optimized Anomaly based Risk Reduction using PCA based Genetic Classifier

    Security risk analysis is a thrust area for the information-based world. Researchers in this field have deployed numerous techniques to overcome information security problems. This paper proposes an approach that uses anomaly detection for risk reduction. The central idea of the work is that anomalies are deviations that could increase the percentage of risk. Anomaly detection is guided by PCA, and a genetic-based multi-class classifier is used. Classification is induced by a decision tree approach, in which a genetic algorithm optimizes the process of finding the nodes of the tree. The proposed approach is benchmarked against a PCA-based ANN classifier and outperforms it. The results are demonstrated.

    Comparison of the C4.5 and PSO-Based C4.5 Algorithms for Predicting Monthly Fuel (BBM) Usage at the Office of Environment and Hygiene (Dinas Lingkungan Hidup dan Kebersihan) of East Lombok Regency

    East Lombok Regency is a second-level region of West Nusa Tenggara Province, located on the east side of Lombok Island. Its capital is the city of Selong, where all government agencies are based. One of them is the Department of Environment and Hygiene of East Lombok Regency. The department's operational vehicles run on fuel oil (BBM) that is subsidized by the government. Daily fuel use must therefore be recorded properly so that the amount of fuel used each month can be predicted. However, the department has difficulty processing such data in large quantities. The head of the agency needs predicted fuel-use information to support decisions and policies. For this problem, the appropriate data mining technique is classification. One classification method in data mining is the decision tree algorithm C4.5. The C4.5 algorithm has weaknesses in reading large amounts of data, so the researchers apply weighting through Particle Swarm Optimization (PSO) for attribute selection to increase the accuracy of C4.5. The researchers use data mining software to compare the C4.5 algorithm with the PSO-based C4.5 algorithm and obtain the best accuracy in predicting the monthly fuel oil use of the Department of Environment and Hygiene of East Lombok Regency. DOI: 10.29408/jit.v2i1.117

    Data fusion by using machine learning and computational intelligence techniques for medical image analysis and classification

    Data fusion is the process of integrating information from multiple sources to produce specific, comprehensive, unified data about an entity. Data fusion is categorized as low level, feature level, and decision level. This research focuses on investigating and developing feature- and decision-level data fusion for automated image analysis and classification. The common procedure for solving these problems can be described as: 1) process the image for region-of-interest detection, 2) extract features from the region of interest, and 3) create a learning model based on the feature data. Image processing was performed using edge detection, a histogram threshold, and a color drop algorithm to determine the region of interest. The extracted features were low-level features, including textual, color, and symmetrical features. For image analysis and classification, feature- and decision-level data fusion techniques are investigated for model learning by using and integrating computational intelligence and machine learning techniques. These techniques include artificial neural networks, evolutionary algorithms, particle swarm optimization, decision trees, clustering algorithms, fuzzy logic inference, and voting algorithms. This work presents both the investigation and development of data fusion techniques for the application areas of dermoscopy skin lesion discrimination, content-based image retrieval, and graphic image type classification. --Abstract, page v
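
The abstract above lists voting algorithms among its decision-level fusion techniques without specifying which voting rule is used. As a minimal sketch, assuming plain majority voting over the class labels of several already-trained models, decision-level fusion could look like this in Python. The function names and the first-seen tie-breaking rule are assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse the labels that several classifiers assigned to one sample.

    Ties are broken in favour of the label that appears first in
    `predictions`; this rule is an assumption of the sketch.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def fuse_decisions(per_model_predictions):
    """Apply majority voting sample by sample.

    `per_model_predictions` holds one list of labels per model, all of the
    same length and aligned by sample index.
    """
    return [majority_vote(list(votes)) for votes in zip(*per_model_predictions)]
```

Each model here contributes only its final label, which is what distinguishes decision-level fusion from feature-level fusion, where the feature vectors would be combined before a single model is trained.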

    Learning Multi-Tree Classification Models with Ant Colony Optimization

    Ant Colony Optimization (ACO) is a meta-heuristic for solving combinatorial optimization problems, inspired by the behaviour of biological ant colonies. One of the successful applications of ACO is learning classification models (classifiers). A classifier encodes the relationships between the input attribute values and the values of a class attribute in a given set of labelled cases, and it can be used to predict the class value of new unlabelled cases. Decision trees have been widely used as a type of classification model that represents comprehensible knowledge to the user. In this paper, we propose the use of ACO-based algorithms for learning an extended multi-tree classification model, which consists of multiple decision trees, one for each class value. Each class-based decision tree is responsible for discriminating between its class value and all other values available in the class domain. Our proposed algorithms are empirically evaluated against well-known decision tree induction algorithms, as well as the ACO-based Ant-Tree-Miner algorithm. The results show an overall improvement in predictive accuracy over 32 benchmark datasets. We also discuss how the new multi-tree models can provide the user with more understanding and knowledge-interpretability in a given domain.
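
The abstract above describes a model with one decision tree per class value but does not say how the per-class outputs are combined into one prediction. The Python sketch below assumes each class tree returns a confidence score for "this case belongs to my class" and that the highest score wins. The toy trees, feature names `x1` and `x2`, class names, and score values are all hypothetical and hand-written for illustration; they are not learned by ACO.

```python
def predict_multi_tree(class_trees, case):
    """Predict with a one-vs-all multi-tree model.

    `class_trees` maps each class value to a callable that scores how
    strongly `case` belongs to that class (higher is stronger). Picking
    the highest score is an assumption of this sketch.
    """
    scores = {cls: tree(case) for cls, tree in class_trees.items()}
    return max(scores, key=scores.get)


# Hypothetical hand-written trees, one per class value.
def tree_a(case):
    return 0.95 if case["x1"] < 2.5 else 0.02


def tree_b(case):
    if case["x1"] < 2.5:
        return 0.03
    return 0.90 if case["x2"] < 1.7 else 0.10


def tree_c(case):
    if case["x1"] < 2.5:
        return 0.02
    return 0.92 if case["x2"] >= 1.7 else 0.08


TREES = {"A": tree_a, "B": tree_b, "C": tree_c}
```

With these toy trees, `predict_multi_tree(TREES, {"x1": 4.0, "x2": 2.0})` selects class `"C"`, and a user can read `tree_c` on its own to see which conditions separate that class from the rest.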

    Investigating Evaluation Measures in Ant Colony Algorithms for Learning Decision Tree Classifiers

    Ant-Tree-Miner is a decision tree induction algorithm that is based on the Ant Colony Optimization (ACO) meta-heuristic. Ant-Tree-Miner-M is a recently introduced extension of Ant-Tree-Miner that learns multi-tree classification models. A multi-tree model consists of multiple decision trees, one for each class value, where each class-based decision tree is responsible for discriminating between its class value and all other values present in the class domain (one vs. all). In this paper, we investigate the use of 10 different classification quality evaluation measures in Ant-Tree-Miner-M, which are used for both candidate model evaluation and model pruning. Our experimental results, using 40 popular benchmark datasets, identify several quality functions that substantially improve on the simple Accuracy quality function that was previously used in Ant-Tree-Miner-M.
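
The abstract above does not list the ten quality measures it compares. To illustrate why a one-vs-all tree might be scored with something other than accuracy, the Python sketch below computes accuracy and the F-measure from one-vs-all confusion counts. Using the F-measure as the alternative is an assumption of this example, and all function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f_measure(tp, fp, fn, tn):
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a one-vs-all setting the positive class is often a small minority. A tree that never predicts the positive class on a 1-versus-9 split still reaches an accuracy of 0.9, while its F-measure is 0.0, so the two measures can rank candidate trees very differently.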

    Application of decision trees and multivariate regression trees in design and optimization

    Induction of decision trees and regression trees is a powerful technique not only for performing ordinary classification and regression analysis but also for discovering the often complex knowledge which describes the input-output behavior of a learning system in qualitative forms. In the area of classification (discrimination analysis), a new technique called IDea is presented for performing incremental learning with decision trees. It is demonstrated that IDea's incremental learning can greatly reduce the spatial complexity of a given set of training examples. Furthermore, it is shown that this reduction in complexity can also be used as an effective tool for improving the learning efficiency of other types of inductive learners such as standard backpropagation neural networks. In the area of regression analysis, a new methodology for performing multiobjective optimization has been developed. Specifically, we demonstrate that multiple-objective optimization through induction of multivariate regression trees is a powerful alternative to the conventional vector optimization techniques. Furthermore, in an attempt to investigate the effect of various types of splitting rules on the overall performance of the optimizing system, we present a tree partitioning algorithm which utilizes a number of techniques derived from diverse fields of statistics and fuzzy logic. These include: two multivariate statistical approaches based on dispersion matrices, an information-theoretic measure of covariance complexity which is typically used for obtaining multivariate linear models, two newly-formulated fuzzy splitting rules based on Pearson's parametric and Kendall's nonparametric measures of association, Bellman and Zadeh's fuzzy decision-maximizing approach within an inductive framework, and finally, the multidimensional extension of a widely-used fuzzy entropy measure. The advantages of this new approach to optimization are highlighted by presenting three examples which respectively deal with design of a three-bar truss, a beam, and an electric discharge machining (EDM) process.