
    Interval-valued fuzzy decision trees with optimal neighbourhood perimeter

    This research proposes a new model for constructing decision trees using interval-valued fuzzy membership values. Most existing fuzzy decision trees do not consider the uncertainty associated with their membership values, yet precise membership values are not always obtainable. In this paper, we represent fuzzy membership values as intervals to model this uncertainty and employ the look-ahead based fuzzy decision tree induction method to construct decision trees. We also investigate the significance of different neighbourhood values and define, using fuzzy sets, a new parameter that is insensitive to the specific data set. Examples are provided to demonstrate the effectiveness of the approach.
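    The central idea, giving each membership degree a lower and an upper bound instead of a single point value, can be illustrated with a small sketch. The class name, the midpoint-based ranking and the example values below are illustrative assumptions, not the paper's construction:

```python
# Minimal sketch of interval-valued fuzzy membership degrees (illustrative only).
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalMembership:
    """A membership degree given as an interval [lower, upper] within [0, 1]."""
    lower: float
    upper: float

    def __post_init__(self):
        assert 0.0 <= self.lower <= self.upper <= 1.0

    def midpoint(self) -> float:
        # One common (but not the only) way to rank interval-valued degrees.
        return (self.lower + self.upper) / 2.0

    def width(self) -> float:
        # The width reflects how uncertain the membership degree itself is.
        return self.upper - self.lower


# A sample whose degree of membership in the fuzzy set "tall" is imprecise:
tall = IntervalMembership(lower=0.55, upper=0.75)
print(tall.midpoint(), tall.width())   # ~0.65  ~0.2
```

    A look-ahead induction step could then compare candidate splits through such interval summaries, for instance ranking by midpoint and breaking ties by width.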

    Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms

    Machine learning approaches have been successfully applied to many classification and prediction problems. One of the most popular machine learning approaches is decision trees. A main advantage of decision trees is the clarity of the decision model they produce. The ID3 algorithm proposed by Quinlan forms the basis for many decision tree applications. Trees produced by ID3 are sensitive to small perturbations in the training data. To overcome this problem and to handle uncertainty and spurious precision in data, fuzzy ID3 integrated fuzzy set theory and ideas from fuzzy logic with ID3. Several fuzzy decision tree algorithms and tools exist. However, existing tools are slow, produce a large number of rules and/or lack support for automatic fuzzification of input data. These limitations make those tools unsuitable for a variety of applications, including those with many features and real-time ones such as intrusion detection. In addition, the large number of rules produced by these tools renders the generated decision model uninterpretable. In this research work, we proposed an improved version of the fuzzy ID3 algorithm. We also introduced a new method for reducing the number of fuzzy rules generated by fuzzy ID3. In addition, we applied fuzzy decision trees to the classification of real and pseudo microRNA precursors. Our experimental results showed that our improved fuzzy ID3 achieves better classification accuracy and is more efficient than the original fuzzy ID3 algorithm, and that fuzzy decision trees can outperform several existing machine learning algorithms on a wide variety of datasets. Our experiments also showed that the developed fuzzy rule reduction method produced a significant reduction in the number of generated rules, thereby improving the comprehensibility of the decision model and reducing the fuzzy decision tree's execution time. This reduction in the number of rules was accompanied by a slight improvement in the classification accuracy of the resulting fuzzy decision tree. Finally, when applied to the microRNA prediction problem, the fuzzy decision tree achieved better results than other machine learning approaches applied to the same problem, including Random Forest, C4.5, SVM and k-NN.
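    For readers unfamiliar with the attribute-selection step that fuzzy ID3 inherits from ID3, a textbook-style sketch is given below: class proportions are replaced by membership-weighted proportions, and branch memberships are combined with the node membership by a t-norm. This is a generic formulation for illustration only; the improved algorithm and the rule-reduction technique described above are not reproduced here:

```python
import numpy as np


def fuzzy_entropy(memberships: np.ndarray, labels: np.ndarray) -> float:
    """Fuzzy entropy of the examples reaching a node.

    memberships: degree to which each example belongs to the node, shape (n,)
    labels:      integer class labels, shape (n,)
    """
    total = memberships.sum()
    if total == 0.0:
        return 0.0
    entropy = 0.0
    for c in np.unique(labels):
        p = memberships[labels == c].sum() / total   # membership-weighted proportion
        if p > 0.0:
            entropy -= p * np.log2(p)
    return entropy


def fuzzy_information_gain(node_memberships, branch_memberships, labels):
    """Gain of splitting a node on a fuzzy attribute.

    branch_memberships: one array per fuzzy value of the attribute; each is
    combined with the node membership by the product t-norm.
    """
    total = node_memberships.sum()
    if total == 0.0:
        return 0.0
    expected = 0.0
    for branch in branch_memberships:
        child = node_memberships * branch            # product t-norm
        expected += (child.sum() / total) * fuzzy_entropy(child, labels)
    return fuzzy_entropy(node_memberships, labels) - expected
```

    As in crisp ID3, the attribute with the largest gain is chosen at each node.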

    A new approach to fuzzy random forest generation

    Random forests have proved to be very effective classifiers that can achieve very high accuracies. Although a number of papers have discussed the use of fuzzy sets for coping with uncertain data in decision tree learning, fuzzy random forests have received little attention in the fuzzy community. In this paper, we first propose a simple method for generating fuzzy decision trees by creating fuzzy partitions for continuous variables during the learning phase. Then, we discuss how the method can be used for generating forests of fuzzy decision trees. Finally, we show that these fuzzy random forests achieve higher accuracies than two fuzzy rule-based classifiers recently proposed in the literature, and we highlight that fuzzy random forests are more tolerant of noise in datasets than classical crisp random forests.
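    The kind of fuzzy partition such a method creates for a continuous variable can be sketched with uniformly spaced triangular membership functions over the observed range (uniform spacing is a simplifying assumption here; the method described above fits the partition during learning):

```python
import numpy as np


def triangular_partition(x_min: float, x_max: float, n_sets: int):
    """n_sets triangular membership functions forming a strong partition of
    [x_min, x_max]: at every point the degrees sum to 1."""
    centres = np.linspace(x_min, x_max, n_sets)
    step = centres[1] - centres[0]

    def membership(x: float) -> np.ndarray:
        # Degree of x in each of the n_sets triangular fuzzy sets.
        return np.clip(1.0 - np.abs(x - centres) / step, 0.0, 1.0)

    return centres, membership


centres, mu = triangular_partition(0.0, 10.0, n_sets=5)
print(centres)   # [ 0.   2.5  5.   7.5 10. ]
print(mu(3.0))   # [0.  0.8 0.2 0.  0. ]
```

    A fuzzy tree splitting on this variable would typically route a sample such as x = 3.0 down two branches with degrees 0.8 and 0.2, and a forest of such trees can aggregate the resulting class memberships across its members.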

    Fuzzy Decision Tree-based Inference System for Liver Disease Diagnosis

    Medical diagnosis can be challenging because of a number of factors. Uncertainty in the diagnostic process arises from inaccuracy in the measurement of patient attributes, missing attribute data, and limitations in the medical expert's ability to define cause-and-effect relationships when there are multiple interrelated variables. Given this situation, a decision support system that can help doctors reach a more reliable diagnosis has considerable potential. Decision trees are used in data mining for classification and regression; they are simple to understand and interpret, as they can be visualized. However, one disadvantage of decision tree algorithms is that they deal only with crisp, exact data values. Fuzzy logic is used to describe and formalize fuzzy or inexact information and to perform reasoning with such information. Although both decision trees and fuzzy rule-based systems have been used for medical diagnosis, there have been few attempts to use fuzzy decision trees in combination with fuzzy rules. This study explored the application of fuzzy logic to help diagnose liver diseases based on blood test results. In this project, inference systems aimed at classifying patient data using a fuzzy decision tree and a fuzzy rule-based system were designed and implemented. The fuzzy decision tree was used to generate the rules that formed the rule-base of the diagnostic inference system. Results from this study indicate that, for the specific patient data set used in this experiment, the fuzzy decision tree-based inference outperformed both the crisp decision tree and the fuzzy rule-based inference in classification accuracy.
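    The inference step performed by such a rule-base can be sketched as follows: each rule's firing strength is a t-norm (here, minimum) over its antecedent membership degrees, and strengths are aggregated per class by maximum. The membership functions and rules below are hand-made placeholders, not the ones learned from the patient data in this study:

```python
# Illustrative fuzzy rule-based classification with placeholder rules.
def firing_strength(antecedents, sample):
    """Minimum t-norm over 'attribute IS fuzzy_set' antecedents."""
    return min(mu(sample[attr]) for attr, mu in antecedents)


def classify(rules, sample):
    """rules: list of (antecedents, class_label); max aggregation per class."""
    strengths = {}
    for antecedents, label in rules:
        s = firing_strength(antecedents, sample)
        strengths[label] = max(strengths.get(label, 0.0), s)
    return max(strengths, key=strengths.get)


# Placeholder membership functions for total bilirubin (mg/dL).
def high_bilirubin(v):
    return min(max(v - 1.0, 0.0), 1.0)       # 0 at <= 1.0, rising to 1 at >= 2.0


def normal_bilirubin(v):
    return min(max(2.0 - v, 0.0), 1.0)       # 1 at <= 1.0, falling to 0 at >= 2.0


rules = [
    ([("bilirubin", high_bilirubin)], "liver disease"),
    ([("bilirubin", normal_bilirubin)], "healthy"),
]
print(classify(rules, {"bilirubin": 1.8}))    # liver disease (strength 0.8 vs 0.2)
```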

    Z-number-valued rule-based decision trees

    As a novel architecture of a fuzzy decision tree constructed on fuzzy rules, the fuzzy rule-based decision tree (FRDT) achieved better performance, in terms of both classification accuracy and the size of the resulting decision tree, than classical decision trees such as C4.5, LADtree, BFtree, SimpleCart and NBTree. The concept of a Z-number extends the classical fuzzy number to model information that is both uncertain and only partially reliable. Z-numbers have significant potential in rule-based systems due to their strong representation capability. This paper designs a Z-number-valued rule-based decision tree (ZRDT) and provides its learning algorithm. First, information gain replaces the fuzzy confidence used in FRDT to select the features of each rule. Second, negative samples are used to generate the second fuzzy numbers, which adjust the first fuzzy numbers and improve the model's fit to the training data. The proposed ZRDT is compared with the FRDT under three different parameter values, two classical decision trees, PUBLIC and C4.5, and a decision tree ensemble method, AdaBoost.NC, in terms of classification performance and the size of the produced decision trees. Based on statistical tests, the proposed ZRDT achieves the highest classification performance while producing the smallest decision trees.
    Funding: project B-TIC-590-UGR20, Programa Operativo FEDER 2014-2020; Regional Ministry of Economy, Knowledge, Enterprise and Universities (CECEU) of Andalusia; China Scholarship Council (CSC) (202106070037); project PID2019-103880RB-I00, MCIN/AEI/10.13039/501100011033; Andalusian government through project P20_0067.
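    A Z-number pairs a fuzzy restriction A on a variable with a second fuzzy number B describing how reliable that restriction is. The sketch below only shows this representation and one very simple way of letting B weight the membership in A; that weighting is an assumption for illustration and differs from the ZRDT learning procedure summarised above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFN:
    """Triangular fuzzy number with support [a, c] and peak at b."""
    a: float
    b: float
    c: float

    def mu(self, x: float) -> float:
        if x <= self.a or x >= self.c:
            return 1.0 if x == self.b else 0.0
        if x <= self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.c - x) / (self.c - self.b)

    def centroid(self) -> float:
        return (self.a + self.b + self.c) / 3.0


@dataclass(frozen=True)
class ZNumber:
    """Z = (A, B): A restricts the variable, B describes the reliability of A."""
    A: TriangularFN
    B: TriangularFN   # defined on [0, 1]

    def weighted_mu(self, x: float) -> float:
        # Crude illustration: scale membership in A by the centroid of B.
        # The ZRDT's treatment of the second fuzzy numbers is more involved.
        return self.A.mu(x) * self.B.centroid()


# "The attribute value is about 40, and this restriction is usually reliable."
z = ZNumber(A=TriangularFN(30, 40, 50), B=TriangularFN(0.6, 0.8, 1.0))
print(z.weighted_mu(42))   # roughly 0.8 * 0.8 = 0.64
```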

    Fuzzy-Rough Feature Significance for Fuzzy Decision Trees

    Crisp decision trees are one of the most popular classification algorithms in current use within data mining and machine learning. However, although they possess many desirable features, they lack the ability to model vagueness. As a result of this, the induction of fuzzy decision trees (FDTs) has become an area of much interest. One important aspect of tree induction is the choice of feature at each stage of construction. If weak features are selected, the resulting decision tree will be meaningless and will exhibit poor performance. This paper introduces a new measure of feature significance based on fuzzy-rough sets for use within fuzzy ID3. The measure is experimentally compared with leading feature rankers, and is also compared with traditional fuzzy entropy for fuzzy tree induction.
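    The fuzzy-rough dependency degree on which such a significance measure is typically built can be sketched as below, using the min t-norm over per-attribute similarity relations and the Kleene-Dienes implicator for the lower approximation; the exact formulation used in the paper may differ:

```python
import numpy as np


def similarity(values: np.ndarray) -> np.ndarray:
    """Fuzzy similarity relation for one numeric attribute (one common choice)."""
    rng = values.max() - values.min()
    if rng == 0:
        return np.ones((len(values), len(values)))
    diff = np.abs(values[:, None] - values[None, :])
    return 1.0 - diff / rng


def dependency(X: np.ndarray, y: np.ndarray, attrs) -> float:
    """Fuzzy-rough dependency of the crisp decision y on the attribute subset."""
    n = len(y)
    R = np.ones((n, n))
    for a in attrs:                              # min t-norm over the attributes
        R = np.minimum(R, similarity(X[:, a]))
    same_class = (y[:, None] == y[None, :]).astype(float)
    # Lower approximation of each sample's own class with the Kleene-Dienes
    # implicator: inf over y of max(1 - R(x, y), class_membership(y)).
    lower = np.min(np.maximum(1.0 - R, same_class), axis=1)
    return lower.sum() / n                       # positive region averaged over U


def significance(X, y, attrs, candidate) -> float:
    """Increase in dependency obtained by adding `candidate` to `attrs`."""
    return dependency(X, y, list(attrs) + [candidate]) - dependency(X, y, attrs)
```

    Within fuzzy ID3, a measure of this kind can replace, or complement, fuzzy entropy when ranking candidate features at a node.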

    On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

    We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of the training data. A two-step process is applied: 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform; since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite this transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as the quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves.
    Note: appeared in the 2018 IEEE International Congress on Big Data (BigData Congress).
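    Under the assumptions stated above (empirical q-quantiles approximate the CDF, triangular sets are placed uniformly in the transformed space), the key mapping can be sketched as follows. The sketch only maps the set cores back to the original attribute scale, whereas the approach described above applies the inverse CDF to the full membership function definitions:

```python
import numpy as np


def pit_partition_cores(train_values: np.ndarray, n_sets: int, q: int = 100):
    """Place the cores of a uniform triangular partition in CDF space and map
    them back to the original attribute space via the empirical quantile
    function (inverse CDF)."""
    # 1) Probability integral transform, approximated with q-quantiles.
    probs = np.linspace(0.0, 1.0, q + 1)
    quantiles = np.quantile(train_values, probs)    # empirical inverse CDF on a grid

    # 2) Uniformly spaced cores of a Ruspini partition in the unit interval.
    uniform_cores = np.linspace(0.0, 1.0, n_sets)

    # 3) Back to the original space by interpolating the quantile function.
    return np.interp(uniform_cores, probs, quantiles)


# A skewed attribute: equally spaced cores in CDF space become data-driven,
# unequally spaced cores in the original attribute space.
values = np.random.default_rng(0).exponential(scale=2.0, size=10_000)
print(pit_partition_cores(values, n_sets=5))
```

    Building triangles whose supports reach the neighbouring cores preserves the strong-partition property in the original space, although the shapes recovered by mapping the whole membership functions through the inverse CDF are generally not triangular.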

    On Fuzzy Concepts

    In this paper we try to combine two approaches. One is the theory of knowledge graphs, in which concepts are represented by graphs. The other is the axiomatic theory of fuzzy sets (AFS). The discussion will focus on the idea of a fuzzy concept. It will be argued that the fuzziness of a concept in natural language is mainly due to the differences in interpretation that people give to a certain word. As different interpretations lead to different knowledge graphs, the notion of a fuzzy concept should be describable in terms of sets of graphs. This leads to a natural introduction of membership values for elements of graphs. Using these membership values, we apply AFS theory, as well as an alternative approach, to calculate fuzzy decision trees that can be used to determine the most relevant elements of a concept.
    • …