
    Incremental Local Linear Fuzzy Classifier in Fisher Space

    Optimizing the antecedent part of neurofuzzy systems is an active research topic, for which different approaches have been developed. However, current approaches typically suffer from high computational complexity or an inability to extract knowledge from a given set of training data. In this paper, we introduce a novel incremental training algorithm for the class of neurofuzzy systems that are structured based on local linear classifiers. Linear discriminant analysis is used to transform the data into a space in which the linear discriminability of the training samples is maximized. The neurofuzzy classifier is then built in the transformed space, starting from the simplest form (a global linear classifier). If the overall performance of the classifier is not satisfactory, it is iteratively refined by incorporating additional local classifiers. In addition, rule consequent parameters are optimized using a local least-squares approach. Our refinement strategy is motivated by LOLIMOT, a greedy partitioning algorithm for structure training that has been successfully applied to a number of identification problems. The proposed classifier is compared with several benchmark classifiers on a number of well-known datasets. The results demonstrate the efficacy of the proposed classifier in achieving high performance while incurring low computational effort.
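The transformation step described in this abstract can be illustrated with a minimal sketch of a two-class Fisher discriminant in two dimensions. This is a generic textbook construction, not the authors' algorithm: the function names (`fisher_direction`, `project`) and the toy data are assumptions made for illustration, and the incremental LOLIMOT-style refinement is not shown.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, m):
    """2x2 scatter matrix of the points around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Fisher direction w = Sw^{-1} (m1 - m0) for two classes in 2-D."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Apply the closed-form inverse of a 2x2 matrix to diff.
    return [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]


def project(points, w):
    """Project each point onto the direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical toy data: two well-separated clusters.
class0 = [(0, 0), (1, 0), (0, 1), (1, 1)]
class1 = [(3, 3), (4, 3), (3, 4), (4, 4)]
w = fisher_direction(class0, class1)
```

In the projected space a single threshold separates the two toy clusters, which corresponds to the "global linear classifier" starting point mentioned in the abstract.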

    Z-number-valued rule-based decision trees

    As a novel fuzzy decision tree architecture built on fuzzy rules, the fuzzy rule-based decision tree (FRDT) achieved better performance, in terms of both classification accuracy and the size of the resulting decision tree, than classical decision trees such as C4.5, LADtree, BFtree, SimpleCart and NBTree. The concept of the Z-number extends the classical fuzzy number to model information that is both uncertain and only partially reliable. Z-numbers have significant potential in rule-based systems due to their strong representation capability. This paper designs a Z-number-valued rule-based decision tree (ZRDT) and provides its learning algorithm. First, information gain replaces the fuzzy confidence used in FRDT to select the features in each rule. Additionally, negative samples are used to generate the second fuzzy numbers, which adjust the first fuzzy numbers and improve the model’s fit to the training data. The proposed ZRDT is compared with the FRDT under three different parameter values, with two classical decision trees (PUBLIC and C4.5), and with a decision tree ensemble method (AdaBoost.NC), in terms of classification performance and decision tree size. Based on statistical tests, the proposed ZRDT has the highest classification performance with the smallest size for the produced decision tree. Funding: project B-TIC-590-UGR20; Programa Operativo FEDER 2014-2020; Regional Ministry of Economy, Knowledge, Enterprise and Universities (CECEU) of Andalusia; China Scholarship Council (CSC) (202106070037); project PID2019-103880RB-I00; MCIN/AEI/10.13039/501100011033; Andalusian government through project P20_0067.
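The feature-selection criterion named in this abstract, information gain, can be sketched in its classical crisp form as follows. This is an assumption-laden illustration: the abstract does not say how ZRDT computes the gain over fuzzy sets, so the sketch below uses ordinary categorical feature values, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = ["a", "a", "b", "b"]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]
gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

A rule learner built on this criterion would prefer the feature with the larger gain when choosing which attribute to add to a rule.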

    A survey of cost-sensitive decision tree induction algorithms

    The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, that use genetic algorithms, that use anytime methods, and that utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, and provides a useful taxonomy, a historical timeline of how the field has developed, and a reference point for future research in this field.
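To make the notion of misclassification cost concrete, the following minimal sketch picks the class with the lowest expected cost at a leaf, given class probabilities and a cost matrix. It is a generic decision rule, not any specific algorithm from the survey; the function name, the cost values, and the probabilities are assumptions chosen for illustration, and feature acquisition costs are not modelled.

```python
def min_expected_cost_class(probs, cost):
    """Return (best_class, expected_costs).

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j
    """
    expected = [
        sum(cost[i][j] * probs[j] for j in range(len(probs)))
        for i in range(len(cost))
    ]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected


# Hypothetical leaf where class 0 is more likely than class 1.
probs = [0.7, 0.3]

# With equal error costs, the most probable class is predicted.
uniform_cost = [[0, 1], [1, 0]]
best_uniform, _ = min_expected_cost_class(probs, uniform_cost)

# If predicting class 0 for a true class 1 is ten times as costly,
# the minimum-cost prediction changes.
skewed_cost = [[0, 10], [1, 0]]
best_skewed, expected_skewed = min_expected_cost_class(probs, skewed_cost)
```

The example shows why an accuracy-based tree and a cost-sensitive tree can label the same leaf differently once error costs are unequal.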

    Data Mining and Machine Learning in Astronomy

    We review the current state of data mining and machine learning in astronomy. 'Data Mining' can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black-box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those where data mining techniques directly resulted in improved science, and important current and future directions, including probability density functions, parallel algorithms, petascale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm, and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box. Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra figures, some minor additions to the tex

    Classification of Explainable Artificial Intelligence Methods through Their Output Formats

    Machine and deep learning have proven their utility in generating data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension: the output formats. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”, “explainable machine learning”, and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliography of these articles. The addition of the dimension of the explanation format makes the proposed classification system a practical tool for scholars, helping them select the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, hence the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the formats of explanations and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields, and the new regulation

    Projection pursuit random forest using discriminant feature analysis model for churners prediction in telecom industry

    A major and demanding issue in the telecommunications industry is the prediction of customer churn. Churn describes customers who leave one telecom service provider for a competitor in search of better service offers. Companies in the telco sector frequently have customer relationship management offices whose main objective is to win back defecting clients, because preserving long-term customers can be much more beneficial to a company than gaining new ones. Researchers and practitioners are paying great attention to, and investing more in, developing robust customer churn prediction models, especially in the telecommunications business, and have proposed numerous machine learning approaches. Many classification approaches have been established, but the most effective in recent times are tree-based methods. The main contribution of this research is to predict churners and non-churners in the telecom sector using a projection pursuit Random Forest (PPForest) that uses discriminant feature analysis as a novel extension of the conventional Random Forest approach for learning oblique projection pursuit trees (PPtree). The proposed methodology leverages two discriminant analysis methods to calculate the projection index used in the construction of the PPtree. The first method uses Support Vector Machines (SVM) as a classifier in the construction of PPForest to differentiate between churners and non-churners. The second method uses Linear Discriminant Analysis (LDA) to achieve linear splitting of variables at each node during oblique PPtree construction, producing individual classifiers that are more robust and more diverse than those of the classical Random Forest. The proposed methods were found to achieve the best results on performance measures such as accuracy, hit rate, ROC curve, Gini coefficient, Kolmogorov-Smirnov statistic, lift coefficient, H-measure and AUC. Moreover, PPForest based on the direct application of LDA to the raw data delivers an effective evaluator for the customer churn prediction model.