Incremental Local Linear Fuzzy Classifier in Fisher Space
Optimizing the antecedent part of neurofuzzy systems is an active research topic, for which different approaches have been developed. However, current approaches typically suffer from high computational complexity or an inability to extract knowledge from a given set of training data. In this paper, we introduce a novel incremental training algorithm for the class of neurofuzzy systems that are structured based on local linear classifiers. Linear discriminant analysis is utilized to transform the data into a space in which the linear discriminability of the training samples is maximized. The neurofuzzy classifier is then built in the transformed space, starting from the simplest form (a global linear classifier). If the overall performance of the classifier is not satisfactory, it is iteratively refined by incorporating additional local classifiers. In addition, rule consequent parameters are optimized using a local least-squares approach. Our refinement strategy is motivated by LOLIMOT, a greedy partitioning algorithm for structure training that has been successfully applied to a number of identification problems. The proposed classifier is compared to several benchmark classifiers on a number of well-known datasets. The results demonstrate the efficacy of the proposed classifier in achieving high performance while incurring low computational effort.
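The transformation step mentioned in the abstract above can be illustrated with a minimal sketch of a two-class Fisher discriminant in two dimensions. This is not the paper's algorithm: the toy points, the function names `fisher_direction` and `project`, and the restriction to two classes and two features are all assumptions made for illustration, and the incremental local refinement (LOLIMOT-style partitioning) is not shown.

```python
def fisher_direction(class_a, class_b):
    """Return the 2-D Fisher direction w = Sw^-1 (mean_b - mean_a)."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries of the symmetric 2x2 scatter matrix around the class mean.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mean_a, mean_b = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, mean_a)
    bxx, bxy, byy = scatter(class_b, mean_b)
    # Within-class scatter Sw is the sum of the per-class scatter matrices.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    # Closed-form inverse of a symmetric 2x2 matrix applied to the mean difference.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


def project(points, w):
    """Project 2-D points onto the direction w (a 1-D Fisher space)."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical toy data, chosen only so the two classes are linearly separable.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 7.0), (7.0, 6.0)]

w = fisher_direction(class_a, class_b)
proj_a = project(class_a, w)
proj_b = project(class_b, w)
```

In the projected space a single threshold between `max(proj_a)` and `min(proj_b)` separates the toy classes, which corresponds to the "global linear classifier" starting point; a classifier of the kind described would add local models only where such a global boundary performs poorly.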
Z-number-valued rule-based decision trees
As a novel architecture of a fuzzy decision tree constructed on fuzzy rules, the fuzzy rule-based
decision tree (FRDT) achieved better performance in terms of both classification accuracy and the
size of the resulting decision tree than other classical decision trees such as C4.5, LADtree, BFtree,
SimpleCart and NBTree. The concept of Z-number extends the classical fuzzy number to model
both uncertain and partially reliable information. Z-numbers have significant potential in rule-based
systems due to their strong representation capability. This paper designs a Z-number-valued rule-based
decision tree (ZRDT) and provides the learning algorithm. Firstly, the information gain is
used to replace the fuzzy confidence in FRDT to select features in each rule. Additionally, we use
the negative samples to generate the second fuzzy numbers that adjust the first fuzzy numbers
and improve the model’s fit to the training data. The proposed ZRDT is compared with the FRDT
with three different parameter values and two classical decision trees, PUBLIC and C4.5, and a
decision tree ensemble method, AdaBoost.NC, in terms of classification performance and size of decision
trees. Based on statistical tests, the proposed ZRDT has the highest classification performance
with the smallest size for the produced decision tree.
The project B-TIC-590-UGR20; Programa Operativo FEDER 2014-2020; Regional Ministry of Economy, Knowledge, Enterprise and Universities (CECEU) of Andalusia; China Scholarship Council (CSC) (202106070037); Project PID2019-103880RB-I00; MCIN/AEI/10.13039/501100011033; Andalusian government through project P20_0067
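The abstract above states that information gain is used to select features in each rule. As a rough illustration of that criterion only, here is a minimal sketch of classical information gain on crisp (non-fuzzy) categorical features. The crisp setting, the toy data, and the names `entropy` and `information_gain` are assumptions; the paper's Z-number-valued formulation is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting the labels on one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: feature_a separates the classes, feature_b does not.
labels = ["pos", "pos", "neg", "neg"]
feature_a = ["low", "low", "high", "high"]
feature_b = ["x", "y", "x", "y"]

gain_a = information_gain(feature_a, labels)
gain_b = information_gain(feature_b, labels)
```

A selection step based on this criterion would prefer `feature_a`, since splitting on it removes all label uncertainty in the toy data while `feature_b` removes none.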
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of costs of misclassification and costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.
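To make the notion of misclassification cost concrete, the following minimal sketch labels a single tree leaf by minimum expected cost rather than by majority class. It is not taken from any algorithm in the survey: the class names, the probabilities, the cost values, and the function name `min_cost_label` are hypothetical, and feature acquisition costs are ignored.

```python
def min_cost_label(class_probs, cost):
    """Pick the prediction with the lowest expected misclassification cost.

    class_probs maps each class to its estimated probability at a leaf.
    cost[actual][predicted] is the cost of predicting `predicted`
    when the true class is `actual`.
    """
    expected = {
        predicted: sum(p * cost[actual][predicted] for actual, p in class_probs.items())
        for predicted in class_probs
    }
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical leaf: most samples are "healthy", but missing a "sick" case is costly.
leaf_probs = {"healthy": 0.8, "sick": 0.2}
cost_matrix = {
    "healthy": {"healthy": 0.0, "sick": 1.0},   # false alarm
    "sick": {"healthy": 10.0, "sick": 0.0},     # missed case
}

label, expected_costs = min_cost_label(leaf_probs, cost_matrix)
```

With these assumed costs the leaf is labelled "sick" even though "healthy" is the majority class, which is the kind of behaviour that distinguishes cost-sensitive induction from accuracy-based induction.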
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much the powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the tex
Classification of Explainable Artificial Intelligence Methods through Their Output Formats
Machine and deep learning have proven their utility to generate data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension—the output formats. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”; “explainable machine learning”; and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliography of these articles. The addition of the dimension of the explanation format makes the proposed classification system a practical tool for scholars, supporting them to select the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet the requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, thus the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the formats of explanations and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields, and the new regulation
Projection pursuit random forest using discriminant feature analysis model for churners prediction in telecom industry
A major and demanding issue in the telecommunications industry is the prediction of customer churn. Churn describes customers who leave one telecom service provider for a competitor in search of better service offers. Telecom companies frequently have customer relationship management offices whose main objective is to win back defecting clients, because preserving long-term customers can be much more beneficial to a company than gaining new ones. Researchers and practitioners are paying great attention to, and investing more in, developing robust customer churn prediction models, especially in the telecommunication business, and have proposed numerous machine learning approaches. Many classification approaches have been established, but the most effective in recent times are tree-based methods. The main contribution of this research is to predict churners and non-churners in the telecom sector based on projection pursuit Random Forest (PPForest), which uses discriminant feature analysis as a novel extension of the conventional Random Forest approach for learning oblique projection pursuit trees (PPtree). The proposed methodology leverages two discriminant analysis methods to calculate the projection index used in the construction of PPtree. The first method uses Support Vector Machines (SVM) as a classifier in the construction of PPForest to differentiate between churners and non-churners. The second method uses Linear Discriminant Analysis (LDA) to achieve linear splitting of variables at each node during oblique PPtree construction, producing individual classifiers that are robust and more diverse than those of the classical Random Forest. The proposed methods were found to achieve the best performance on measures such as accuracy, hit rate, ROC curve, Gini coefficient, Kolmogorov-Smirnov statistic, lift coefficient, H-measure, and AUC.
Moreover, PPForest based on direct application of LDA to the raw data delivers an effective evaluator for the customer churn prediction model.
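Two of the evaluation measures named above, AUC and the Gini coefficient, are closely related, and a minimal sketch may help show how they are computed from churn scores. The scores and labels below are made up for illustration, the function name `auc` is hypothetical, and the pairwise computation is a simple approach suited to small samples rather than the evaluation code used in the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels use 1 for churners and 0 for non-churners; a tie between a
    churner and a non-churner counts as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical churn scores from some classifier (higher means more likely to churn).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

auc_value = auc(scores, labels)
# The Gini coefficient used in scoring contexts is a linear rescaling of AUC.
gini = 2.0 * auc_value - 1.0
```

Because Gini is a rescaling of AUC, the two measures rank classifiers identically; reporting both, as in the abstract's list, mainly serves readers used to one convention or the other.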