83 research outputs found

TE2Rules: Extracting Rule Lists from Tree Ensembles

Authors: Chen Xiaotong; Lal G Roshan; Mithal Varun
Publication date: 01/11/2022

Tree Ensemble (TE) models (e.g. Gradient Boosted Trees and Random Forests) often provide higher prediction performance than single decision trees. However, TE models generally lack transparency and interpretability, as humans have difficulty understanding their decision logic. This paper presents a novel approach to convert a TE trained for a binary classification task into a rule list (RL) that is globally equivalent to the TE and comprehensible to a human. This RL captures all necessary and sufficient conditions for decision making by the TE. Experiments on benchmark datasets demonstrate that, compared to state-of-the-art methods, (i) predictions from the RL generated by TE2Rules have high fidelity with respect to the original TE, (ii) the RL from TE2Rules has high interpretability, measured by the number and the length of the decision rules, (iii) the run-time of the TE2Rules algorithm can be reduced significantly at the cost of slightly lower fidelity, and (iv) the RL is a fast alternative to the state-of-the-art rule-based instance-level outcome explanation techniques.
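
The two evaluation measures named in this abstract, fidelity to the original model and rule-list size, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the TE2Rules API: the rule representation (a list of `(feature, operator, threshold)` conditions), the convention that the rule list predicts the positive class when any rule fires, the function names, and the toy data.

```python
import operator

# Hypothetical rule format: a rule is a list of (feature, op, threshold) conditions.
OPS = {"<=": operator.le, ">": operator.gt}

def rule_fires(rule, x):
    """A rule fires when every one of its conditions holds for instance x."""
    return all(OPS[op](x[feat], thr) for feat, op, thr in rule)

def rule_list_predict(rules, x):
    """Assumed convention: predict class 1 if any rule fires, else class 0."""
    return 1 if any(rule_fires(r, x) for r in rules) else 0

def fidelity(rules, X, model_preds):
    """Fraction of instances where the rule list agrees with the model's prediction."""
    agree = sum(rule_list_predict(rules, x) == p for x, p in zip(X, model_preds))
    return agree / len(X)

def complexity(rules):
    """Number of rules and the length (condition count) of the longest rule."""
    return len(rules), max(len(r) for r in rules)

# Toy data, invented for the example.
rules = [
    [("age", ">", 30), ("income", ">", 50)],
    [("age", "<=", 20)],
]
X = [
    {"age": 35, "income": 60},
    {"age": 25, "income": 80},
    {"age": 18, "income": 10},
    {"age": 40, "income": 20},
]
model_preds = [1, 0, 1, 1]  # stand-in for predictions of a trained tree ensemble

print(fidelity(rules, X, model_preds))  # 0.75
print(complexity(rules))                # (2, 2)
```

In this toy case the rule list disagrees with the model only on the last instance, so fidelity is 3 out of 4.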

arXiv.org e-Print Archive

A comparison among interpretative proposals for Random Forests

Authors: Aria Massimo; Cuccurullo Corrado; Gnasso Agostino
Publication venue: Elsevier BV
Publication date: 01/01/2021

The growing success of Machine Learning (ML) is bringing significant improvements to predictive models, facilitating their integration in various application fields. Despite this success, there are some limitations and disadvantages: the most significant is the lack of interpretability, which prevents users from understanding how particular decisions are made. Our study focuses on one of the best performing and most used models in the Machine Learning framework, the Random Forest model. It is known as an efficient ensemble learning model, as it ensures high predictive precision, flexibility, and immediacy; its construction process is recognized as intuitive and understandable, but it is also considered a Black Box model due to the large number of deep decision trees produced within it. The aim of this research is twofold. We present a survey of interpretative proposals for Random Forest, and then we perform a machine learning experiment comparing two methodologies, inTrees and NodeHarvest, that represent the main approaches in the rule extraction framework. The proposed experiment compares the methods' performance on six real datasets covering different data characteristics: number of observations, balanced/unbalanced response, and the presence of categorical and numerical predictors. This study contributes a review of the methods and tools proposed for ensemble tree interpretation and identifies, in the class of rule extraction approaches, the best proposal.

Archivio della ricerca - Università degli studi di Napoli Federico II

Directory of Open Access Journals

Open Access Repository

How can we explain Random Forests in a spatial framework?

Author: Golini Natalia
Publication venue: Pearson
Publication date: 01/01/2023

Institutional Research Information System University of Turin

ASSOCIATION RULES IN RANDOM FOREST FOR THE MOST INTERPRETABLE MODEL

Authors: Ilma Hafizah; Notodiputro Khairil Anwar; Sartono Bagus
Publication venue: Universitas Pattimura
Publication date: 16/04/2023

Random forest is one of the most popular ensemble methods and has many advantages. However, random forest is a "black-box" model, so the model is difficult to interpret. This study discusses the interpretation of random forest with the association rules technique, using rules extracted from each decision tree in the random forest model. This analysis involves simulation and empirical data, to determine the factors that affect the poverty status of households in Tasikmalaya. The empirical data was sourced from Badan Pusat Statistik (BPS), the National Socio-Economic Survey (SUSENAS) data for West Java Province in 2019. The results on simulation data show that the association rules technique can extract the set of rules that characterize the target variable. The application of interpretable random forest to empirical data shows that the rules that most distinguish the poverty status of households in Tasikmalaya are house wall materials and the main source of drinking water, house wall materials and cooking fuel, as well as house wall materials and motorcycle ownership.

OJS UNPATTI Publication Center (Universitas Pattimura)

Explaining Random Forest Predictions with Association Rules

Authors: Boström Henrik; Gurung Ram B.; Johansson Ulf; Lindgren Tony
Publication date: 13/03/2020

Random forests frequently achieve state-of-the-art predictive performance. However, the logic behind their predictions cannot be easily understood, since they are the result of averaging often hundreds or thousands of possibly conflicting individual predictions. Instead of presenting all the individual predictions, an alternative is proposed, by which the predictions are explained using association rules generated from itemsets representing paths in the trees of the forest. An empirical investigation is presented, in which alternative ways of generating the association rules are compared with respect to explainability, as measured by the fraction of predictions for which there is no applicable rule and by the fraction of predictions for which there is at least one applicable rule that conflicts with the forest prediction. For the considered datasets, it can be seen that most predictions can be explained by the discovered association rules, which have a high level of agreement with the underlying forest. The results do not single out a clear winner among the considered alternatives in terms of unexplained and disagreement rates, but show that they are associated with substantial differences in computational cost.
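
The two explainability measures described in this abstract, the fraction of predictions with no applicable rule and the fraction with at least one conflicting applicable rule, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the encoding of instances and rule antecedents as sets of items, the function name `explanation_rates`, and the toy data are all assumptions.

```python
def explanation_rates(rules, instances, forest_preds):
    """Return (unexplained_rate, disagreement_rate).

    rules: list of (antecedent, label), where antecedent is a frozenset of items.
    instances: list of frozensets of items describing each instance.
    forest_preds: the forest's predicted label for each instance.
    A rule is applicable when its antecedent is a subset of the instance's items.
    """
    unexplained = 0
    conflicting = 0
    for x, pred in zip(instances, forest_preds):
        labels = [label for antecedent, label in rules if antecedent <= x]
        if not labels:
            unexplained += 1   # no applicable rule at all
        elif any(label != pred for label in labels):
            conflicting += 1   # at least one applicable rule contradicts the forest
    n = len(instances)
    return unexplained / n, conflicting / n

# Toy data, invented for the example.
rules = [
    (frozenset({"a"}), "yes"),
    (frozenset({"b", "c"}), "no"),
]
instances = [
    frozenset({"a", "b"}),
    frozenset({"b", "c"}),
    frozenset({"d"}),
    frozenset({"a", "b", "c"}),
]
forest_preds = ["yes", "yes", "no", "yes"]

print(explanation_rates(rules, instances, forest_preds))  # (0.25, 0.5)
```

Here one of the four predictions has no applicable rule, and two have an applicable rule whose label differs from the forest's prediction.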

KITopen

Transparent computational intelligence models for pharmaceutical tableting process

Authors: Jachowicz Renata; Kazemi Pezhman; Khalid Mohammad Hassan; Mendyk Aleksander; Szlęk Jakub; Tuszyński Paweł K.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Purpose: The pharmaceutical industry is tightly regulated owing to health concerns. Over the years, the use of computational intelligence (CI) tools has increased in pharmaceutical research and development, manufacturing, and quality control. Quality characteristics of tablets, like tensile strength, are important indicators of expected tablet performance. Predictive, yet transparent, CI models can be analysed for insights into the formulation and development process. Methods: This work uses data from a galenical tableting study and computational intelligence methods like decision trees, random forests, fuzzy systems, artificial neural networks, and symbolic regression to establish models for the outcome of tensile strength. Data were divided into training and test folds according to a ten-fold cross-validation scheme, and RMSE was used as an evaluation metric. Tree-based ensembles and symbolic regression methods are presented as transparent models with extracted rules and mathematical formulae, respectively, explaining the CI models in greater detail. Results: CI models for tensile strength of tablets based on the formulation design and process parameters have been established. The best models exhibit a normalized RMSE of 7%. Rules from fuzzy systems and random forests are shown to increase the transparency of CI models. A mathematical formula generated by symbolic regression is presented as a transparent model. Conclusions: CI models explain the variation of tensile strength according to formulation and manufacturing process characteristics. CI models can be further analyzed to extract actionable knowledge, making the artificial learning process more transparent and acceptable for use in pharmaceutical quality and safety domains.

Springer - Publisher Connector

Jagiellonian Univeristy Repository

A Survey Of Methods For Explaining Black Box Models

Authors: Giannotti Fosca; Guidotti Riccardo; Monreale Anna; Pedreschi Dino; Ruggieri Salvatore; Turini Franco
Publication date: 01/01/2018

In recent years many accurate decision support systems have been constructed as black boxes, that is, as systems that hide their internal logic from the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, delineates explicitly or implicitly its own definition of interpretability and explanation. The aim of this paper is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many open research questions in perspective. Comment: This work is currently under review on an international journa

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Archivio della Ricerca - Università di Pisa