On Tackling Explanation Redundancy in Decision Trees
Decision trees (DTs) epitomize the ideal of interpretability of machine
learning (ML) models. The interpretability of decision trees motivates
explainability approaches by so-called intrinsic interpretability, and it is at
the core of recent proposals for applying interpretable ML models in high-risk
applications. The belief in DT interpretability is justified by the fact that
explanations for DT predictions are generally expected to be succinct. Indeed,
in the case of DTs, explanations correspond to DT paths. Since decision trees
are ideally shallow, and so paths contain far fewer features than the total
number of features, explanations in DTs are expected to be succinct, and hence
interpretable. This paper offers both theoretical and experimental arguments
demonstrating that, as long as the interpretability of decision trees is
equated with the succinctness of explanations, decision trees ought not be
deemed interpretable. The paper introduces logically rigorous path explanations and
path explanation redundancy, and proves that there exist functions for which
decision trees must exhibit paths with arbitrarily large explanation
redundancy. The paper also proves that only a very restricted class of
functions can be represented with DTs that exhibit no explanation redundancy.
In addition, the paper includes experimental results substantiating that path
explanation redundancy is observed ubiquitously, not only in decision trees
obtained with different tree learning algorithms, but also in a wide range of
publicly available decision trees. The paper also proposes
polynomial-time algorithms for eliminating path explanation redundancy, which
in practice require negligible time to compute. Thus, these algorithms serve to
indirectly attain irreducible, and so succinct, explanations for decision
trees.
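To make the redundancy-elimination idea concrete, the following is a minimal Python sketch, not the paper's algorithm: a deletion-based pass over the root-to-leaf path of a scikit-learn decision tree, where each path condition is dropped in turn and kept out only if every leaf consistent with the remaining conditions still yields the same class. The helper names (path_conditions, entails, reduce_path_explanation) are illustrative.

```python
import math
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

def path_conditions(tree, x):
    """Conditions (feature, threshold, is_leq) on x's root-to-leaf path, plus the class."""
    t, node, conds = tree.tree_, 0, []
    while t.children_left[node] != -1:                 # stop at a leaf
        f, thr = t.feature[node], t.threshold[node]
        is_leq = x[f] <= thr
        conds.append((f, thr, is_leq))
        node = t.children_left[node] if is_leq else t.children_right[node]
    return conds, int(np.argmax(t.value[node]))

def entails(tree, conds, target):
    """True iff every leaf consistent with the kept conditions predicts `target`."""
    t = tree.tree_
    ub, lb = {}, {}                                    # per-feature bounds from kept conditions
    for f, thr, is_leq in conds:
        if is_leq:
            ub[f] = min(ub.get(f, math.inf), thr)
        else:
            lb[f] = max(lb.get(f, -math.inf), thr)
    stack = [(0, dict(ub), dict(lb))]
    while stack:
        node, u, l = stack.pop()
        if t.children_left[node] == -1:                # leaf: must agree with the target class
            if int(np.argmax(t.value[node])) != target:
                return False
            continue
        f, thr = t.feature[node], t.threshold[node]
        if l.get(f, -math.inf) < thr:                  # left branch (x[f] <= thr) still feasible
            stack.append((t.children_left[node], {**u, f: min(u.get(f, math.inf), thr)}, l))
        if u.get(f, math.inf) > thr:                   # right branch (x[f] > thr) still feasible
            stack.append((t.children_right[node], u, {**l, f: max(l.get(f, -math.inf), thr)}))
    return True

def reduce_path_explanation(tree, x):
    """Deletion-based pass: drop each path condition that the others make redundant."""
    conds, target = path_conditions(tree, x)
    i = 0
    while i < len(conds):
        trial = conds[:i] + conds[i + 1:]
        if entails(tree, trial, target):
            conds = trial                              # condition i was redundant
        else:
            i += 1
    return conds, target

# tiny usage example on Iris
X, y = load_iris(return_X_y=True)
dt = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X, y)
explanation, cls = reduce_path_explanation(dt, X[0])
print(cls, explanation)
```

Each entailment check is a single traversal of the tree that tracks per-feature bounds, and the pass performs one check per path condition, so the whole reduction runs in time polynomial in the tree size, in line with the polynomial-time claim of the abstract.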
Delivering Inflated Explanations
In the quest for Explainable Artificial Intelligence (XAI) one of the
questions that frequently arises given a decision made by an AI system is,
``why was the decision made in this way?'' Formal approaches to explainability
build a formal model of the AI system and use this to reason about the
properties of the system. Given a set of feature values for an instance to be
explained, and a resulting decision, a formal abductive explanation is a set of
features such that, if they take the given values, the decision is always the
same. This explanation is useful: it shows that only some features were used in
making the final decision. But it is also narrow: it only shows that the
decision is unchanged if the selected features take their given values. It is
possible that some features could change value and still lead to the same
decision. In this paper we formally define inflated explanations, which consist
of a set of features and, for each feature, a set of values (always including
the value of the instance being explained), such that the decision remains unchanged.
Inflated explanations are more informative than abductive explanations since,
for example, they allow us to see whether the exact value of a feature is
important or whether it could be any nearby value. Overall, they allow us to
better understand the role of each feature in the decision. We show that
inflated explanations can be computed at not much greater cost than abductive
explanations, and that duality results for abductive explanations extend to
inflated explanations.
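As a concrete illustration of the definition only (not the paper's method), the sketch below inflates an abductive explanation of a tiny hand-written categorical classifier by greedily growing each feature's allowed value set, keeping a value only when every completion of the remaining features preserves the decision. The domains, classify rule, and helper names are invented for the example.

```python
from itertools import product

# Invented toy domains and decision rule, standing in for an ML classifier.
DOMAINS = {
    "age": ("young", "mid", "old"),
    "income": ("low", "medium", "high"),
    "student": ("no", "yes"),
}

def classify(inst):
    return "approve" if inst["income"] != "low" or inst["student"] == "yes" else "reject"

def always_decides(allowed, target):
    """True iff every instance whose values lie in `allowed` (free features range
    over their full domain) receives the decision `target`."""
    feats = list(DOMAINS)
    for combo in product(*(allowed.get(f, DOMAINS[f]) for f in feats)):
        if classify(dict(zip(feats, combo))) != target:
            return False
    return True

def inflate(instance, abductive_features, target):
    """Grow each explanation feature's allowed set, starting from the instance value."""
    allowed = {f: {instance[f]} for f in abductive_features}
    for f in abductive_features:
        for v in DOMAINS[f]:
            if v in allowed[f]:
                continue
            trial = {**allowed, f: allowed[f] | {v}}
            if always_decides(trial, target):          # decision preserved: keep the value
                allowed = trial
    return allowed

inst = {"age": "young", "income": "high", "student": "no"}
print(inflate(inst, ["income"], classify(inst)))
# e.g. {'income': {'high', 'medium'}}: the exact value is not needed, only that income is not "low"
```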
On Computing Probabilistic Abductive Explanations
The most widely studied explainable AI (XAI) approaches are unsound. This is
the case with well-known model-agnostic explanation approaches, and it is also
the case with approaches based on saliency maps. One solution is to consider
intrinsic interpretability, which does not exhibit the drawback of unsoundness.
Unfortunately, intrinsic interpretability can display unwieldy explanation
redundancy. Formal explainability represents the alternative to these
non-rigorous approaches, with one example being PI-explanations. Unfortunately,
PI-explanations also exhibit important drawbacks, the most visible of which is
arguably their size. Recently, it has been observed that the (absolute) rigor
of PI-explanations can be traded off for a smaller explanation size, by
computing the so-called relevant sets. Given some positive δ, a set S of
features is δ-relevant if, when the features in S are fixed, the probability of
getting the target class exceeds δ. However, even for
very simple classifiers, the complexity of computing relevant sets of features
is prohibitive, with the decision problem being NP^PP-complete for circuit-based
classifiers. In contrast with earlier negative results, this paper investigates
practical approaches for computing relevant sets for a number of widely used
classifiers that include Decision Trees (DTs), Naive Bayes Classifiers (NBCs),
and several families of classifiers obtained from propositional languages.
Moreover, the paper shows that, in practice, and for these families of
classifiers, relevant sets are easy to compute. Furthermore, the experiments
confirm that succinct sets of relevant features can be obtained for the
families of classifiers considered.
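The δ-relevant-set definition can be made concrete with a small brute-force sketch: on an invented toy classifier, the probability of the target class under uniformly distributed free features is enumerated directly, which is exactly the exponential computation that dedicated algorithms aim to avoid. The domains, classifier, and helper names below are illustrative assumptions.

```python
from itertools import product

# Invented toy setup: three binary features, classifier is their majority vote.
DOMAINS = {"a": (0, 1), "b": (0, 1), "c": (0, 1)}

def classify(inst):
    return int(inst["a"] + inst["b"] + inst["c"] >= 2)

def prob_target(fixed, target):
    """P(classifier = target) when the features in `fixed` are frozen and the
    remaining features are drawn uniformly from their domains."""
    free = [f for f in DOMAINS if f not in fixed]
    hits = total = 0
    for combo in product(*(DOMAINS[f] for f in free)):
        inst = {**fixed, **dict(zip(free, combo))}
        hits += classify(inst) == target
        total += 1
    return hits / total

def is_delta_relevant(fixed, target, delta):
    return prob_target(fixed, target) > delta

instance = {"a": 1, "b": 1, "c": 0}
target = classify(instance)                               # class 1
print(prob_target({"a": 1}, target))                      # 0.75, so {a} is 0.7- but not 0.8-relevant
print(is_delta_relevant({"a": 1, "b": 1}, target, 0.95))  # True: fixing a and b forces class 1
```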
Ubiquitous computing: techniques for filtering inconsistent information
This thesis studies a possible artificial intelligence approach for detecting and filtering inconsistent information in the knowledge bases of intelligent objects and components in ubiquitous computing. The approach is addressed from a practical point of view within the SAT framework; the goal is to implement techniques for filtering inconsistencies in contradictory bases. Several contributions are made in this thesis. First, we worked on the extraction of a maximal set of information that is consistent with a series of assumptive contexts, and proposed an incremental approach for computing such a set (AC-MSS). Second, we addressed the task of enumerating the maximal satisfiable subsets (MSS), or their complements the minimal correction subsets (MCS), of an unsatisfiable CNF instance. In this contribution, we introduced a technique that improves on the best current approaches to MSS/MCS enumeration; it implements a model rotation paradigm that allows sets of MCS to be computed in a heuristically efficient way. Finally, we studied a notion of consensus for reconciling sources of information. This form of consensus can be characterized by different preference criteria, such as maximality. An incremental approach for computing a consensus that is maximal with respect to set inclusion was proposed. We also introduced and studied the concept of admissible consensus, which refines the initially proposed concept of consensus.
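A minimal, self-contained illustration of the MSS/MCS notions (not the thesis' incremental or model-rotation algorithms): for a tiny unsatisfiable CNF, every truth assignment induces the set of clauses it satisfies; the maximal such sets are the MSSes, and their set complements are the MCSes. The example formula is invented, and the brute force is only there to make the definitions concrete.

```python
from itertools import product

# Small unsatisfiable CNF in DIMACS-style literals:
# (x1) (¬x1 ∨ x2) (¬x2) (x1 ∨ x2)
CNF = [[1], [-1, 2], [-2], [1, 2]]

def satisfied(assignment, clause):
    return any((lit > 0) == assignment[abs(lit)] for lit in clause)

def enumerate_mss(cnf):
    """All maximal satisfiable subsets of `cnf`, as sets of clause indices."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    sat_sets = set()
    for bits in product((False, True), repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        sat_sets.add(frozenset(i for i, c in enumerate(cnf) if satisfied(assignment, c)))
    return [s for s in sat_sets if not any(s < t for t in sat_sets)]   # keep maximal sets only

for mss in enumerate_mss(CNF):
    mcs = set(range(len(CNF))) - mss
    print("MSS:", sorted(mss), "MCS:", sorted(mcs))
```

Real enumerators replace the exponential sweep over assignments with incremental SAT calls, which is where techniques such as model rotation pay off.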
On Explaining Random Forests with SAT
Random Forests (RFs) are among the most widely used Machine Learning (ML) classifiers. Even though RFs are not interpretable, there are no dedicated non-heuristic approaches for computing explanations of RFs. Moreover, there is recent work on polynomial algorithms for explaining ML models, including naive Bayes classifiers. Hence, one question is whether finding explanations of RFs can be solved in polynomial time. This paper answers this question negatively, by proving that computing one PI-explanation of an RF is D^P-complete. Furthermore, the paper proposes a propositional encoding for computing explanations of RFs, thus enabling finding PI-explanations with a SAT solver. This contrasts with earlier work on explaining boosted trees (BTs) and neural networks (NNs), which requires encodings based on SMT/MILP. Experimental results, obtained on a wide range of publicly available datasets, demonstrate that the proposed SAT-based approach scales to RFs of sizes common in practical applications. Perhaps more importantly, the experimental results demonstrate that, for the vast majority of examples considered, the SAT-based approach proposed in this paper significantly outperforms existing heuristic approaches.
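To show what the propositional encoding replaces, the sketch below brute-forces the abductive-explanation check for a small scikit-learn random forest over binary features: a set of fixed features explains the prediction iff every completion of the free features keeps the majority vote unchanged. The data, hidden concept, and feature counts are invented for the example; the paper instead encodes this check propositionally and delegates it to a SAT solver.

```python
from itertools import product
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Invented toy data: four binary features, hidden concept y = (x0 AND x1) OR x2.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4))
y = ((X[:, 0] & X[:, 1]) | X[:, 2]).astype(int)
rf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)

def is_abductive(model, x, fixed, n_features=4):
    """True iff fixing the features in `fixed` to x's values forces the prediction:
    every 0/1 completion of the free features keeps the (majority-vote) class."""
    target = model.predict(x.reshape(1, -1))[0]
    free = [i for i in range(n_features) if i not in fixed]
    for combo in product((0, 1), repeat=len(free)):
        z = x.copy()
        z[free] = combo
        if model.predict(z.reshape(1, -1))[0] != target:
            return False
    return True

x = np.array([1, 1, 0, 0])
print(is_abductive(rf, x, fixed={0, 1}))   # likely True: x0 and x1 alone decide the learned concept
print(is_abductive(rf, x, fixed={0}))      # likely False: x1 and x2 can still flip the vote
```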
On Explaining Decision Trees
Decision trees (DTs) epitomize what has come to be known as interpretable machine learning (ML) models. This is informally motivated by paths in DTs often involving far fewer features than the total number of features. This paper shows that in some settings DTs can hardly be deemed interpretable, with paths in a DT being arbitrarily larger than a PI-explanation, i.e. a subset-minimal set of feature values that entails the prediction. As a result, the paper proposes a novel model for computing PI-explanations of DTs, which enables computing one PI-explanation in polynomial time. Moreover, it is shown that the enumeration of PI-explanations can be reduced to the enumeration of minimal hitting sets. Experimental results were obtained on a wide range of publicly available datasets with well-known DT-learning tools, and confirm that in most cases DTs have paths that are proper supersets of PI-explanations.
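Since the paper reduces PI-explanation enumeration to minimal-hitting-set enumeration, the following brute-force enumerator shows that target problem on a tiny instance. It is illustrative only: the sets to be hit are placeholders, not the paper's actual construction from DT paths.

```python
from itertools import combinations

def minimal_hitting_sets(sets_to_hit):
    """All subset-minimal sets intersecting every set in `sets_to_hit` (brute force)."""
    universe = sorted(set().union(*sets_to_hit))
    minimal = []
    for k in range(len(universe) + 1):                 # grow candidates by size
        for cand in combinations(universe, k):
            c = set(cand)
            hits_all = all(c & s for s in sets_to_hit)
            if hits_all and not any(m <= c for m in minimal):
                minimal.append(c)
    return minimal

# placeholder collection of sets, standing in for the sets derived from DT paths
SETS = [{1, 2}, {2, 3}, {3, 4}]
print(minimal_hitting_sets(SETS))   # [{1, 3}, {2, 3}, {2, 4}]
```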
Solving Explainability Queries with Quantification: The Case of Feature Relevancy
Trustable explanations of machine learning (ML) models are vital in
high-risk uses of artificial intelligence (AI). Apart from the
computation of trustable explanations, a number of explainability
queries have been identified and studied in recent work. Some of these
queries involve solving quantification problems, either in
propositional or in more expressive logics. This paper investigates
one of these quantification problems, namely the feature relevancy
problem (FRP), i.e., to decide whether a (possibly sensitive) feature
can occur in some explanation of a prediction. In contrast with
earlier work, which studied FRP for specific classifiers, this paper
proposes a novel algorithm for the FRP quantification problem, which
is applicable to any ML classifier that meets minor requirements.
Furthermore, the paper shows that the novel algorithm is efficient
in practice. The experimental results, obtained using random forests
(RFs) induced from well-known publicly available datasets,
demonstrate that the proposed solution outperforms existing
state-of-the-art solvers for Quantified Boolean Formulas (QBF) by
orders of magnitude. Finally, the paper also identifies a novel family
of formulas that are challenging for current state-of-the-art QBF
solvers.
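A brute-force rendering of the feature relevancy query on an invented toy Boolean classifier (illustrative only, not the paper's quantification-based algorithm): a feature is relevant iff it occurs in at least one subset-minimal abductive explanation, which the sketch checks by enumerating all of them.

```python
from itertools import combinations, product

N = 3
def classify(x):
    """Toy Boolean classifier: x0 AND (x1 OR x2)."""
    return x[0] and (x[1] or x[2])

def is_axp(x, s):
    """True iff fixing the features in `s` to x's values forces the prediction."""
    target = classify(x)
    free = [i for i in range(N) if i not in s]
    for bits in product((0, 1), repeat=len(free)):
        z = list(x)
        for i, b in zip(free, bits):
            z[i] = b
        if classify(z) != target:
            return False
    return True

def all_minimal_axps(x):
    """All subset-minimal abductive explanations, by brute force over feature sets."""
    axps = []
    for k in range(N + 1):
        for s in combinations(range(N), k):
            s = set(s)
            if is_axp(x, s) and not any(a <= s for a in axps):
                axps.append(s)
    return axps

x = (1, 1, 1)
axps = all_minimal_axps(x)                                 # [{0, 1}, {0, 2}]
print("feature 2 relevant?", any(2 in a for a in axps))    # True
```

The exhaustive enumeration makes the meaning of the query plain; the point of the paper is to answer it without materializing all explanations, via quantified reasoning over the classifier.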