Monotone Tree-Based GAMI Models by Adapting XGBoost
Recent papers have used machine learning architectures to fit low-order
functional ANOVA models with main effects and second-order interactions. These
GAMI (GAM + Interaction) models are directly interpretable, as the functional
main effects and interactions can be easily plotted and visualized.
Unfortunately, it is not easy to incorporate the monotonicity requirement into
the existing GAMI models based on boosted trees, such as EBM (Lou et al. 2013)
and GAMI-Lin-T (Hu et al. 2022). This paper considers models of the form
$f(x) = \sum_{j,k} f_{j,k}(x_j, x_k)$ and develops monotone tree-based GAMI
models, called monotone GAMI-Tree, by adapting the XGBoost algorithm. It is
straightforward to fit a monotone model to $f(x)$ using the options in XGBoost.
However, the fitted model is still a black box. We take a different approach:
i) use a filtering technique to determine the important interactions, ii) fit a
monotone XGBoost algorithm with the selected interactions, and finally iii)
parse and purify the results to get a monotone GAMI model. Simulated datasets
are used to demonstrate the behaviors of mono-GAMI-Tree and EBM, both of which
use piecewise constant fits. Note that the monotonicity requirement is for the
full model. Under certain situations, the main effects will also be monotone.
But, as seen in the examples, the interactions will not be monotone.
Comment: 12 page
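
Steps i) and ii) of the abstract above can be illustrated with a small configuration sketch. This is not the authors' implementation: the helper name `build_gami_xgb_params` and the choice of `max_depth=2` are assumptions made for illustration. The parameter names `monotone_constraints`, `interaction_constraints`, and `max_depth` are standard XGBoost options; the filtering step is assumed to have already produced the list of selected pairs.

```python
import json


def build_gami_xgb_params(monotone_signs, selected_pairs):
    """Build an XGBoost parameter dict for a monotone, pairwise-restricted fit.

    monotone_signs: one entry per feature: +1 (increasing), -1 (decreasing),
        0 (unconstrained).
    selected_pairs: (j, k) feature-index pairs kept by a filtering step
        (hypothetical input; the filter itself is not shown here).
    """
    n_features = len(monotone_signs)
    for j, k in selected_pairs:
        if j == k or not (0 <= j < n_features and 0 <= k < n_features):
            raise ValueError(f"invalid feature pair: {(j, k)}")

    # Every feature may appear alone (main effect); only the selected
    # pairs may appear together along a path in the same tree.
    singles = [[j] for j in range(n_features)]
    pairs = sorted({tuple(sorted(p)) for p in selected_pairs})
    groups = singles + [list(p) for p in pairs]

    return {
        # Depth-2 trees: each leaf depends on at most two features.
        "max_depth": 2,
        "monotone_constraints": "(" + ",".join(str(s) for s in monotone_signs) + ")",
        "interaction_constraints": json.dumps(groups),
    }


# Three features: x0 increasing, x1 free, x2 decreasing; keep only the (0, 2) pair.
params = build_gami_xgb_params([1, 0, -1], [(0, 2)])
print(params)
```

The resulting dictionary would be passed, together with an objective and learning rate, to `xgboost.train`. Parsing the fitted trees into main effects and interactions and then purifying them (step iii) is a separate task not covered by this sketch.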
Using Model-Based Trees with Boosting to Fit Low-Order Functional ANOVA Models
Low-order functional ANOVA (fANOVA) models have been rediscovered in the
machine learning (ML) community under the guise of inherently interpretable
machine learning. Explainable Boosting Machines or EBM (Lou et al. 2013) and
GAMI-Net (Yang et al. 2021) are two recently proposed ML algorithms for fitting
functional main effects and second-order interactions. We propose a new
algorithm, called GAMI-Tree, that is similar to EBM, but has a number of
features that lead to better performance. It uses model-based trees as base
learners and incorporates a new interaction filtering method that is better at
capturing the underlying interactions. In addition, our iterative training
method converges to a model with better predictive performance, and the
embedded purification ensures that interactions are hierarchically orthogonal
to main effects. The algorithm does not need extensive tuning, and our
implementation is fast and efficient. We use simulated and real datasets to
compare the performance and interpretability of GAMI-Tree with EBM and
GAMI-Net.
Comment: 25 pages plus appendi
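
The purification idea mentioned in the abstract above can be sketched for a single piecewise-constant interaction stored on a grid of bins. This sketch assumes every bin carries equal weight, which is a simplification; the paper's embedded purification may use data-dependent weights and a different procedure. The function name `purify_uniform` is hypothetical.

```python
def purify_uniform(table):
    """Split a 2-D piecewise-constant effect into an intercept, two main
    effects, and a pure interaction whose row and column means are zero.

    table[i][j] is the fitted value in bin i of feature j and bin j of
    feature k. All bins are assumed to have equal weight.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]

    # Mass that depends on one feature only is moved to that main effect.
    main_j = [m - grand for m in row_means]
    main_k = [m - grand for m in col_means]
    pure = [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)]
            for i in range(n_rows)]
    return grand, main_j, main_k, pure


raw = [[1.0, 2.0, 3.0],
       [2.0, 4.0, 9.0]]
intercept, main_j, main_k, pure = purify_uniform(raw)
print(intercept, main_j, main_k, pure)
```

After this step the sum `intercept + main_j[i] + main_k[j] + pure[i][j]` reproduces the original table exactly, so predictions are unchanged while the interaction no longer carries any main-effect signal.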
Shapley Computations Using Surrogate Model-Based Trees
Shapley-related techniques have gained attention as both global and local
interpretation tools because of their desirable properties. However, computing
them from conditional expectations is expensive.
Approximation methods suggested in the literature have limitations. This paper
proposes the use of a surrogate model-based tree to compute Shapley and SHAP
values based on conditional expectation. Simulation studies show that the
proposed algorithm improves accuracy and unifies global Shapley and SHAP
interpretation, and that the thresholding method provides a way to trade off
running time against accuracy.
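
To make the cost concern concrete, the following sketch computes exact Shapley values by enumerating every coalition, which requires a number of value-function evaluations that grows exponentially with the number of features. It is a generic textbook computation, not the surrogate-tree algorithm proposed in the paper. The `toy_value` function is a hypothetical stand-in for a conditional-expectation value function.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a float."""
    phi = [0.0] * n_players
    for i in range(n_players):
        others = [p for p in range(n_players) if p != i]
        for size in range(n_players):
            # Weight of a coalition of this size that excludes player i.
            weight = (factorial(size) * factorial(n_players - size - 1)
                      / factorial(n_players))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_value(coalition):
    """Hypothetical value function: two main effects plus one interaction."""
    total = 0.0
    if 0 in coalition:
        total += 1.0
    if 1 in coalition:
        total += 2.0
    if 0 in coalition and 1 in coalition:
        total += 4.0  # interaction between features 0 and 1
    return total


phi = shapley_values(3, toy_value)
print(phi)
```

In this toy game the interaction term is split equally between features 0 and 1, and feature 2 receives zero, so the attributions sum to the value of the full coalition minus the value of the empty one.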