Optimal Sparse Regression Trees

Rudin, Cynthia; Seltzer, Margo; Xin, Rui; Zhang, Rui

Authors: Cynthia Rudin, Margo Seltzer, Rui Xin, Rui Zhang
Publication date: 9 April 2023

Abstract

Regression trees are one of the oldest forms of AI models, and their predictions can be made without a calculator, which makes them broadly useful, particularly for high-stakes applications. Within the large literature on regression trees, there has been little effort towards full provable optimization, mainly due to the computational hardness of the problem. This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees. We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm in 1-dimension over the set of labels. We are often able to find optimal sparse trees in seconds, even for challenging datasets that involve large numbers of samples and highly-correlated features.

Comment: AAAI 2023, final archival version
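The lower bound mentioned in the abstract can be illustrated with a small sketch. A regression tree with k leaves splits the samples into k groups and predicts one constant per group, so its squared error cannot be smaller than the best possible grouping of the labels into k clusters when the features are ignored, which is the optimal 1-dimensional k-Means cost. The code below is a minimal illustration of that idea, not the authors' implementation: the function name `kmeans_1d_optimal_cost` is hypothetical, and it uses a plain O(k n^2) dynamic program over the sorted labels and leaves out any sample weights or sparsity penalty the paper may use.

```python
from typing import Sequence


def kmeans_1d_optimal_cost(labels: Sequence[float], k: int) -> float:
    """Minimum total squared error of grouping 1-D labels into at most k clusters.

    Hypothetical helper for illustration only. In one dimension an optimal
    clustering always consists of contiguous runs of the sorted values, so a
    dynamic program over split positions finds the exact optimum.
    """
    ys = sorted(labels)
    n = len(ys)
    if n == 0:
        return 0.0
    k = max(1, min(k, n))  # more clusters than points cannot help

    # Prefix sums of y and y^2 give the squared error of any run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, y in enumerate(ys):
        s1[i + 1] = s1[i] + y
        s2[i + 1] = s2[i] + y * y

    def sse(i: int, j: int) -> float:
        """Squared error of ys[i:j] around its own mean (requires i < j)."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # best[j]: minimum cost of splitting the first j sorted labels
    # into exactly c non-empty clusters (c grows in the outer loop).
    best = [inf] * (n + 1)
    best[0] = 0.0
    for c in range(1, k + 1):
        new = [inf] * (n + 1)
        for j in range(c, n + 1):
            new[j] = min(best[i] + sse(i, j) for i in range(c - 1, j))
        best = new
    return best[n]


if __name__ == "__main__":
    # No tree with 2 leaves can achieve a lower squared error on these labels.
    print(kmeans_1d_optimal_cost([0.0, 1.0, 10.0, 11.0], k=2))
```

In a dynamic-programming-with-bounds search, a value like this could be compared against the best objective found so far, so that subproblems whose bound is already too high are discarded without being solved.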

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.14980

Last time updated on 04/01/2023