Decision Tree Pruning via Integer Programming

Abstract

Decision trees are an important tool for classification in data mining. Many algorithms have been proposed to induce decision trees, and most of them involve two phases: a growing phase and a pruning phase. In this paper, we concentrate on the pruning problem. We find that, when the ultimate aim is to select the sub-tree with minimal error on a separate test set, the problem can be formulated as an integer program with a nice structure. By exploiting the special structure of this integer program, we propose several algorithms to identify the optimal sub-tree, including one that is essentially the same as the well-known bottom-up pruning method, with computational complexity O(n). A new optimality proof of this algorithm is provided from the perspective of mathematical programming.
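As an illustration of the bottom-up pruning idea mentioned above, the following is a minimal sketch in Python, not the paper's own formulation or code. It assumes that each node already stores the number of held-out examples it would misclassify if collapsed into a leaf. The names `Node`, `leaf_errors`, and `prune` are hypothetical, and breaking ties in favour of the smaller tree is an assumption made here. Each node is visited once, so the running time is linear in the number of nodes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical field: held-out errors if this node were turned into a leaf.
    leaf_errors: int
    children: List["Node"] = field(default_factory=list)
    pruned: bool = False  # set to True when the sub-tree below is cut off


def prune(node: Node) -> int:
    """Return the minimal held-out error over all pruned sub-trees rooted at
    `node`, marking the nodes that should become leaves. One visit per node."""
    if not node.children:
        return node.leaf_errors
    # Solve the children first (bottom-up), then decide for this node.
    subtree_errors = sum(prune(child) for child in node.children)
    # Assumption: on a tie, prefer the smaller tree and prune.
    if node.leaf_errors <= subtree_errors:
        node.pruned = True
        return node.leaf_errors
    return subtree_errors


# Small made-up example tree (the error counts are illustrative only).
a = Node(leaf_errors=4, children=[Node(3), Node(2)])  # children total 5 > 4
b = Node(leaf_errors=5, children=[Node(1), Node(2)])  # children total 3 < 5
root = Node(leaf_errors=10, children=[a, b])

best = prune(root)
print(best, a.pruned, b.pruned, root.pruned)
```

In this made-up example, node `a` is collapsed because keeping its children would cost more held-out errors than making it a leaf, while `b` and the root are kept.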
