Hidden variables unseen by Random Forests

Abstract

Random Forests are widely claimed to capture interactions well. However, some simple examples suggest that they perform poorly in the presence of certain pure interactions, which the conventional CART criterion struggles to capture during tree construction. We argue that alternative partitioning schemes can enhance the identification of these interactions. Furthermore, we extend recent theory of Random Forests based on the notion of impurity decrease by considering probabilistic impurity decrease conditions. Within this framework, we establish consistency of a new algorithm, called 'Random Split Random Forest', that is tailored to function classes involving pure interactions. In a simulation study, we validate that the modifications considered enhance the model's fitting ability in scenarios where pure interactions play a crucial role.
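To illustrate why a pure interaction is hard for the CART criterion, here is a minimal sketch rather than the paper's own experiment. It assumes a hypothetical checkerboard (XOR-type) regression function on a regular grid in the unit square and computes the usual variance-based impurity decrease. The function and variable names (`impurity_decrease`, `best_root`, `after_split`) are chosen for this illustration only.

```python
from itertools import product


def variance(ys):
    """Population variance of a list of responses (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def impurity_decrease(points, feature, threshold):
    """CART variance reduction when splitting at x[feature] < threshold."""
    left = [y for x, y in points if x[feature] < threshold]
    right = [y for x, y in points if x[feature] >= threshold]
    n = len(points)
    parent = variance([y for _, y in points])
    return (parent
            - len(left) / n * variance(left)
            - len(right) / n * variance(right))


# Assumed toy design: a 20 x 20 grid of cell midpoints in [0, 1]^2.
n = 20
grid = [(i + 0.5) / n for i in range(n)]

# Assumed pure interaction: y = 1 exactly when x1 and x2 fall on
# opposite sides of 0.5, so neither coordinate has a marginal effect.
data = [((x1, x2), float((x1 < 0.5) != (x2 < 0.5)))
        for x1, x2 in product(grid, grid)]

# Best impurity decrease over all single-coordinate splits at the root.
best_root = max(impurity_decrease(data, j, t) for j in (0, 1) for t in grid)

# Impurity decrease of splitting x2 at 0.5 inside the cell x1 < 0.5.
left_cell = [(x, y) for x, y in data if x[0] < 0.5]
after_split = impurity_decrease(left_cell, 1, 0.5)

print("best root decrease:", best_root)
print("decrease after a first split at x1 = 0.5:", after_split)
```

In this toy setting every root split has an impurity decrease of zero (up to floating-point rounding), so a greedy CART step has no signal to pick a split point. The structure only becomes visible after a first split has been placed without guidance from the criterion, which is the kind of situation that motivates alternative partitioning schemes.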
