research

Feature Selection with the Boruta Package

Abstract

This article describes a R package Boruta, implementing a novel feature selection algorithm for finding \emph{all relevant variables}. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes the features which are proved by a statistical test to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. The short description of the algorithm and examples of its application are presented.

    Similar works