As a modern ensemble technique, Deep Forest (DF) employs a cascading
structure to construct deep models, providing stronger representational power
compared to traditional decision forests. However, its greedy multi-layer
learning procedure is prone to overfitting, limiting model effectiveness and
generalizability. This paper presents an optimized Deep Forest, featuring
learnable, layerwise data augmentation policy schedules. Specifically, we
introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate
overfitting and develop a population-based search algorithm to tailor
augmentation intensity for each layer. Additionally, we propose to incorporate
outputs from intermediate layers into a checkpoint ensemble for more stable
performance. Experimental results show that our method achieves new
state-of-the-art (SOTA) performance on a variety of tabular classification
tasks, outperforming shallow tree ensembles, deep forests, deep neural
networks, and AutoML competitors. The learned policies also transfer
effectively to Deep Forest variants, underscoring the potential of our
approach for enhancing non-differentiable deep learning modules in tabular
signal processing.
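
To give a concrete flavor of the kind of augmentation the abstract refers to, the following is a minimal, hypothetical sketch of a CutMix-style mixing operation for tabular data. The function name cutmix_tabular, the Beta-distributed mixing parameter alpha, and the one-hot label mixing are illustrative assumptions, not the paper's exact CMT procedure; a per-layer intensity knob such as alpha is the sort of quantity the population-based search could tune.

```python
import numpy as np

def cutmix_tabular(X, y_onehot, alpha=1.0, seed=None):
    """Illustrative CutMix-style augmentation for tabular data.

    A random subset of feature columns in each sample is replaced by the
    corresponding columns of a randomly paired sample; one-hot labels are
    mixed in proportion to the fraction of original features kept.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    partner = rng.permutation(n)              # random partner for each sample
    lam = rng.beta(alpha, alpha)              # target fraction of features kept
    n_swap = int(round((1.0 - lam) * d))      # columns taken from the partner
    cols = rng.choice(d, size=n_swap, replace=False)

    X_aug = X.copy()
    X_aug[:, cols] = X[partner][:, cols]      # splice in the partner's features
    lam_eff = 1.0 - n_swap / d                # effective mixing ratio
    y_aug = lam_eff * y_onehot + (1.0 - lam_eff) * y_onehot[partner]
    return X_aug, y_aug
```

Under these assumptions, alpha controls augmentation intensity: small alpha favors extreme mixes (mostly one sample or the other), while large alpha concentrates lam near 0.5 and produces stronger blending.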