Random forests are a very effective and commonly used statistical method, but
their full theoretical analysis is still an open problem. As a first step,
simplified models such as purely random forests have been introduced, in order
to shed light on the good performance of random forests. In this paper, we
study the approximation error (the bias) of some purely random forest models in
a regression framework, focusing in particular on the influence of the number
of trees in the forest. Under some regularity assumptions on the regression
function, we show that the bias of an infinite forest decreases at a faster
rate (with respect to the size of each tree) than that of a single tree. As a
consequence, infinite forests attain a strictly better risk rate (with respect
to the sample size) than single trees. Furthermore, our results allow us to derive
a minimum number of trees sufficient to reach the same rate as an infinite
forest. As a by-product of our analysis, we also show a link between the bias
of purely random forests and the bias of some kernel estimators.