Statistical Learning for Structured Models: Tree Based Methods and Neural Networks

Abstract

In this thesis, we consider estimation in regression and classification problems that include low-dimensional structures. The underlying question is the following: how well do statistical learning methods perform for models with low-dimensional structures? We approach this question using various algorithms in various settings. As our first main contribution, we prove optimal convergence rates in a classification setting using neural networks. While non-optimal rates existed for this problem, we are the first to prove optimal ones. Secondly, we introduce a new tree-based algorithm, which we name random planted forest. It adapts particularly well to models that consist of low-dimensional structures. We examine its performance in simulation studies and provide some theoretical backing by proving optimal convergence rates in certain settings for a modification of the algorithm. Additionally, a generalized version of the algorithm is included, which can be used in classification settings. In a further contribution, we prove optimal convergence rates for the local linear smooth backfitting algorithm. While such rates have already been established, we bring a new, simpler perspective to the problem, which leads to better understanding and easier interpretation. Finally, given an estimator in a regression setting, we propose a constraint which leads to a unique decomposition. This decomposition is useful for visualising and interpreting the estimator, in particular if it consists of low-dimensional structures.
