Institute of Electrical and Electronics Engineers (IEEE)
Doi
Abstract
peer-reviewed
The task of multi-target regression (MTR) is concerned with learning predictive models
capable of predicting multiple target variables simultaneously. MTR has attracted increasing attention
within the research community in recent years, yielding a variety of methods. The methods can be divided
into two main groups: problem transformation and problem adaptation. The former transform an MTR
problem into simpler (typically single target) problems and apply known approaches, while the latter
adapt the learning methods to directly handle the multiple target variables and learn better models which
simultaneously predict all of the targets. Studies have identified the latter group of methods as having
a competitive advantage over the former, probably because it exploits the interrelations of the
multiple targets. In the related task of multi-label classification, it has recently been shown that organizing
the multiple labels into a hierarchical structure can improve predictive performance.
In this paper, we investigate whether organizing the targets into a hierarchical structure can improve the
performance for MTR problems. More precisely, we propose to structure the multiple target variables into
a hierarchy of variables, thus translating the task of MTR into a task of hierarchical multi-target regression
(HMTR). We use four data-driven methods for devising the hierarchical structure; these methods cluster either
the real values of the targets or the feature importance scores with respect to the targets. The evaluation of the proposed
methodology on 16 benchmark MTR datasets reveals that structuring the multiple target variables into a
hierarchy improves the predictive performance of the corresponding MTR models. The results also show
that data-driven methods produce hierarchies that can improve the predictive performance even more than
expert-constructed hierarchies. Finally, the improvement in predictive performance is more pronounced for
the datasets with very large numbers (more than a hundred) of targets.
European Commissio
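To make the idea of a data-driven target hierarchy concrete, the following is a minimal sketch of one way to cluster the real values of the targets into a tree. The abstract does not name the four clustering methods used, so everything here is an assumption chosen for illustration: average-linkage agglomerative clustering, a distance of one minus the absolute Pearson correlation, the function names `pearson`, `leaves`, and `build_target_hierarchy`, and the toy data in `example_targets`. The sketch also assumes no target column is constant.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)  # assumes neither column is constant


def leaves(node):
    """Return the target names under a hierarchy node (a name or a pair)."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def build_target_hierarchy(targets):
    """Average-linkage agglomerative clustering of target columns.

    `targets` maps a target name to its list of observed values.
    Returns a nested tuple (binary tree) whose leaves are target names.
    This is an illustrative stand-in, not the method from the paper.
    """
    names = list(targets)
    # Pairwise distance between individual targets: 1 - |correlation|.
    dist = {
        frozenset((a, b)): 1.0 - abs(pearson(targets[a], targets[b]))
        for a, b in combinations(names, 2)
    }

    def linkage(u, v):
        # Average distance over all leaf pairs of the two clusters.
        pairs = [(a, b) for a in leaves(u) for b in leaves(v)]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    nodes = names[:]
    while len(nodes) > 1:
        # Merge the two closest clusters into a new internal node.
        u, v = min(combinations(nodes, 2), key=lambda p: linkage(*p))
        nodes = [n for n in nodes if n != u and n != v]
        nodes.append((u, v))
    return nodes[0]


# Hypothetical toy data: t1/t2 move together, as do t3/t4.
example_targets = {
    "t1": [1.0, 2.0, 3.0, 4.0],
    "t2": [2.1, 3.9, 6.2, 8.0],
    "t3": [5.0, 1.0, 4.0, 2.0],
    "t4": [4.8, 1.3, 4.1, 1.9],
}
hierarchy = build_target_hierarchy(example_targets)
```

In an HMTR setting, the internal nodes of such a tree would act as additional (aggregate) targets above the original ones; how those aggregates are defined and learned is not specified in the abstract and is left out of this sketch.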