Multi-output Gaussian processes (MOGPs) have been introduced to deal with
multiple tasks by exploiting the correlations between different outputs.
Generally, MOGP models assume a flat correlation structure between the
outputs. However, such a formulation does not account for more elaborate
relationships, for instance when several replicates are observed for each
output (a typical setting in biological experiments). This paper
proposes an extension of MOGPs for hierarchical datasets (i.e., datasets in
which the relationships between observations can be represented as a tree
structure). Our model defines a tailored kernel function accounting for
hierarchical structures in the data to capture different levels of correlation,
while introducing latent variables to express the underlying
dependencies between outputs through a dedicated kernel. This latter feature is
expected to significantly improve scalability as the number of tasks increases.
An extensive experimental study involving both synthetic and real-world data
from genomics and motion capture is presented to support these claims.

Comment: 29 page