In many areas of science, multiple sets of data are collected pertaining to
the same system. Examples are food products characterized by
different sets of variables, bioprocesses sampled on-line with
different instruments, and biological systems for which different genomics
measurements are obtained. Data fusion is concerned with analyzing such sets of
data simultaneously to arrive at a global view of the system under study. One
of the emerging areas of data fusion is exploring whether the data sets have
something in common. This gives insight into the common and distinct
variation in each data set, thereby facilitating understanding of the
relationships between the data sets. Unfortunately, research on methods to
distinguish common and distinct components is fragmented, both in terminology
and in methods: there is no common ground, which hampers comparing
methods and understanding their relative merits. This paper provides a unifying
framework for this subfield of data fusion by using rigorous arguments from
linear algebra. The most frequently used methods for distinguishing common and
distinct components are explained in this framework, and some practical examples
of these methods are given in the areas of (medical) biology and food science.

Comment: 50 pages, 12 figures