In this work, learning schemes for measure-valued data are proposed, i.e.
data that their structure can be more efficiently represented as probability
measures instead of points on Rd, employing the concept of probability
barycenters as defined with respect to the Wasserstein metric. Such type of
learning approaches are highly appreciated in many fields where the
observational/experimental error is significant (e.g. astronomy, biology,
remote sensing, etc.) or the data nature is more complex and the traditional
learning algorithms are not applicable or effective to treat them (e.g. network
data, interval data, high frequency records, matrix data, etc.). Under this
perspective, each observation is identified by an appropriate probability
measure and the proposed statistical learning schemes rely on discrimination
criteria that utilize the geometric structure of the space of probability
measures through core techniques from the optimal transport theory. The
discussed approaches are implemented in two real world applications: (a)
clustering eurozone countries according to their observed government bond yield
curves and (b) classifying the areas of a satellite image to certain land uses
categories which is a standard task in remote sensing. In both case studies the
results are particularly interesting and meaningful while the accuracy obtained
is high.Comment: 18 pages, 3 figure