This paper addresses the case where data come as point sets, or more
generally as discrete measures. Our motivation is twofold. First, we intend to
approximate, with a compactly supported measure, the mean of the measure
generating process, which coincides with the intensity measure in the point
process framework, or with the expected persistence diagram in the framework of
persistence-based topological data analysis. To this aim, we provide two
algorithms that we prove to be almost minimax optimal. Second, we build from the
estimator of the mean measure a vectorization map that sends every measure
into a finite-dimensional Euclidean space, and investigate its properties
through a clustering-oriented lens. In a nutshell, we show that in a mixture of
measure generating processes, our technique yields a representation in
$\mathbb{R}^k$, for $k \in \mathbb{N}^*$, that guarantees a good clustering of
the data points with high probability. Interestingly, our results apply in the
framework of persistence-based shape classification via the ATOL procedure
described in \cite{Royer19}.