Given a finite set in a metric space, topological data analysis generalizes
hierarchical clustering using a 1-parameter family of homology groups to
quantify connectivity in all dimensions. The connectivity is compactly
described by the persistence diagram. One limitation of the current framework
is the reliance on metric distances, whereas in many practical applications
objects are compared by non-metric dissimilarity measures. Examples are the
Kullback-Leibler divergence, which is commonly used for comparing text and
images, and the Itakura-Saito divergence, popular for speech and sound. These
are two members of the broad family of dissimilarities called Bregman
divergences.
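
For concreteness, a Bregman divergence can be sketched from its standard definition: for a strictly convex, differentiable generator F, the divergence from x to y is F(x) - F(y) - <grad F(y), x - y>. The minimal Python sketch below assumes this textbook definition and the usual generators (negative Shannon entropy for Kullback-Leibler, Burg entropy for Itakura-Saito, squared Euclidean norm for the squared Euclidean distance). The function names are illustrative and are not taken from this work.

```python
import math

def bregman_divergence(F, grad_F, x, y):
    """Return D_F(x || y) = F(x) - F(y) - <grad F(y), x - y>."""
    g = grad_F(y)
    inner = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, y))
    return F(x) - F(y) - inner

# Generator: negative Shannon entropy. On probability vectors the
# resulting divergence is the Kullback-Leibler divergence.
def neg_entropy(x):
    return sum(xi * math.log(xi) for xi in x)

def neg_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]

# Generator: Burg entropy. The resulting divergence is Itakura-Saito.
def burg(x):
    return -sum(math.log(xi) for xi in x)

def burg_grad(x):
    return [-1.0 / xi for xi in x]

# Generator: squared Euclidean norm. The resulting divergence is the
# squared Euclidean distance, the only symmetric case among the three.
def sq_norm(x):
    return sum(xi * xi for xi in x)

def sq_norm_grad(x):
    return [2.0 * xi for xi in x]

if __name__ == "__main__":
    p, q = [0.5, 0.5], [0.9, 0.1]  # strictly positive coordinates required
    print("KL(p||q) =", bregman_divergence(neg_entropy, neg_entropy_grad, p, q))
    print("KL(q||p) =", bregman_divergence(neg_entropy, neg_entropy_grad, q, p))
    print("IS(p||q) =", bregman_divergence(burg, burg_grad, p, q))
```

Swapping the arguments changes the Kullback-Leibler and Itakura-Saito values, which illustrates why these dissimilarities are not metrics: they are in general asymmetric and need not satisfy the triangle inequality.
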
We show that the framework of topological data analysis can be extended to
general Bregman divergences, widening the scope of possible applications. In
particular, we prove that appropriately generalized Cech and Delaunay (alpha)
complexes capture the correct homotopy type, namely that of the corresponding
union of Bregman balls. Consequently, their filtrations give the correct
persistence diagram, namely the one generated by the uniformly growing Bregman
balls. Moreover, we show that, unlike in the metric setting, the filtration of
Vietoris-Rips complexes may fail to approximate the persistence diagram. We
propose algorithms to compute the generalized Cech, Vietoris-Rips and
Delaunay complexes and experimentally test their efficiency. Lastly, we explain
their surprisingly good performance by making a connection with discrete Morse
theory.