Persistent homology is a popular and powerful tool for capturing topological
features of data. Advances in algorithms for computing persistent homology have
reduced the computation time drastically -- as long as the algorithm does not
exhaust the available memory. Following up on a recently presented parallel
method for persistence computation on shared-memory systems, we demonstrate
that a simple adaptation of the standard reduction algorithm leads to a variant
for distributed systems. Our algorithmic design ensures that the data is
distributed over the nodes without redundancy; this permits the computation of
much larger instances than on a single machine. Moreover, we observe that the
parallelism at least compensates for the communication overhead between
nodes, and often even speeds up the computation relative to both sequential
and shared-memory parallel algorithms. In our experiments, we were able to
compute the persistent homology of filtrations with more than a billion (10^9)
elements within seconds on a cluster with 32 nodes, using less than 10 GB of
memory per node.
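
As background, the following is a minimal Python sketch of the standard
reduction algorithm that the distributed variant adapts. It is an
illustrative single-machine version, not the distributed algorithm itself;
the set-based column representation and the helper names (low,
reduce_boundary_matrix) are assumptions made for this example.

def low(column):
    # Largest row index with a nonzero entry, or None for a zero column.
    return max(column) if column else None

def reduce_boundary_matrix(columns):
    # Left-to-right column reduction over Z/2. Each column is a set of
    # row indices; set symmetric difference (^=) implements addition mod 2.
    pivot_of = {}  # row index -> index of the column with that pivot
    for j, col in enumerate(columns):
        while col and low(col) in pivot_of:
            col ^= columns[pivot_of[low(col)]]
        if col:
            pivot_of[low(col)] = j
    return columns

# Example: boundary matrix of a filtered triangle
# (vertices 0-2, edges 3-5, triangle 6).
cols = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
reduced = reduce_boundary_matrix(cols)
pairs = [(low(c), j) for j, c in enumerate(reduced) if c]
# pairs == [(1, 3), (2, 4), (5, 6)]: each (birth, death) index pair
# corresponds to a finite persistence interval of the filtration.

The pairs read off the reduced matrix, together with the unpaired simplices,
determine the persistence diagram of the filtration.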