Obtaining sparse, interpretable representations of observable data is crucial
in many machine learning and signal processing tasks. For data representing
flows along the edges of a graph, an intuitively interpretable way to obtain
such representations is to lift the graph structure to a simplicial complex:
the eigenvectors of the associated Hodge Laplacian, or alternatively the
incidence matrices of the complex, then induce a Hodge decomposition, which
can be used to represent the observed data in terms of gradient, curl, and
harmonic flows. In this paper, we generalize this approach
to cellular complexes and introduce the cell inference optimization problem,
i.e., the problem of augmenting the observed graph by a set of cells, such that
the eigenvectors of the associated Hodge Laplacian provide a sparse,
interpretable representation of the observed edge flows on the graph. We show
that this problem is NP-hard and introduce an efficient approximation algorithm
for its solution. Experiments on real-world and synthetic data demonstrate that
our algorithm outperforms current state-of-the-art methods while being
computationally efficient.

Comment: 9 pages, 6 figures (plus appendix). For evaluation code, see
https://anonymous.4open.science/r/edge-flow-repr-cell-complexes-11C
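
To make the Hodge decomposition mentioned in the abstract concrete, the following is a minimal NumPy sketch on a toy cell complex. It is not the paper's cell inference algorithm and is not taken from the evaluation code. The helper names `incidence_matrices` and `hodge_decompose`, the example graph, and the flow values are all hypothetical. The sketch assumes edges are oriented from the lower to the higher node index and that each 2-cell is given as a cycle of nodes.

```python
import numpy as np


def incidence_matrices(n_nodes, edges, cells):
    """Build node-edge (B1) and edge-cell (B2) incidence matrices.

    Hypothetical helper: `edges` is a list of (tail, head) pairs and each
    entry of `cells` is a list of nodes describing a closed cycle.
    """
    edge_index = {e: i for i, e in enumerate(edges)}
    B1 = np.zeros((n_nodes, len(edges)))
    for j, (u, v) in enumerate(edges):
        B1[u, j], B1[v, j] = -1.0, 1.0  # flow leaves u and enters v
    B2 = np.zeros((len(edges), len(cells)))
    for k, cycle in enumerate(cells):
        # Walk the cell boundary; the sign records whether the walk
        # follows or opposes the reference orientation of each edge.
        for u, v in zip(cycle, cycle[1:] + cycle[:1]):
            if (u, v) in edge_index:
                B2[edge_index[(u, v)], k] = 1.0
            else:
                B2[edge_index[(v, u)], k] = -1.0
    return B1, B2


def hodge_decompose(B1, B2, flow):
    """Split an edge flow into gradient, curl, and harmonic components."""
    # Gradient part: orthogonal projection onto the image of B1^T.
    potentials = np.linalg.lstsq(B1.T, flow, rcond=None)[0]
    gradient = B1.T @ potentials
    # Curl part: orthogonal projection onto the image of B2.
    cell_weights = np.linalg.lstsq(B2, flow, rcond=None)[0]
    curl = B2 @ cell_weights
    # The remainder lies in the kernel of the Hodge Laplacian.
    harmonic = flow - gradient - curl
    return gradient, curl, harmonic


# Toy complex (assumed for illustration): a square 0-1-2-3 filled by one
# 2-cell, plus an unfilled triangle 2-3-4.
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (2, 4), (3, 4)]
cells = [[0, 1, 2, 3]]
B1, B2 = incidence_matrices(5, edges, cells)

flow = np.array([1.0, 2.0, 0.5, -1.0, 0.3, 0.7])  # arbitrary example values
gradient, curl, harmonic = hodge_decompose(B1, B2, flow)
```

Because the square is a 2-cell, circulation around it is captured by the curl component, whereas circulation around the unfilled triangle remains harmonic. Adding the triangle as a second cell would move that circulation into the curl component as well, which gives some intuition for why the choice of cells affects how sparsely a flow can be represented.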