Assessing the similarity of graphs is a well-known challenge, yet it is
essential for grouping graphs together. We present a data-driven method to
cluster traffic scenes that is self-supervised, i.e. without manual labelling.
We leverage the semantic scene graph model to create a generic graph embedding
of the traffic scene, which a Siamese network then maps to a low-dimensional
embedding space in which clustering is performed. During training, we augment
existing traffic scenes in Cartesian space to generate positive similarity
samples. This allows us to
overcome the challenge of reconstructing a graph and at the same time obtain a
representation to describe the similarity of traffic scenes. We show that the
resulting clusters share common semantic characteristics. The approach was
evaluated on the INTERACTION dataset.
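
The training idea described above, augmenting a scene in Cartesian space to obtain a positive pair for a Siamese network, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `augment_scene` and `contrastive_loss`, the uniform position jitter, the `max_shift` and `margin` parameters, and the standard margin-based contrastive loss are not taken from the source, which names neither the augmentation nor the loss.

```python
import math
import random


def augment_scene(positions, max_shift=0.5, rng=None):
    """Return a jittered copy of a scene given as a list of (x, y) positions.

    Hypothetical augmentation: each coordinate is shifted by a uniform
    offset in [-max_shift, max_shift]. The original scene and its
    augmented copy would form one positive similarity sample.
    """
    rng = rng or random.Random()
    return [
        (x + rng.uniform(-max_shift, max_shift),
         y + rng.uniform(-max_shift, max_shift))
        for x, y in positions
    ]


def contrastive_loss(emb_a, emb_b, is_positive, margin=1.0):
    """Margin-based contrastive loss on two embedding vectors.

    Assumed loss, not confirmed by the source: positive pairs are pulled
    together (squared distance), negative pairs are pushed apart until
    they are at least `margin` away from each other.
    """
    distance = math.dist(emb_a, emb_b)
    if is_positive:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Usage: build one positive pair from a toy scene with two vehicles.
scene = [(0.0, 0.0), (10.0, 2.5)]
positive = augment_scene(scene, max_shift=0.5, rng=random.Random(0))
```

In a full pipeline, both scenes of a pair would first be converted to graphs and passed through the same (weight-shared) encoder; the loss is then computed on the two resulting embeddings rather than on raw positions.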