Passenger clustering based on trajectory records is essential for
transportation operators. However, existing methods cannot easily cluster
passengers because trip information is hierarchical: each passenger makes
multiple trips, and each trip carries multi-dimensional information.
Furthermore, existing approaches require the number of clusters to be
specified accurately in advance. Finally, existing
methods do not consider spatial semantic graphs such as geographical proximity
and functional similarity between the locations. In this paper, we propose a
novel tensor Dirichlet Process Multinomial Mixture model with graphs, which
preserves the hierarchical structure of the multi-dimensional trip information
and clusters passengers in a unified, one-step manner while determining the
number of clusters automatically. The spatial graphs are used in community
detection to link semantic neighbors. We further propose a tensor version
of the collapsed Gibbs sampling method with a minimum cluster size requirement. A
case study on Hong Kong metro passenger data demonstrates how the number of
clusters evolves automatically and shows better cluster quality, as
measured by within-cluster compactness and cross-cluster separateness. The code
is available at https://github.com/bonaldli/TensorDPMM-G.

Comment: Accepted in ACM SIGSPATIAL 2023. arXiv admin note: substantial text
overlap with arXiv:2306.1379
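
The abstract's claim that the number of clusters is determined automatically rests on the Dirichlet process prior. As a rough illustration only, the sketch below computes the standard Chinese restaurant process prior, under which an item joins an existing cluster with probability proportional to that cluster's size and opens a new cluster with probability proportional to a concentration parameter. This is not the paper's tensor model: the function name `crp_assignment_probs` and the parameter `alpha` are hypothetical, and the likelihood term that a collapsed Gibbs sampler would multiply in is omitted.

```python
from typing import List


def crp_assignment_probs(cluster_sizes: List[int], alpha: float) -> List[float]:
    """Chinese restaurant process prior over cluster assignments.

    Returns one probability per existing cluster, followed by a final
    entry for opening a new cluster. `alpha` is the (assumed) Dirichlet
    process concentration parameter.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if any(size < 0 for size in cluster_sizes):
        raise ValueError("cluster sizes must be non-negative")

    # Normalizer: number of items already assigned, plus alpha.
    denom = sum(cluster_sizes) + alpha

    # Existing clusters: proportional to their current size.
    probs = [size / denom for size in cluster_sizes]

    # New cluster: proportional to alpha.
    probs.append(alpha / denom)
    return probs


if __name__ == "__main__":
    # Two existing clusters with 3 and 1 members, alpha = 1.
    print(crp_assignment_probs([3, 1], alpha=1.0))
```

Because the final entry is never zero, a new cluster can always be opened, which is why the cluster count does not have to be fixed in advance. A larger `alpha` makes new clusters more likely.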