The shortest-path, commute time, and diffusion distances on undirected graphs
have been widely employed in applications such as dimensionality reduction,
link prediction, and trip planning. Increasingly, there is interest in using
the asymmetric structure of data derived from Markov chains and directed graphs,
but few metrics are specifically adapted to this task. We introduce a metric on
the state space of any ergodic, finite-state, time-homogeneous Markov chain
and, in particular, on any Markov chain derived from a directed graph. Our
construction is based on hitting probabilities, with nearness in the metric
space related to the transfer of random walkers from one node to another at
stationarity. Notably, our metric is insensitive to shortest and average walk
distances, thus giving new information compared to existing metrics. We use
possible degeneracies in the metric to develop an interesting structural theory
of directed graphs and explore a related quotienting procedure. Our metric can
be computed in O(n^3) time, where n is the number of states, and in
examples we scale up to n=10,000 nodes and ≈38M edges on a desktop
computer. In several examples, we explore the nature of the metric, compare it
to alternative methods, and demonstrate its utility for weak recovery of
community structure in dense graphs, visualization, structure recovery,
dynamics exploration, and multiscale cluster detection.

Comment: 26 pages, 9 figures, for associated code, visit
https://github.com/zboyd2/hitting_probabilities_metric, accepted at SIAM J.
Math. Data Sc
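
As a rough illustration of the hitting probabilities mentioned in the abstract, the sketch below computes, for a small ergodic chain, the probability Q[i, j] that a walker started at state i reaches state j before returning to i. This is not the authors' implementation (see the linked repository for that). The function names, the example transition matrix P, and the naive pairwise linear solve are assumptions made for illustration; this approach is far slower than the O(n^3) cost quoted above, and the normalization that turns Q into the metric is deliberately left out.

```python
import numpy as np


def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 via least squares (assumes an ergodic chain)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def hitting_probabilities(P):
    """Q[i, j] = probability that a walk started at i hits j before returning to i.

    Naive illustrative version: one linear solve per ordered pair (i, j).
    """
    n = P.shape[0]
    Q = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # h[k] = probability that a walk from k reaches j before i,
            # with boundary values h[j] = 1 and h[i] = 0.
            interior = [k for k in range(n) if k != i and k != j]
            h = np.zeros(n)
            h[j] = 1.0
            if interior:
                A = np.eye(len(interior)) - P[np.ix_(interior, interior)]
                b = P[interior, j]
                h[interior] = np.linalg.solve(A, b)
            # Take one step from i, then continue according to h.
            Q[i, j] = P[i] @ h
    return Q


# Hypothetical 3-state chain that is not reversible (rows sum to 1).
P = np.array([
    [0.1, 0.6, 0.3],
    [0.5, 0.0, 0.5],
    [0.2, 0.7, 0.1],
])
pi = stationary_distribution(P)
Q = hitting_probabilities(P)

# Stationary flux between each pair of states; symmetric for ergodic chains.
flux = pi[:, None] * Q
print(np.round(Q, 4))
print(np.allclose(flux, flux.T))
```

The final check relies on a standard identity for ergodic chains, pi[i] * Q[i, j] = pi[j] * Q[j, i], which holds even when the chain itself is not reversible. It gives one way to read the abstract's remark that nearness relates to the transfer of random walkers between nodes at stationarity.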