Understanding human mobility is of vital importance for urban planning,
epidemiology, and many other fields that aim to draw policies from the
activities of humans in space. Despite the recent availability of large-scale data
sets related to human mobility, such as GPS traces and mobile phone data, these
data sets still represent only a subsample of the population of interest and may
therefore give an incomplete picture of the entire population.
Notwithstanding the abundant usage of such inherently limited data
sets, the impact of sampling biases on mobility patterns is unclear: we lack
methods to reliably infer mobility information from a
limited data set. Here, we investigate the effects of sampling using a data set
of millions of taxi movements in New York City. On the one hand, we show that
mobility patterns are highly stable once an appropriate simple rescaling is
applied to the data, implying negligible loss of information due to subsampling
over long time scales. On the other hand, a comparison with an appropriate null
model of the weighted network of vehicle flows reveals distinctive features that
need to be accounted for. Accordingly, we formulate a "supersampling"
methodology that allows us to reliably extrapolate mobility data from a
reduced sample, and we propose a number of network-based metrics to assess
its quality (and that of other human mobility models). Our approach provides a
well-founded way to exploit temporal patterns to save effort in recording
mobility data, and opens the possibility to scale up data from limited records
when information on the full system is needed.

Comment: 14 pages, 4 figures