Researchers commonly encounter large-scale networks in practice (e.g., Facebook
and Twitter). To study the interactions between nodes in such networks, the
spatial autoregressive (SAR) model is widely used. Despite its popularity, the
estimation of a SAR
model on large-scale networks remains very challenging. On the one hand, due to
policy limitations or high collection costs, it is often impossible for
independent researchers to observe or collect all network information. On the
other hand, even if the entire network is accessible, computing the
quasi-maximum likelihood estimator (QMLE) of the SAR model can be
computationally infeasible at this scale. To address these challenges, we
propose a subnetwork estimation method based on the QMLE for the SAR model. By
using an appropriate sampling method, one can construct a subnetwork with far
fewer nodes. The standard QMLE is then computed by treating the sampled
subnetwork as if it were the entire network. This substantially reduces the
costs of both data collection and model computation, which makes estimation
more feasible in practice.
Theoretically, we show that the subnetwork-based QMLE is consistent and
asymptotically normal under appropriate regularity conditions. Extensive
simulation studies, based on both simulated and real network structures, are
presented
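
The subnetwork idea described above can be illustrated with a minimal sketch. It assumes the standard SAR form Y = rho * W * Y + X * beta + eps with a row-normalized weight matrix W, maximizes the concentrated Gaussian quasi-log-likelihood over a grid of rho values, and uses simple random node sampling purely as a stand-in for the "appropriate sampling methods" mentioned in the text. The function names (row_normalize, induced_subnetwork, sar_qmle, simulate_sar) and all simulation settings are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import numpy as np


def row_normalize(A):
    """Divide each row by its sum; rows with no links stay all-zero."""
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    return np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)


def induced_subnetwork(A, nodes):
    """Keep only the links among the sampled nodes, then row-normalize."""
    A = np.asarray(A, dtype=float)
    return row_normalize(A[np.ix_(nodes, nodes)])


def sar_qmle(Y, X, W, grid=None):
    """Grid-search QMLE for Y = rho*W*Y + X*beta + eps (illustrative only)."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)
    n = len(Y)
    identity = np.eye(n)
    WY = W @ Y
    X_pinv = np.linalg.pinv(X)  # least-squares projection onto X
    best = (-np.inf, None, None, None)
    for rho in grid:
        sign, logdet = np.linalg.slogdet(identity - rho * W)
        if sign <= 0:
            continue  # skip rho values where det(I - rho*W) is not positive
        SY = Y - rho * WY
        beta = X_pinv @ SY
        resid = SY - X @ beta
        sigma2 = resid @ resid / n
        # Concentrated quasi-log-likelihood, up to an additive constant.
        loglik = -0.5 * n * np.log(sigma2) + logdet
        if loglik > best[0]:
            best = (loglik, rho, beta, sigma2)
    return best[1], best[2], best[3]


def simulate_sar(N, rho, beta, seed=0, mean_degree=4.0):
    """Simulate a sparse random network and SAR responses (assumed setup)."""
    rng = np.random.default_rng(seed)
    A = (rng.random((N, N)) < mean_degree / N).astype(float)
    np.fill_diagonal(A, 0.0)
    X = rng.normal(size=(N, len(beta)))
    eps = rng.normal(size=N)
    Y = np.linalg.solve(np.eye(N) - rho * row_normalize(A), X @ beta + eps)
    return A, X, Y


# Full-network QMLE versus QMLE on a randomly sampled subnetwork.
A, X, Y = simulate_sar(N=400, rho=0.4, beta=np.array([1.0, -0.5]))
rho_full, beta_full, _ = sar_qmle(Y, X, row_normalize(A))

rng = np.random.default_rng(1)
nodes = rng.choice(A.shape[0], size=200, replace=False)
rho_sub, beta_sub, _ = sar_qmle(Y[nodes], X[nodes], induced_subnetwork(A, nodes))

print("full network:", rho_full, beta_full)
print("subnetwork:  ", rho_sub, beta_sub)
```

Simple random node sampling discards many links in a sparse network, so the subnetwork estimate in this sketch may be noticeably less accurate than the full-network one. Whether a given sampling design preserves enough of the network structure is exactly what the regularity conditions referred to in the text would need to address.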