The knowledge of end-to-end network distances is essential to many Internet
applications. As active probing of all pairwise distances is infeasible in
large-scale networks, a natural idea is to measure a few pairs and to predict
the other ones without actually measuring them. This paper formulates the
distance prediction problem as matrix completion where unknown entries of an
incomplete matrix of pairwise distances are to be predicted. The problem is
solvable because strong correlations among network distances exist and cause
the constructed distance matrix to be low rank. The new formulation circumvents
the well-known drawbacks of existing approaches based on Euclidean embedding.
A new algorithm, so-called Decentralized Matrix Factorization by Stochastic
Gradient Descent (DMFSGD), is proposed to solve the network distance prediction
problem. By letting network nodes exchange messages with each other, the
algorithm is fully decentralized and only requires each node to collect and to
process local measurements, with neither explicit matrix constructions nor
special nodes such as landmarks and central servers. In addition, we compared
comprehensively matrix factorization and Euclidean embedding to demonstrate the
suitability of the former on network distance prediction. We further studied
the incorporation of a robust loss function and of non-negativity constraints.
Extensive experiments on various publicly-available datasets of network delays
show not only the scalability and the accuracy of our approach but also its
usability in real Internet applications.Comment: submitted to IEEE/ACM Transactions on Networking on Nov. 201