Modeling Distances in Large-Scale Networks by Matrix Factorization
In this paper, we propose a model for representing and predicting distances in large-scale networks by matrix factorization. The model is useful for applications that are sensitive to network distance, such as content distribution networks, topology-aware overlays, and server selection. Our approach overcomes several limitations of previous coordinates-based mechanisms, which cannot model sub-optimal routing or asymmetric routing policies. We describe two algorithms -- singular value decomposition (SVD) and nonnegative matrix factorization (NMF) -- for representing a matrix of network distances as the product of two smaller matrices. With such a representation, we build a scalable system -- Internet Distance Estimation Service (IDES) -- that predicts large numbers of network distances from limited numbers of measurements. Extensive simulations on real-world data sets show that IDES leads to more accurate, efficient and robust predictions of latencies in large-scale networks than previous approaches.
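The factorization idea in this abstract can be illustrated with a minimal sketch, assuming NumPy and a small synthetic matrix. The function name `svd_factorize`, the dimension `d`, and the toy data are hypothetical and are not taken from the paper or from IDES itself. Each host gets an outgoing vector (a row of `X`) and an incoming vector (a row of `Y`), so the estimate for the distance from host i to host j is `X[i] @ Y[j]`, which need not equal the estimate from j to i.

```python
import numpy as np

def svd_factorize(D, d):
    """Return X (n x d) and Y (m x d) so that D is approximated by X @ Y.T."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    root = np.sqrt(s[:d])       # split each singular value evenly between X and Y
    X = U[:, :d] * root         # outgoing vectors, one row per source host
    Y = Vt[:d, :].T * root      # incoming vectors, one row per destination host
    return X, Y

# Synthetic, asymmetric matrix of exact rank 3 (illustrative only, not real latencies).
rng = np.random.default_rng(0)
D = rng.random((8, 3)) @ rng.random((3, 8))

X, Y = svd_factorize(D, d=3)
D_hat = X @ Y.T                 # predicted distances for every ordered pair
max_abs_error = float(np.max(np.abs(D - D_hat)))
```

Because the toy matrix has rank 3 by construction, a rank-3 factorization reproduces it up to floating-point error. Real distance matrices are only approximately low rank, so the reconstruction would be approximate.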
DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction
The knowledge of end-to-end network distances is essential to many Internet
applications. As active probing of all pairwise distances is infeasible in
large-scale networks, a natural idea is to measure a few pairs and to predict
the other ones without actually measuring them. This paper formulates the
distance prediction problem as matrix completion where unknown entries of an
incomplete matrix of pairwise distances are to be predicted. The problem is
solvable because strong correlations among network distances exist and cause
the constructed distance matrix to be low rank. The new formulation circumvents
the well-known drawbacks of existing approaches based on Euclidean embedding.
A new algorithm, so-called Decentralized Matrix Factorization by Stochastic
Gradient Descent (DMFSGD), is proposed to solve the network distance prediction
problem. By letting network nodes exchange messages with each other, the
algorithm is fully decentralized and only requires each node to collect and to
process local measurements, with neither explicit matrix constructions nor
special nodes such as landmarks and central servers. In addition, we
comprehensively compared matrix factorization and Euclidean embedding to
demonstrate the suitability of the former for network distance prediction. We
further studied
the incorporation of a robust loss function and of non-negativity constraints.
Extensive experiments on various publicly-available datasets of network delays
show not only the scalability and the accuracy of our approach but also its
usability in real Internet applications.
Comment: submitted to IEEE/ACM Transactions on Networking on Nov. 201
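The matrix-completion formulation can be sketched as follows. This is a centralized simulation under stated assumptions, not the DMFSGD algorithm itself: the function `sgd_complete`, the learning rate, the regularization weight, the rank, and the synthetic data are all hypothetical choices made for illustration. Each update touches only the vectors of the two nodes involved in one measurement, which is the property that lets such updates be carried out by the nodes themselves through message exchange.

```python
import numpy as np

def sgd_complete(D, mask, rank=3, lr=0.02, reg=0.01, epochs=200, seed=0):
    """Fit D[i, j] ~ X[i] @ Y[j] using only the entries where mask is True."""
    rng = np.random.default_rng(seed)
    n, m = D.shape
    X = 0.1 * rng.random((n, rank))   # one outgoing vector per node
    Y = 0.1 * rng.random((m, rank))   # one incoming vector per node
    pairs = np.argwhere(mask)         # the measured (i, j) pairs

    def observed_rmse():
        err = (D - X @ Y.T)[mask]
        return float(np.sqrt(np.mean(err ** 2)))

    history = [observed_rmse()]       # error before any update
    for _ in range(epochs):
        for idx in rng.permutation(len(pairs)):
            i, j = pairs[idx]
            err = D[i, j] - X[i] @ Y[j]
            xi = X[i].copy()          # keep the old value for the Y update
            X[i] += lr * (err * Y[j] - reg * X[i])
            Y[j] += lr * (err * xi - reg * Y[j])
        history.append(observed_rmse())
    return X, Y, history

# Synthetic low-rank matrix with roughly half of the entries measured.
rng = np.random.default_rng(1)
D = rng.random((20, 3)) @ rng.random((3, 20))
mask = rng.random(D.shape) < 0.5

X, Y, history = sgd_complete(D, mask)
D_hat = X @ Y.T                       # includes predictions for unmeasured pairs
```

Unmeasured pairs are predicted by reading the corresponding entries of `D_hat`. The squared loss used here could be replaced by a robust loss, and the factors could be constrained to be non-negative, which are the variations the abstract mentions studying.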
Large-Scale User Modeling with Recurrent Neural Networks for Music Discovery on Multiple Time Scales
The amount of content on online music streaming platforms is immense, and
most users only access a tiny fraction of this content. Recommender systems are
the application of choice to open up the collection to these users.
Collaborative filtering has the disadvantage that it relies on explicit
ratings, which are often unavailable, and generally disregards the temporal
nature of music consumption. On the other hand, item co-occurrence algorithms,
such as the recently introduced word2vec-based recommenders, are typically left
without an effective user representation. In this paper, we present a new
approach to model users through recurrent neural networks by sequentially
processing consumed items, represented by any type of embeddings and other
context features. This way we obtain semantically rich user representations,
which capture a user's musical taste over time. Our experimental analysis on
large-scale user data shows that our model can be used to predict future songs
a user will likely listen to, both in the short and long term.
Comment: Author pre-print version, 20 pages, 6 figures, 4 table
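The idea of building a user representation by feeding consumed items through a recurrent network can be sketched with a plain tanh recurrence in NumPy. This is a minimal forward pass with untrained random weights, assuming 4-dimensional item embeddings. The names `user_representation`, `W_in`, `W_rec`, `W_out`, and the scoring step are hypothetical and do not reproduce the architecture or training procedure of the paper.

```python
import numpy as np

def user_representation(item_embeddings, W_in, W_rec, b):
    """Run a simple tanh RNN over the items and return the final hidden state."""
    h = np.zeros(W_rec.shape[0])
    for e in item_embeddings:         # items in the order they were consumed
        h = np.tanh(W_in @ e + W_rec @ h + b)
    return h

rng = np.random.default_rng(0)
emb_dim, hidden = 4, 6

# Untrained weights, for illustration only.
W_in = rng.normal(scale=0.1, size=(hidden, emb_dim))
W_rec = rng.normal(scale=0.1, size=(hidden, hidden))
b = np.zeros(hidden)
W_out = rng.normal(scale=0.1, size=(emb_dim, hidden))

listening_history = rng.normal(size=(5, emb_dim))   # five consumed items
catalog = rng.normal(size=(10, emb_dim))            # ten candidate items

user_vec = user_representation(listening_history, W_in, W_rec, b)
scores = catalog @ (W_out @ user_vec)   # project the user into item space, then score
ranking = np.argsort(-scores)           # candidate indices, best score first
```

In a trained model the weights would be learned so that highly ranked candidates match what the user listens to next. Here the ranking is arbitrary because the weights are random.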
Statistical Traffic State Analysis in Large-scale Transportation Networks Using Locality-Preserving Non-negative Matrix Factorization
Statistical traffic data analysis is a hot topic in traffic management and
control. In this field, current research focuses on analyzing traffic
flows of individual links or local regions in a transportation network. Less
attention is paid to the global view of traffic states over the entire
network, which is important for modeling large-scale traffic scenes. Our aim is
precisely to propose a new methodology for extracting spatio-temporal traffic
patterns, ultimately for modeling large-scale traffic dynamics, and long-term
traffic forecasting. We attack this issue by utilizing Locality-Preserving
Non-negative Matrix Factorization (LPNMF) to derive low-dimensional
representation of network-level traffic states. Clustering is performed on the
compact LPNMF projections to unveil typical spatial patterns and temporal
dynamics of network-level traffic states. We have tested the proposed method on
simulated traffic data generated for a large-scale road network, and the reported
experimental results validate the ability of our approach to extract
meaningful large-scale space-time traffic patterns. Furthermore, the derived
clustering results provide an intuitive understanding of spatial-temporal
characteristics of traffic flows in the large-scale network, and a basis for
potential long-term forecasting.
Comment: IET Intelligent Transport Systems (2013
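The pipeline of deriving a low-dimensional non-negative representation and then grouping network states can be sketched as follows. This sketch uses plain NMF with the standard multiplicative updates for the Frobenius objective and omits the locality-preserving term, so it is not LPNMF. The function `nmf`, the synthetic data, and the argmax grouping (a crude stand-in for a real clustering step) are assumptions made for illustration.

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative V (n x m) as W (n x k) @ H (k x m) with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    errors = [float(np.linalg.norm(V - W @ H))]   # error before any update
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)      # multiplicative update for H
        W *= (V @ H.T) / (W @ H @ H.T + eps)      # multiplicative update for W
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors

# Synthetic data: 30 network-wide snapshots (rows) over 12 links (columns).
rng = np.random.default_rng(2)
V = rng.random((30, 2)) @ rng.random((2, 12))

W, H, errors = nmf(V, k=2)
# Each row of W is the compact representation of one snapshot.
# Assign each snapshot to its dominant component as a simple grouping.
labels = np.argmax(W, axis=1)
```

The rows of `H` play the role of spatial patterns over links, and the rows of `W` describe how strongly each snapshot expresses them. A proper clustering algorithm applied to `W` would replace the argmax step.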