DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction
The knowledge of end-to-end network distances is essential to many Internet
applications. As active probing of all pairwise distances is infeasible in
large-scale networks, a natural idea is to measure a few pairs and to predict
the other ones without actually measuring them. This paper formulates the
distance prediction problem as matrix completion where unknown entries of an
incomplete matrix of pairwise distances are to be predicted. The problem is
solvable because strong correlations among network distances exist and cause
the constructed distance matrix to be low rank. The new formulation circumvents
the well-known drawbacks of existing approaches based on Euclidean embedding.
A new algorithm, so-called Decentralized Matrix Factorization by Stochastic
Gradient Descent (DMFSGD), is proposed to solve the network distance prediction
problem. By letting network nodes exchange messages with each other, the
algorithm is fully decentralized and only requires each node to collect and to
process local measurements, with neither explicit matrix constructions nor
special nodes such as landmarks and central servers. In addition, we
comprehensively compared matrix factorization and Euclidean embedding to
demonstrate the suitability of the former for network distance prediction. We
further studied
the incorporation of a robust loss function and of non-negativity constraints.
Extensive experiments on various publicly-available datasets of network delays
show not only the scalability and the accuracy of our approach but also its
usability in real Internet applications.
Comment: submitted to IEEE/ACM Transactions on Networking on Nov. 201
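The matrix-completion idea in the abstract above can be illustrated with a minimal, centralized sketch: fit two low-rank factor matrices by stochastic gradient descent on the measured pairs only, then predict unmeasured pairs from the factors. This is not the decentralized DMFSGD protocol itself; the function names (`factorize_sgd`, `predict`), the squared loss with L2 regularization, the hyperparameters, and the toy distances are all assumptions made for illustration.

```python
import random


def predict(X, Y, i, j):
    """Predicted distance from node i to node j: dot product of X[i] and Y[j]."""
    return sum(a * b for a, b in zip(X[i], Y[j]))


def factorize_sgd(n, observed, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit X, Y (each n x rank) so that predict(X, Y, i, j) approximates d
    for every observed triple (i, j, d). Unobserved pairs are never touched."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 1) for _ in range(rank)] for _ in range(n)]
    Y = [[rng.uniform(0, 1) for _ in range(rank)] for _ in range(n)]
    samples = list(observed)
    for _ in range(epochs):
        rng.shuffle(samples)
        for i, j, d in samples:
            err = d - predict(X, Y, i, j)
            for k in range(rank):
                xi, yj = X[i][k], Y[j][k]
                # Gradient step on squared error plus L2 penalty (assumed loss).
                X[i][k] += lr * (err * yj - reg * xi)
                Y[j][k] += lr * (err * xi - reg * yj)
    return X, Y


if __name__ == "__main__":
    # Made-up rank-1 "distance" matrix; about a third of the pairs are held out.
    u = [0.5 + 0.1 * i for i in range(6)]
    v = [1.0 - 0.1 * j for j in range(6)]
    measured = [(i, j, u[i] * v[j])
                for i in range(6) for j in range(6) if (i + j) % 3 != 0]
    X, Y = factorize_sgd(6, measured)
    print("held-out pair (0, 3): true", u[0] * v[3], "predicted", predict(X, Y, 0, 3))
```

In a decentralized variant, each node would hold only its own rows of the two factor matrices and update them from its local measurements; the sketch keeps everything in one process purely for readability.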
Interaction-aware Factorization Machines for Recommender Systems
Factorization Machine (FM) is a widely used supervised learning approach that
effectively models feature interactions. Despite the successful
application of FM and its many deep learning variants, treating every feature
interaction equally may degrade the performance. For example, the interactions
of a useless feature may introduce noise; the importance of a feature may also
differ when interacting with different features. In this work, we propose a
novel model named Interaction-aware Factorization Machine (IFM) by
introducing Interaction-Aware Mechanism (IAM), which comprises the
feature aspect and the field aspect, to learn flexible
interactions on two levels. The feature aspect learns feature interaction
importance via an attention network while the field aspect learns the feature
interaction effect as a parametric similarity of the feature interaction vector
and the corresponding field interaction prototype. IFM introduces more
structured control and learns feature interaction importance in a stratified
manner, which allows for more leverage in tweaking the interactions on both
feature-wise and field-wise levels. Besides, we give a more generalized
architecture and propose Interaction-aware Neural Network (INN) and DeepIFM to
capture higher-order interactions. To further improve both the performance and
efficiency of IFM, a sampling scheme is developed to select interactions based
on the field aspect importance. The experimental results from two well-known
datasets show the superiority of the proposed models over the state-of-the-art
methods.
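For orientation, the second-order term of a plain factorization machine can be sketched as below, together with a per-pair weight hook that shows what not treating every interaction equally could look like. This is not the IFM model: the `weight` callback is a hypothetical stand-in for a learned interaction importance, and the attention network and field prototypes described in the abstract are not implemented here. All function names are assumptions.

```python
def fm_pairwise_naive(x, V, weight=lambda i, j: 1.0):
    """Sum over i < j of weight(i, j) * <V[i], V[j]> * x[i] * x[j]."""
    total = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += weight(i, j) * dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Equally weighted case in O(n * k), using the identity
    sum_{i<j} <V[i], V[j]> x[i] x[j]
        = 0.5 * sum_f [(sum_i V[i][f] x[i])^2 - sum_i (V[i][f] x[i])^2]."""
    total = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += 0.5 * (s * s - sq)
    return total


def fm_predict(x, w0, w, V, weight=None):
    """Bias + linear term + pairwise term; uses the naive loop when a
    (hypothetical) per-pair weight function is supplied."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    if weight is None:
        return linear + fm_pairwise_fast(x, V)
    return linear + fm_pairwise_naive(x, V, weight)
```

With `weight=None` every pair contributes equally, which is the standard FM; passing a function that returns smaller values for uninformative pairs downweights their contribution.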
Comparative Analysis of Word Embeddings for Capturing Word Similarities
Distributed language representation has become the most widely used technique
for language representation in various natural language processing tasks. Most
of the natural language processing models that are based on deep learning
techniques use already pre-trained distributed word representations, commonly
called word embeddings. Determining the highest-quality word embeddings is of
crucial importance for such models. However, selecting the appropriate word
embeddings is a perplexing task since the projected embedding space is not
intuitive to humans. In this paper, we explore different approaches for
creating distributed word representations. We perform an intrinsic evaluation
of several state-of-the-art word embedding methods. Their performance in
capturing word similarities is analysed with existing benchmark datasets for
word-pair similarities. This paper conducts a correlation
analysis between ground truth word similarities and similarities obtained by
different word embedding methods.
Comment: Part of the 6th International Conference on Natural Language
Processing (NATP 2020
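An intrinsic evaluation of the kind described above can be sketched as follows: score each benchmark word pair by the cosine similarity of its embeddings, then correlate those scores with the human ratings. The abstract does not say which correlation coefficient is used; Spearman's rank correlation is assumed here. The function names and the idea of skipping out-of-vocabulary pairs are likewise assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            result[order[t]] = avg
        pos = end + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(ranks(a), ranks(b))


def evaluate(embeddings, benchmark):
    """benchmark: iterable of (word1, word2, human_score).
    Pairs with a word missing from `embeddings` are skipped (assumed policy)."""
    model_scores, human_scores = [], []
    for w1, w2, score in benchmark:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores)
```

A higher coefficient means the ordering of word pairs by embedding similarity agrees more closely with the ordering given by human judgments, which is what makes different embedding methods comparable on the same benchmark.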