23,544 research outputs found
Recommender Systems
The ongoing rapid expansion of the Internet greatly increases the necessity
of effective recommender systems for filtering the abundant information.
Extensive research for recommender systems is conducted by a broad range of
communities including social and computer scientists, physicists, and
interdisciplinary researchers. Despite substantial theoretical and practical
achievements, unification and comparison of different approaches are lacking,
which impedes further advances. In this article, we review recent developments
in recommender systems and discuss the major challenges. We compare and
evaluate available algorithms and examine their roles in the future
developments. In addition to algorithms, physical aspects are described to
illustrate macroscopic behavior of recommender systems. Potential impacts and
future directions are discussed. We emphasize that recommendation has a great
scientific depth and combines diverse research fields which makes it of
interests for physicists as well as interdisciplinary researchers.Comment: 97 pages, 20 figures (To appear in Physics Reports
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed
Discriminative Link Prediction using Local Links, Node Features and Community Structure
A link prediction (LP) algorithm is given a graph, and has to rank, for each
node, other nodes that are candidates for new linkage. LP is strongly motivated
by social search and recommendation applications. LP techniques often focus on
global properties (graph conductance, hitting or commute times, Katz score) or
local properties (Adamic-Adar and many variations, or node feature vectors),
but rarely combine these signals. Furthermore, neither of these extremes
exploit link densities at the intermediate level of communities. In this paper
we describe a discriminative LP algorithm that exploits two new signals. First,
a co-clustering algorithm provides community level link density estimates,
which are used to qualify observed links with a surprise value. Second, links
in the immediate neighborhood of the link to be predicted are not interpreted
at face value, but through a local model of node feature similarities. These
signals are combined into a discriminative link predictor. We evaluate the new
predictor using five diverse data sets that are standard in the literature. We
report on significant accuracy boosts compared to standard LP methods
(including Adamic-Adar and random walk). Apart from the new predictor, another
contribution is a rigorous protocol for benchmarking and reporting LP
algorithms, which reveals the regions of strengths and weaknesses of all the
predictors studied here, and establishes the new proposal as the most robust.Comment: 10 pages, 5 figure
Uncovering missing links with cold ends
To evaluate the performance of prediction of missing links, the known data
are randomly divided into two parts, the training set and the probe set. We
argue that this straightforward and standard method may lead to terrible bias,
since in real biological and information networks, missing links are more
likely to be links connecting low-degree nodes. We therefore study how to
uncover missing links with low-degree nodes, namely links in the probe set are
of lower degree products than a random sampling. Experimental analysis on ten
local similarity indices and four disparate real networks reveals a surprising
result that the Leicht-Holme-Newman index [E. A. Leicht, P. Holme, and M. E. J.
Newman, Phys. Rev. E 73, 026120 (2006)] performs the best, although it was
known to be one of the worst indices if the probe set is a random sampling of
all links. We further propose an parameter-dependent index, which considerably
improves the prediction accuracy. Finally, we show the relevance of the
proposed index on three real sampling methods.Comment: 16 pages, 5 figures, 6 table
- …