749 research outputs found
Fast Ad-Hoc Search Algorithm for Personalized PageRank
Personalized PageRank (PPR) is a typical similarity metric between nodes in a graph, and node searches based on PPR are widely used. In many applications, graphs change dynamically, and in such cases, it is desirable to perform ad hoc searches based on PPR. An ad hoc search involves performing searches by varying the search parameters or graphs. However, as the size of a graph increases, the computation cost of performing an ad hoc search can become excessive. In this paper, we propose a method called Castanet that offers fast ad hoc searches of PPR. The proposed method features (1) iterative estimation of the upper and lower bounds of PPR scores, and (2) dynamic pruning of nodes that are not needed to obtain a search result. Experiments confirm that the proposed method does offer faster ad hoc PPR searches than existing methods
Supervised Random Walks: Predicting and Recommending Links in Social Networks
Predicting the occurrence of links is a fundamental problem in networks. In
the link prediction problem we are given a snapshot of a network and would like
to infer which interactions among existing members are likely to occur in the
near future or which existing interactions are we missing. Although this
problem has been extensively studied, the challenge of how to effectively
combine the information from the network structure with rich node and edge
attribute data remains largely open.
We develop an algorithm based on Supervised Random Walks that naturally
combines the information from the network structure with node and edge level
attributes. We achieve this by using these attributes to guide a random walk on
the graph. We formulate a supervised learning task where the goal is to learn a
function that assigns strengths to edges in the network such that a random
walker is more likely to visit the nodes to which new links will be created in
the future. We develop an efficient training algorithm to directly learn the
edge strength estimation function.
Our experiments on the Facebook social graph and large collaboration networks
show that our approach outperforms state-of-the-art unsupervised approaches as
well as approaches that are based on feature extraction
Discriminative Link Prediction using Local Links, Node Features and Community Structure
A link prediction (LP) algorithm is given a graph, and has to rank, for each
node, other nodes that are candidates for new linkage. LP is strongly motivated
by social search and recommendation applications. LP techniques often focus on
global properties (graph conductance, hitting or commute times, Katz score) or
local properties (Adamic-Adar and many variations, or node feature vectors),
but rarely combine these signals. Furthermore, neither of these extremes
exploit link densities at the intermediate level of communities. In this paper
we describe a discriminative LP algorithm that exploits two new signals. First,
a co-clustering algorithm provides community level link density estimates,
which are used to qualify observed links with a surprise value. Second, links
in the immediate neighborhood of the link to be predicted are not interpreted
at face value, but through a local model of node feature similarities. These
signals are combined into a discriminative link predictor. We evaluate the new
predictor using five diverse data sets that are standard in the literature. We
report on significant accuracy boosts compared to standard LP methods
(including Adamic-Adar and random walk). Apart from the new predictor, another
contribution is a rigorous protocol for benchmarking and reporting LP
algorithms, which reveals the regions of strengths and weaknesses of all the
predictors studied here, and establishes the new proposal as the most robust.Comment: 10 pages, 5 figure
The contribution of data mining to information science
The information explosion is a serious challenge for current information institutions. On the other hand, data mining, which is the search for valuable information in large volumes of data, is one of the solutions to face this challenge. In the past several years, data mining has made a significant contribution to the field of information science. This paper examines the impact of data mining by reviewing existing applications, including personalized environments, electronic commerce, and search engines. For these three types of application, how data mining can enhance their functions is discussed. The reader of this paper is expected to get an overview of the state of the art research associated with these applications. Furthermore, we identify the limitations of current work and raise several directions for future research
- …