Bidirectional PageRank Estimation: From Average-Case to Worst-Case
We present a new algorithm for estimating the Personalized PageRank (PPR)
between a source and target node on undirected graphs, with sublinear
running-time guarantees over the worst-case choice of source and target nodes.
Our work builds on a recent line of work on bidirectional estimators for PPR,
which obtained sublinear running-time guarantees but in an average-case sense,
for a uniformly random choice of target node. Crucially, we show how the
reversibility of random walks on undirected networks can be exploited to
convert average-case to worst-case guarantees. While past bidirectional methods
combine forward random walks with reverse local pushes, our algorithm combines
forward local pushes with reverse random walks. We also discuss how to modify
our methods to estimate random-walk probabilities for any length distribution,
thereby obtaining fast algorithms for estimating general graph diffusions,
including the heat kernel, on undirected networks. Comment: Workshop on Algorithms and Models for the Web-Graph (WAW) 201
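The forward local push mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's bidirectional estimator: it shows only a standard forward-push routine for PPR on an undirected graph, with hypothetical names (`forward_push`, `alpha`, `r_max`) and an adjacency-dict graph representation chosen for illustration. The reverse random-walk half of the method is omitted.

```python
from collections import deque

def forward_push(adj, source, alpha=0.2, r_max=1e-4):
    """Approximate PPR from `source` by local pushes (illustrative sketch).

    adj    : dict mapping each node to a list of its neighbours (undirected)
    alpha  : teleport probability (assumed value)
    r_max  : per-degree residual threshold (assumed value)
    Returns (estimate, residual); their total mass always sums to 1.
    """
    estimate = {u: 0.0 for u in adj}
    residual = {u: 0.0 for u in adj}
    residual[source] = 1.0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        deg = len(adj[u])
        mass = residual[u]
        # Skip isolated nodes and nodes whose residual is already small.
        if deg == 0 or mass < r_max * deg:
            continue
        residual[u] = 0.0
        estimate[u] += alpha * mass          # settle an alpha fraction at u
        share = (1.0 - alpha) * mass / deg   # spread the rest over neighbours
        for v in adj[u]:
            residual[v] += share
            if residual[v] >= r_max * len(adj[v]):
                queue.append(v)
    return estimate, residual

# Usage on a triangle graph: the source keeps the largest estimate.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
est, res = forward_push(triangle, source=0)
```

On termination every node's residual is below `r_max` times its degree, which is what bounds the error of the push estimate.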
Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning
The number of triangles is a computationally expensive graph statistic which
is frequently used in complex network analysis (e.g., transitivity ratio), in
various random graph models (e.g., exponential random graph model) and in
important real world applications such as spam detection, uncovering of the
hidden thematic structure of the Web and link recommendation. Counting
triangles in graphs with millions and billions of edges requires algorithms
which run fast, use a small amount of space, provide accurate estimates of the
number of triangles and preferably are parallelizable.
In this paper we present an efficient triangle counting algorithm which can
be adapted to the semistreaming model. The key idea of our algorithm is to
combine the sampling algorithm of Tsourakakis et al. and the partitioning of
the set of vertices into a high degree and a low degree subset respectively as
in the work of Alon, Yuster and Zwick, treating each set appropriately. We obtain a
running time $O\left(m + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$
and an $\epsilon$ approximation (multiplicative error), where $n$ is the number
of vertices, $m$ the number of edges and $\Delta$ the maximum number of
triangles in which an edge is contained.
Furthermore, we show how this algorithm can be adapted to the semistreaming
model with space usage $O\left(m^{1/2}\log n + \frac{m^{3/2} \Delta \log n}{t \epsilon^2}\right)$ and a constant number of passes (three) over the graph
stream. We apply our methods in various networks with several millions of edges
and we obtain excellent results. Finally, we propose a random projection based
method for triangle counting and provide a sufficient condition to obtain an
estimate with low variance. Comment: 1) 12 pages 2) To appear in the 7th Workshop on Algorithms and Models
for the Web Graph (WAW 2010)
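As a rough illustration of the sampling idea in the abstract above, the sketch below assumes the sampling step is edge sparsification: keep each edge independently with probability `p`, count triangles in the sparsified graph, and rescale by `1/p^3`. The degree-based vertex partitioning is not shown, and the function names are hypothetical.

```python
import random
from itertools import combinations

def count_triangles(edges):
    """Exact triangle count for a simple undirected graph.

    `edges` is a list of distinct (u, v) pairs, each edge listed once,
    with no self-loops.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is found once per edge, so three times in total.
    total = sum(len(adj[u] & adj[v]) for u, v in edges)
    return total // 3

def sparsified_estimate(edges, p, seed=0):
    """Estimate the triangle count from a random edge sample (sketch).

    A triangle survives only if all three of its edges are kept, which
    happens with probability p**3, hence the 1 / p**3 rescaling.
    """
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    return count_triangles(kept) / p ** 3

# Usage: the complete graph on 4 vertices contains exactly 4 triangles.
k4 = list(combinations(range(4), 2))
exact = count_triangles(k4)
approx = sparsified_estimate(k4, p=1.0)  # p = 1 keeps every edge
```

Smaller values of `p` trade accuracy for speed, since the estimator is unbiased but its variance grows as fewer edges are kept.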
Computing Diffusion State Distance using Green's Function and Heat Kernel on Graphs
The diffusion state distance (DSD) was introduced by
Cao-Zhang-Park-Daniels-Crovella-Cowen-Hescott [PLoS ONE, 2013] to capture
functional similarity in protein-protein interaction networks. They proved the
convergence of DSD for non-bipartite graphs. In this paper, we extend the DSD
to bipartite graphs using lazy random walks and consider the general
$L_q$-version of DSD. We discovered the connection between the DSD
$L_q$-distance and Green's function, which was studied by Chung and Yau
[J. Combinatorial Theory (A), 2000]. Based on that, we computed the DSD
$L_q$-distance for paths, cycles, hypercubes, as well as random graphs
$G(n,p)$ and $G(w_1,\ldots,w_n)$. We also examined the DSD distances of two biological
networks. Comment: Accepted by the 11th Workshop on Algorithms and Models for the Web
Graph (WAW 2014)
Challenges in Bridging Social Semantics and Formal Semantics on the Web
This paper describes several results of Wimmics, a research lab whose name
stands for: web-instrumented man-machine interactions, communities, and
semantics. The approaches introduced here rely on graph-oriented knowledge
representation, reasoning and operationalization to model and support actors,
actions and interactions in web-based epistemic communities. The research
results are applied to support and foster interactions in online communities
and manage their resources.
Web Site Personalization based on Link Analysis and Navigational Patterns
The continuous growth in the size and use of the World Wide Web imposes new methods of design and development of on-line information services. The need for predicting the users' needs in order to improve the usability and user retention of a web site is more than evident and can be addressed by personalizing it. Recommendation algorithms aim at proposing "next" pages to users based on their current visit and the past users' navigational patterns. In the vast majority of related algorithms, however, only the usage data are used to produce recommendations, disregarding the structural properties of the web graph. Thus important (in terms of PageRank authority score) pages may be underrated. In this work we present UPR, a PageRank-style algorithm which combines usage data and link analysis techniques for assigning probabilities to the web pages based on their importance in the web site's navigational graph. We propose the application of a localized version of UPR (l-UPR) to personalized navigational sub-graphs for online web page ranking and recommendation. Moreover, we propose a hybrid probabilistic predictive model based on Markov models and link analysis for assigning prior probabilities in a hybrid probabilistic model. We prove, through experimentation, that this approach results in more objective and representative predictions than the ones produced from the pure usage-based approaches.
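The general idea of combining usage data with link analysis can be sketched as a PageRank-style power iteration in which the probability of following a link is proportional to how often users actually traversed it. This is not the UPR algorithm itself, whose exact definition is not given in the abstract. The function name, the uniform teleport vector, the damping value and the handling of pages without outgoing traffic are all assumptions made for illustration.

```python
def usage_weighted_pagerank(weights, damping=0.85, iterations=100):
    """PageRank-style scores with usage-proportional transitions (sketch).

    weights : dict page -> dict of {next_page: traversal_count}
              (hypothetical input format for navigational usage data)
    """
    pages = set(weights)
    for targets in weights.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Uniform teleportation (an assumption, not taken from the paper).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            out = weights.get(p, {})
            total = sum(out.values())
            if total == 0:
                # No recorded outgoing traffic: spread the mass uniformly.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
            else:
                for q, w in out.items():
                    new_rank[q] += damping * rank[p] * w / total
        rank = new_rank
    return rank

# Usage: "a" receives nine times the traffic of "b" from "home".
clicks = {"home": {"a": 9, "b": 1}, "a": {"home": 1}, "b": {"home": 1}}
scores = usage_weighted_pagerank(clicks)
```

In this toy example both "a" and "b" have the same link structure, so any difference in their scores comes from the usage counts alone.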