142 research outputs found
Parallel Page Rank Algorithms: A Survey
The PageRank method is an important and basic component of effective web search, used to compute the rank score of each page. The exponential growth of the Internet makes it a crucial challenge for search engines to provide up-to-date and relevant results for a user's query within a limited time. PageRank is computed over a huge number of web pages, which makes it a computation-intensive task. In this paper, we provide the basic concept of the PageRank method and discuss some parallel PageRank methods. We also compare these algorithms on parallel algorithmic concepts such as load balance, distributed vs. shared memory, and data layout
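As a point of reference for the "basic concept" mentioned in this abstract, the following is a minimal sequential sketch of PageRank by power iteration. It is not taken from the survey: the function name `pagerank_power`, the adjacency-dict input format, the damping value 0.85, and the uniform treatment of dangling pages are all assumptions made for illustration. The per-node loop is the part that parallel variants would typically partition across workers.

```python
def pagerank_power(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Sequential power-iteration sketch (illustrative, not the survey's code).

    links: dict mapping a page to the list of pages it links to.
    Pages with no out-links (dangling) spread their rank uniformly.
    """
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank mass currently held by dangling pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        # Every page receives the teleport share plus the dangling share.
        base = (1.0 - damping) / n + damping * dangling / n
        new = {u: base for u in nodes}
        # Each page splits its damped rank equally among its out-links.
        for u in nodes:
            outs = links.get(u, [])
            if outs:
                share = damping * rank[u] / len(outs)
                for t in outs:
                    new[t] += share
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank_power(graph).items()):
        print(page, round(score, 4))
```

In a distributed-memory setting, one plausible layout would assign each worker a block of source pages and combine the partial `new` vectors after every iteration; the trade-offs between such layouts are the kind of comparison the abstract describes.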
Monte Carlo methods in PageRank computation: When one iteration is sufficient
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. Google computes the PageRank using the power iteration method, which requires about one week of intensive computations. In the present work we propose and analyze Monte Carlo type methods for the PageRank computation. There are several advantages of the probabilistic Monte Carlo methods over the deterministic power iteration method: Monte Carlo methods provide a good estimate of the PageRank for relatively important pages already after one iteration; Monte Carlo methods have a natural parallel implementation; and finally, Monte Carlo methods allow continuous updating of the PageRank as the structure of the Web changes
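The random-surfer interpretation in this abstract can be sketched as a simple simulation. The version below is one possible end-point estimator, not necessarily the exact scheme analyzed in the paper: it starts a fixed number of walks from every page, stops each walk with probability `1 - damping` at every step, and uses the fraction of walks that end on a page as that page's estimate. The names `pagerank_monte_carlo` and `walks_per_node`, the damping value, and the uniform jump from dangling pages are assumptions for illustration.

```python
import random


def pagerank_monte_carlo(links, damping=0.85, walks_per_node=1000, seed=0):
    """End-point Monte Carlo sketch (illustrative assumptions throughout).

    links: dict mapping a page to the list of pages it links to.
    Returns the fraction of random walks that terminate on each page.
    """
    rng = random.Random(seed)
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    ends = {u: 0 for u in nodes}
    for start in nodes:
        for _ in range(walks_per_node):
            cur = start
            # Continue the walk with probability `damping`, otherwise stop.
            while rng.random() < damping:
                outs = links.get(cur)
                # Dangling pages jump to a uniformly random page.
                cur = rng.choice(outs) if outs else rng.choice(nodes)
            ends[cur] += 1
    total = len(nodes) * walks_per_node
    return {u: ends[u] / total for u in nodes}


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank_monte_carlo(graph).items()):
        print(page, round(score, 3))
```

Because the walks are independent of one another, they can be distributed across workers with no communication until the final counts are summed, which is consistent with the "natural parallel implementation" the abstract mentions.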
A Stochastic System Model for PageRank: Parameter Estimation and Adaptive Control
A key feature of modern web search engines is the ability to display relevant and reputable pages near the top of the list of query results. The PageRank algorithm provides one way of achieving such a useful hierarchical indexing by assigning a measure of relative importance, called the PageRank value, to each webpage. PageRank is motivated by the inherently hypertextual structure of the World Wide Web; specifically, the idea that pages with more incoming hyperlinks should be considered more popular and that popular pages should rank highly in search results, all other factors being equal. We begin by overviewing the original PageRank algorithm and discussing subsequent developments in the mathematical theory of PageRank. We focus on important contributions to improving the quality of rankings via topic-dependent or "personalized" PageRank, as well as techniques for improving the efficiency of PageRank computation based on Monte Carlo methods, extrapolation and adaptive methods, and aggregation methods. We next present a model for PageRank whose dynamics are described by a controlled stochastic system that depends on an unknown parameter. The fact that the value of the parameter is unknown implies that the system is unknown. We establish strong consistency of a least squares estimator for the parameter. Furthermore, motivated by recent work on distributed randomized methods for PageRank computation, we show that the least squares estimator remains strongly consistent within a distributed framework. Finally, we consider the problem of controlling the stochastic system model for PageRank. Under various cost criteria, we use the least squares estimates of the unknown parameter to iteratively construct an adaptive control policy whose performance, according to the long-run average cost, is equivalent to the optimal stationary control that would be used if we had knowledge of the true value of the parameter.
This research lays a foundation for future work in a number of areas, including testing the estimation and control procedures on real data or larger scale simulation models, considering more general parameter estimation methods such as weighted least squares, and introducing other types of control policies
A note on certain ergodicity coefficients
We investigate two ergodicity coefficients originally introduced to bound the subdominant eigenvalues of nonnegative matrices. The former has been generalized to complex matrices in recent years, and several properties of this generalized version have been shown so far. We provide a further result concerning the limit of its powers. Then we propose a generalization of the second coefficient and we show that, under mild conditions, it can be used to recast the eigenvector problem as a particular M-matrix linear system, whose coefficient matrix can be defined in terms of the entries of the original matrix. This property turns out to generalize the two known equivalent formulations of the PageRank centrality of a graph
- …
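The two equivalent formulations of PageRank referred to in this abstract are commonly stated as an eigenvector problem for the damped "Google matrix" and as a linear system with coefficient matrix I - αP, which is an M-matrix when P is column-stochastic and 0 < α < 1. The sketch below checks numerically that the two agree on a tiny graph. It covers only this standard PageRank case, not the generalized coefficient proposed in the paper; the function names, the uniform teleport vector, and α = 0.85 are assumptions for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def pagerank_linear(P, alpha=0.85):
    """Linear-system form: (I - alpha * P) x = (1 - alpha) * v, v uniform."""
    n = len(P)
    A = [[(1.0 if i == j else 0.0) - alpha * P[i][j] for j in range(n)]
         for i in range(n)]
    b = [(1.0 - alpha) / n] * n
    return solve(A, b)


def pagerank_eigen(P, alpha=0.85, iters=500):
    """Eigenvector form: fixed point of G = alpha * P + (1 - alpha) * v * 1^T."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iters):
        total = sum(x)
        x = [alpha * sum(P[i][j] * x[j] for j in range(n))
             + (1.0 - alpha) / n * total for i in range(n)]
    return x


if __name__ == "__main__":
    # Column-stochastic P for the graph a -> b, a -> c, b -> c, c -> a.
    P = [[0.0, 0.0, 1.0],
         [0.5, 0.0, 0.0],
         [0.5, 1.0, 0.0]]
    print(pagerank_linear(P))
    print(pagerank_eigen(P))
```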