137 research outputs found

    The importance of Perron-Frobenius Theorem in ranking problems

    The problem of ranking a set of elements, namely giving a "rank" to the elements of the set, may be tackled in many different ways. In particular, a mathematically based ranking scheme can be used, and it may be interesting to see how much its results differ from those of more heuristic approaches. This working paper presents some remarks on the importance, in a mathematical approach to ranking schemes, of a classical result from linear algebra, the Perron–Frobenius theorem. To motivate this importance, two contexts in which a ranking problem arises are considered: ranking football/soccer teams, and ranking webpages in the approach proposed and implemented by Google's PageRank algorithm.

    PageRank: Standing on the shoulders of giants

    PageRank is a Web page ranking technique that has been a fundamental ingredient in the development and success of the Google search engine. The method is still one of the many signals that Google uses to determine which pages are most important. The main idea behind PageRank is to determine the importance of a Web page in terms of the importance assigned to the pages hyperlinking to it. In fact, this thesis is not new, and has been previously successfully exploited in different contexts. We review the PageRank method and link it to some renowned previous techniques that we have found in the fields of Web information retrieval, bibliometrics, sociometry, and econometrics.
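
The idea that a page is important when important pages link to it can be illustrated with a small power iteration. The sketch below is not taken from the paper: the function name `pagerank`, the damping factor of 0.85, the uniform redistribution of rank from pages with no outgoing links, and the four-page example graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate PageRank scores by power iteration.

    links: dict mapping each page to the list of pages it links to
           (hypothetical input format chosen for this sketch).
    damping: probability of following a link; 0.85 is an assumed value.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform distribution

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread uniformly
        # (an assumption; other conventions exist).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank to the pages it links to.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)

        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph: "c" is linked from "a", "b", and "d".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
top_page = max(ranks, key=ranks.get)
```

In this toy graph the scores sum to one and `top_page` is the page with the most incoming rank, which matches the intuition stated in the abstract. A production implementation would use sparse matrices rather than dictionaries.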

    The algebraic approach to some ranking problems

    The problem of ranking a set of elements, namely giving a "rank" to the elements of the set, may arise in very different contexts and may be handled in different ways, depending on how these elements compete against one another. For example, there are contexts in which we deal with an evenly paired competition, in the sense that the pairings are evenly matched: in a national soccer championship, for example, each team is paired with every other team the same number of times. Sometimes we may deal with an unevenly paired competition: in the UEFA Champions League, for example, the pairings are not fully covered, and only some pairings are set, for instance by means of a random selection process. Mathematically based ranking schemes can be used and may show interesting connections between ranking problems and classical theoretical results. In this working paper we first show how a linear scheme in the ranking process leads directly to some fundamental linear algebra concepts and results, mainly the eigenvalues and eigenvectors of linear transformations and the Perron–Frobenius theorem. We also apply the linear ranking model to a numerical simulation, taking the data from the Italian soccer championship 2015-2016. We finally point out some interesting differences in the final ranking by comparing the actual placements of the teams at the end of the contest with the mathematical scores assigned to the teams by the theoretical model.

    Convergence of Tomlin's HOTS algorithm

    The HOTS algorithm uses the hyperlink structure of the web to compute a vector of scores with which one can rank web pages. The HOTS vector is the vector of the exponentials of the dual variables of an optimal flow problem (the "temperature" of each page). The flow represents an optimal distribution of web surfers on the web graph in the sense of entropy maximization. In this paper, we prove the convergence of Tomlin's HOTS algorithm. We first study a simplified version of the algorithm, which is a fixed point scaling algorithm designed to solve the matrix balancing problem for nonnegative irreducible matrices. The proof of convergence is general (nonlinear Perron-Frobenius theory) and applies to a family of deformations of HOTS. Then, we address the effective HOTS algorithm, designed by Tomlin for the ranking of web pages. The model is a network entropy maximization problem generalizing matrix balancing. We show that, under mild assumptions, the HOTS algorithm converges with a linear convergence rate. The proof relies on a uniqueness property of the fixed point and on the existence of a Lyapunov function. We also show that the coordinate descent algorithm can be used to find the ideal and effective HOTS vectors and we compare HOTS and coordinate descent on fragments of the web graph. Our numerical experiments suggest that the convergence rate of the HOTS algorithm may deteriorate when the size of the input increases. We thus give a normalized version of HOTS with an experimentally better convergence rate. Comment: 21 pages

    An Oracle Method to Predict NFL Games

    Multiple models for ranking teams in a league are discussed, and a new model called the Oracle method is introduced. This is a Markovian method that can be customized to incorporate multiple team traits into its ranking. Using a foresight prediction of NFL game outcomes for the 2002–2013 seasons, it is shown that the Oracle method correctly picked 64.1% of the games under consideration, which is higher than any of the methods compared, including ESPN Power Rankings, Massey, Colley, and PageRank.

    Perron vector optimization applied to search engines

    In recent years, Google's PageRank optimization problems have been extensively studied. In that case, the ranking is given by the invariant measure of a stochastic matrix. In this paper, we consider the more general situation in which the ranking is determined by the Perron eigenvector of a nonnegative, but not necessarily stochastic, matrix, in order to cover Kleinberg's HITS algorithm. We also give some results for Tomlin's HOTS algorithm. The problem then consists in finding an optimal outlink strategy subject to design constraints and for a given search engine. We study the relaxed versions of these problems, which means that we accept weighted hyperlinks. We provide an efficient algorithm for the computation of the matrix of partial derivatives of the criterion, which uses the low rank property of this matrix. We give a scalable algorithm that couples gradient and power iterations and gives a local minimum of the Perron vector optimization problem. We prove convergence by considering it as an approximate gradient method. We then show that optimal linkage strategies of HITS and HOTS optimization problems satisfy a threshold property. We report numerical results on fragments of the real web graph for these search engine optimization problems. Comment: 28 pages, 5 figures

    Optimal Data Collection For Informative Rankings Expose Well-Connected Graphs

    Given a graph where vertices represent alternatives and arcs represent pairwise comparison data, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with the pairwise comparisons. Our goal in this paper is to develop a method for collecting data for which the least squares estimator for the ranking problem has maximal Fisher information. Our approach, based on experimental design, is to view data collection as a bi-level optimization problem where the inner problem is the ranking problem and the outer problem is to identify data which maximizes the informativeness of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding multigraphs with large algebraic connectivity. This reduction of the data collection problem to graph-theoretic questions is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking. As another application, we study the 2011-12 NCAA football schedule and propose schedules with the same number of games which are significantly more informative. Using spectral clustering methods to identify highly-connected communities within the division, we argue that the NCAA could improve its notoriously poor rankings by simply scheduling more out-of-conference games. Comment: 31 pages, 10 figures, 3 tables