
    Optimal Data Collection For Informative Rankings Expose Well-Connected Graphs

    Given a graph where vertices represent alternatives and arcs represent pairwise comparison data, the statistical ranking problem is to find a potential function, defined on the vertices, such that the gradient of the potential function agrees with the pairwise comparisons. Our goal in this paper is to develop a method for collecting data for which the least squares estimator for the ranking problem has maximal Fisher information. Our approach, based on experimental design, is to view data collection as a bi-level optimization problem where the inner problem is the ranking problem and the outer problem is to identify data which maximizes the informativeness of the ranking. Under certain assumptions, the data collection problem decouples, reducing to a problem of finding multigraphs with large algebraic connectivity. This reduction of the data collection problem to graph-theoretic questions is one of the primary contributions of this work. As an application, we study the Yahoo! Movie user rating dataset and demonstrate that the addition of a small number of well-chosen pairwise comparisons can significantly increase the Fisher informativeness of the ranking. As another application, we study the 2011-12 NCAA football schedule and propose schedules with the same number of games which are significantly more informative. Using spectral clustering methods to identify highly-connected communities within the division, we argue that the NCAA could improve its notoriously poor rankings by simply scheduling more out-of-conference games.
    Comment: 31 pages, 10 figures, 3 tables
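
The two quantities this abstract relies on, the least squares ranking and the algebraic connectivity of the comparison multigraph, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the unit edge weights, and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

def laplacian(n, edges):
    """Laplacian of a multigraph on n vertices; a repeated edge adds weight (assumed unit weights)."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return L

def algebraic_connectivity(n, edges):
    """Second-smallest Laplacian eigenvalue (eigvalsh returns eigenvalues in ascending order)."""
    return float(np.linalg.eigvalsh(laplacian(n, edges))[1])

def least_squares_ranking(n, comparisons):
    """Potential s minimizing the sum of (s[j] - s[i] - y)^2 over comparisons (i, j, y).

    lstsq returns the minimum-norm solution, which has zero mean on a connected graph.
    """
    B = np.zeros((len(comparisons), n))
    y = np.zeros(len(comparisons))
    for k, (i, j, y_ij) in enumerate(comparisons):
        B[k, i], B[k, j], y[k] = -1.0, 1.0, y_ij
    s, *_ = np.linalg.lstsq(B, y, rcond=None)
    return s

# Hypothetical example: closing a 4-vertex path into a cycle adds one comparison
# and raises the algebraic connectivity.
path = [(0, 1), (1, 2), (2, 3)]
cycle = path + [(3, 0)]
lambda_path = algebraic_connectivity(4, path)
lambda_cycle = algebraic_connectivity(4, cycle)
scores = least_squares_ranking(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)])
```

In this toy case a single extra comparison between the two endpoints more than triples the algebraic connectivity, which is the kind of effect the abstract describes for well-chosen comparisons.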

    Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs

    Crowdsourcing platforms are now extensively used for conducting subjective pairwise comparison studies. In this setting, a pairwise comparison dataset is typically gathered via random sampling, either with or without replacement. In this paper, we use tools from random graph theory to analyze these two random sampling methods for the HodgeRank estimator. Using the Fiedler value of the graph as a measurement for estimator stability (informativeness), we provide a new estimate of the Fiedler value for these two random graph models. In the asymptotic limit as the number of vertices tends to infinity, we prove the validity of the estimate. Based on our findings, for a small number of items to be compared, we recommend a two-stage sampling strategy where a greedy sampling method is used initially and random sampling without replacement is used in the second stage. When a large number of items is to be compared, we recommend random sampling with replacement as this is computationally inexpensive and trivially parallelizable. Experiments on synthetic and real-world datasets support our analysis.
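
The two sampling schemes compared in this abstract can be sketched as follows. This is an illustrative simulation only: `sample_edges`, `fiedler_value`, and the uniform choice over vertex pairs are assumptions, not the estimator or the experiments from the paper.

```python
import itertools
import random
import numpy as np

def sample_edges(n, m, replace, rng):
    """Draw m vertex pairs uniformly at random, with or without replacement."""
    pairs = list(itertools.combinations(range(n), 2))
    if replace:
        return rng.choices(pairs, k=m)   # pairs may repeat (multigraph)
    return rng.sample(pairs, m)          # all pairs distinct (simple graph)

def fiedler_value(n, edges):
    """Second-smallest eigenvalue of the multigraph Laplacian built from the sampled pairs."""
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return float(np.linalg.eigvalsh(L)[1])

# Hypothetical setup: 6 items and 15 comparisons, i.e. as many as there are pairs.
rng = random.Random(0)
without = sample_edges(6, 15, replace=False, rng=rng)
with_repl = sample_edges(6, 15, replace=True, rng=rng)
fiedler_without = fiedler_value(6, without)
fiedler_with = fiedler_value(6, with_repl)
```

With this budget, sampling without replacement necessarily yields the complete graph, whereas sampling with replacement may repeat some pairs and miss others, so its Fiedler value cannot exceed that of the complete graph.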

    Average resistance of toroidal graphs

    The average effective resistance of a graph is a relevant performance index in many applications, including distributed estimation and control of network systems. In this paper, we study how the average resistance depends on the graph topology and specifically on the dimension of the graph. We concentrate on d-dimensional toroidal grids and we exploit the connection between resistance and Laplacian eigenvalues. Our analysis provides tight estimates of the average resistance, which are key to studying its asymptotic behavior when the number of nodes grows to infinity. In dimension two, the average resistance diverges: in this case, we are able to capture its rate of growth when the sides of the grid grow at different rates. In higher dimensions, the average resistance is bounded uniformly in the number of nodes: in this case, we conjecture that its value is of order 1/d for large d. We prove this fact for hypercubes and when the side lengths go to infinity.
    Comment: 24 pages, 6 figures, to appear in SIAM Journal on Control and Optimization (SICON)
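
The connection between resistance and Laplacian eigenvalues mentioned here can be made concrete with a short sketch. It assumes unit edge weights, side lengths of at least 3, and the normalization R_ave = (1/N) times the sum of 1/λ over nonzero eigenvalues; the paper may normalize differently, and the function name is hypothetical.

```python
import itertools
import math

def average_resistance(sides):
    """Average effective resistance of a toroidal grid with the given side lengths.

    Uses the closed-form Laplacian eigenvalues of a product of cycles:
    lambda_h = sum_k (2 - 2*cos(2*pi*h_k / n_k)), with h_k in 0..n_k-1.
    Assumed normalization: (1/N) * sum of 1/lambda over nonzero eigenvalues.
    """
    n_nodes = math.prod(sides)
    total = 0.0
    for h in itertools.product(*(range(n) for n in sides)):
        if not any(h):
            continue  # h = (0, ..., 0) gives the zero eigenvalue; skip it
        lam = sum(2.0 - 2.0 * math.cos(2.0 * math.pi * hk / nk)
                  for hk, nk in zip(h, sides))
        total += 1.0 / lam
    return total / n_nodes

# Hypothetical comparison of a growing 2D torus with a growing 3D torus.
r_2d_small = average_resistance((10, 10))
r_2d_large = average_resistance((20, 20))
r_3d_small = average_resistance((5, 5, 5))
r_3d_large = average_resistance((10, 10, 10))
```

Evaluating the function on larger and larger grids gives a numerical feel for the contrast the abstract describes between dimension two and higher dimensions.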

    Distributed Estimation from Relative and Absolute Measurements

    This note defines the problem of least-squares distributed estimation from relative and absolute measurements, by encoding the set of measurements in a weighted undirected graph. The role of its topology is studied by an electrical interpretation, which easily allows distinguishing between topologies that lead to "small" or "large" estimation errors. The least-squares problem is solved by a distributed gradient algorithm: the computed solution is approximately optimal after a number of steps that does not depend on the size of the problem or on the graph-theoretic properties of its encoding. This fact indicates that only limited cooperation between the sensors is necessary.
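
A gradient iteration for this kind of least-squares problem can be sketched as below. The sketch assumes unit weights, a fixed step size, and a cost equal to half the sum of squared residuals; it simulates the per-node updates in a single loop and is not the algorithm or the step-count analysis from the note. All names are hypothetical.

```python
def gradient_estimate(n, relative, absolute, step=0.2, iterations=2000):
    """Gradient descent on 0.5 * (sum of squared relative and absolute residuals).

    relative: list of (i, j, b), meaning x[i] - x[j] is measured as b.
    absolute: dict mapping node i to a measurement a of x[i].
    Each node's update uses only its own value and the values of its neighbors,
    so the loop below is a centralized simulation of a distributed scheme.
    """
    x = [0.0] * n
    for _ in range(iterations):
        grad = [0.0] * n
        for i, j, b in relative:
            residual = x[i] - x[j] - b
            grad[i] += residual
            grad[j] -= residual
        for i, a in absolute.items():
            grad[i] += x[i] - a
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

# Hypothetical noise-free example on a 3-node path with true values [1, 3, 6]:
# two relative measurements plus one absolute measurement at node 0.
estimate = gradient_estimate(
    3,
    relative=[(1, 0, 2.0), (2, 1, 3.0)],
    absolute={0: 1.0},
)
```

The step size must stay below 2 divided by the largest eigenvalue of the Hessian (the graph Laplacian plus the indicator of absolutely measured nodes) for the iteration to converge; the value 0.2 is chosen to satisfy that bound for this small example only.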

    Distributed estimation from relative measurements of heterogeneous and uncertain quality

    This paper studies the problem of estimation from relative measurements in a graph, in which a vector indexed over the nodes has to be reconstructed from pairwise measurements of differences between its components associated with nodes connected by an edge. In order to model heterogeneity and uncertainty of the measurements, we assume them to be affected by additive noise distributed according to a Gaussian mixture. In this original setup, we formulate the problem of computing the Maximum-Likelihood (ML) estimates and we design two novel algorithms, based on Least Squares regression and Expectation-Maximization (EM). The first algorithm (LS-EM) is centralized and performs the estimation from relative measurements, the soft classification of the measurements, and the estimation of the noise parameters. The second algorithm (Distributed LS-EM) is distributed and performs estimation and soft classification of the measurements, but requires the knowledge of the noise parameters. We provide rigorous proofs of convergence of both algorithms and we present numerical experiments to evaluate and compare their performance with classical solutions. The experiments show the robustness of the proposed methods against different kinds of noise and, for the Distributed LS-EM, against errors in the knowledge of noise parameters.
    Comment: Submitted to IEEE transaction

    Local Difference Measures between Complex Networks for Dynamical System Model Evaluation

    Acknowledgments: We thank Reik V. Donner for inspiring suggestions that initialized the work presented herein. Jan H. Feldhoff is credited for providing us with the STARS simulation data and for his contributions to fruitful discussions. Comments by the anonymous reviewers are gratefully acknowledged as they led to substantial improvements of the manuscript. Peer reviewed.