Optimal Data Collection For Informative Rankings Expose Well-Connected Graphs
Given a graph where vertices represent alternatives and arcs represent
pairwise comparison data, the statistical ranking problem is to find a
potential function, defined on the vertices, such that the gradient of the
potential function agrees with the pairwise comparisons. Our goal in this paper
is to develop a method for collecting data for which the least squares
estimator for the ranking problem has maximal Fisher information. Our approach,
based on experimental design, is to view data collection as a bi-level
optimization problem where the inner problem is the ranking problem and the
outer problem is to identify data which maximizes the informativeness of the
ranking. Under certain assumptions, the data collection problem decouples,
reducing to a problem of finding multigraphs with large algebraic connectivity.
This reduction of the data collection problem to graph-theoretic questions is
one of the primary contributions of this work. As an application, we study the
Yahoo! Movie user rating dataset and demonstrate that the addition of a small
number of well-chosen pairwise comparisons can significantly increase the
Fisher informativeness of the ranking. As another application, we study the
2011-12 NCAA football schedule and propose schedules with the same number of
games which are significantly more informative. Using spectral clustering
methods to identify highly-connected communities within the division, we argue
that the NCAA could improve its notoriously poor rankings by simply scheduling
more out-of-conference games. Comment: 31 pages, 10 figures, 3 tables
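The two quantities in this abstract, the least-squares potential and the algebraic connectivity of the comparison multigraph, can be illustrated with a minimal sketch. It assumes NumPy, unit weights on every comparison, and a hypothetical four-item dataset; the function names are illustrative and are not taken from the paper.

```python
import numpy as np

def incidence_matrix(n_items, comparisons):
    """One row per arc (i, j, y): -1 at i, +1 at j, so (B @ phi)[k] = phi[j] - phi[i]."""
    B = np.zeros((len(comparisons), n_items))
    for k, (i, j, _) in enumerate(comparisons):
        B[k, i] = -1.0
        B[k, j] = 1.0
    return B

def least_squares_ranking(n_items, comparisons):
    """Minimum-norm potential phi minimizing sum_k (phi[j] - phi[i] - y_k)^2."""
    B = incidence_matrix(n_items, comparisons)
    y = np.array([y_k for _, _, y_k in comparisons], dtype=float)
    phi, *_ = np.linalg.lstsq(B, y, rcond=None)
    return phi

def algebraic_connectivity(n_items, comparisons):
    """Second-smallest eigenvalue of the multigraph Laplacian B^T B."""
    B = incidence_matrix(n_items, comparisons)
    return float(np.linalg.eigvalsh(B.T @ B)[1])

# Hypothetical data: (i, j, y) means item j is rated y above item i.
path = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
cycle = path + [(3, 0, -3.0)]  # one extra, well-chosen comparison

phi = least_squares_ranking(4, path)
lam_path = algebraic_connectivity(4, path)
lam_cycle = algebraic_connectivity(4, cycle)
print(phi, lam_path, lam_cycle)
```

In this toy case the single added comparison closes the path into a cycle and raises the algebraic connectivity, which is the kind of effect the abstract describes for well-chosen additional comparisons.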
Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs
Crowdsourcing platforms are now extensively used for conducting subjective
pairwise comparison studies. In this setting, a pairwise comparison dataset is
typically gathered via random sampling, either \emph{with} or \emph{without}
replacement. In this paper, we use tools from random graph theory to analyze
these two random sampling methods for the HodgeRank estimator. Using the
Fiedler value of the graph as a measurement for estimator stability
(informativeness), we provide a new estimate of the Fiedler value for these two
random graph models. In the asymptotic limit as the number of vertices tends to
infinity, we prove the validity of the estimate. Based on our findings, for a
small number of items to be compared, we recommend a two-stage sampling
strategy where a greedy sampling method is used initially and random sampling
\emph{without} replacement is used in the second stage. When a large number of
items is to be compared, we recommend random sampling with replacement as this
is computationally inexpensive and trivially parallelizable. Experiments on
synthetic and real-world datasets support our analysis.
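A small simulation can make the comparison of the two sampling schemes concrete. The sketch below assumes NumPy, uniform sampling over all unordered pairs, and arbitrary illustrative sizes (16 items, 60 comparisons, 200 trials); it estimates the mean Fiedler value empirically and does not reproduce the paper's analytical estimate.

```python
import itertools
import random
import numpy as np

def fiedler_value(n_items, pairs):
    """Second-smallest Laplacian eigenvalue of the comparison multigraph."""
    L = np.zeros((n_items, n_items))
    for i, j in pairs:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return float(np.linalg.eigvalsh(L)[1])

def sample_pairs(n_items, n_samples, replace, rng):
    """Draw unordered pairs uniformly, with or without replacement."""
    all_pairs = list(itertools.combinations(range(n_items), 2))
    if replace:
        return rng.choices(all_pairs, k=n_samples)
    return rng.sample(all_pairs, k=n_samples)

rng = random.Random(0)
n_items, n_samples, n_trials = 16, 60, 200  # illustrative sizes only

mean_with = np.mean([
    fiedler_value(n_items, sample_pairs(n_items, n_samples, True, rng))
    for _ in range(n_trials)
])
mean_without = np.mean([
    fiedler_value(n_items, sample_pairs(n_items, n_samples, False, rng))
    for _ in range(n_trials)
])
print(f"with replacement: {mean_with:.3f}, without: {mean_without:.3f}")
```

Repeated pairs drawn with replacement become parallel edges, so the Laplacian here is that of a multigraph rather than a simple graph.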
Average resistance of toroidal graphs
The average effective resistance of a graph is a relevant performance index
in many applications, including distributed estimation and control of network
systems. In this paper, we study how the average resistance depends on the
graph topology and specifically on the dimension of the graph. We concentrate
on $d$-dimensional toroidal grids and we exploit the connection between
resistance and Laplacian eigenvalues. Our analysis provides tight estimates of
the average resistance, which are key to study its asymptotic behavior when the
number of nodes grows to infinity. In dimension two, the average resistance
diverges: in this case, we are able to capture its rate of growth when the
sides of the grid grow at different rates. In higher dimensions, the average
resistance is bounded uniformly in the number of nodes: in this case, we
conjecture that its value is of order $1/d$ for large $d$. We prove this fact
for hypercubes and when the side lengths go to infinity. Comment: 24 pages, 6 figures, to appear in SIAM Journal on Control and
Optimization (SICON)
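The connection between resistance and Laplacian eigenvalues can be sketched numerically. The code below assumes an unweighted torus with every side length at least 3, whose Laplacian eigenvalues are sums of the cycle eigenvalues $2 - 2\cos(2\pi k_i / n_i)$, and it assumes the normalization $R_{\mathrm{ave}} = \frac{1}{N}\sum_{\lambda \neq 0} 1/\lambda$; the paper's own normalization may differ by a constant factor.

```python
import itertools
import math

def torus_average_resistance(sides):
    """Average effective resistance of a toroidal grid from its Laplacian spectrum.

    sides: list of side lengths n_1, ..., n_d (each at least 3).
    Uses the assumed normalization (1/N) * sum of 1/lambda over nonzero eigenvalues.
    """
    if any(n < 3 for n in sides):
        raise ValueError("each side length must be at least 3")
    n_nodes = math.prod(sides)
    total = 0.0
    for k in itertools.product(*(range(n) for n in sides)):
        if not any(k):
            continue  # skip the zero eigenvalue (constant eigenvector)
        lam = sum(2.0 - 2.0 * math.cos(2.0 * math.pi * k_i / n)
                  for k_i, n in zip(k, sides))
        total += 1.0 / lam
    return total / n_nodes

# Compare growth in dimension two with dimension three.
for n in (4, 8, 16):
    print(n, torus_average_resistance([n, n]), torus_average_resistance([n, n, n]))
```

Printing the values for increasing side length gives a quick way to observe the qualitative contrast stated in the abstract between dimension two and higher dimensions.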
Distributed Estimation from Relative and Absolute Measurements
This note defines the problem of least-squares distributed estimation from relative and absolute measurements, by encoding the set of measurements in a weighted undirected graph. The role of its topology is studied by an electrical interpretation, which easily allows distinguishing between topologies that lead to "small" or "large" estimation errors. The least-squares problem is solved by a distributed gradient algorithm: the computed solution is approximately optimal after a number of steps that does not depend on the size of the problem or on the graph-theoretic properties of its encoding. This fact indicates that only a limited cooperation between the sensors is necessary.
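A distributed gradient iteration of this kind can be sketched as follows. The sketch assumes unit weights, one absolute measurement per node, a fixed step size, and synchronous updates; these choices, the function name, and the sample data are illustrative assumptions and are not taken from the note itself.

```python
def distributed_gradient(n_nodes, relative, absolute, step=0.1, n_steps=500):
    """Gradient descent on
        0.5 * sum_i (x[i] - a_i)^2 + 0.5 * sum_(i,j,b) (x[j] - x[i] - b)^2,
    where each node updates using only its own and its neighbors' values.

    relative: list of (i, j, b) with b approximately x[j] - x[i]
    absolute: list with absolute[i] approximately x[i]
    """
    neighbors = [[] for _ in range(n_nodes)]
    for i, j, b in relative:
        neighbors[i].append((j, -b))  # node i expects x[i] close to x[j] - b
        neighbors[j].append((i, b))   # node j expects x[j] close to x[i] + b
    x = list(absolute)  # start from the absolute measurements
    for _ in range(n_steps):
        new_x = []
        for i in range(n_nodes):
            grad = x[i] - absolute[i]
            for j, offset in neighbors[i]:
                grad += x[i] - (x[j] + offset)
            new_x.append(x[i] - step * grad)
        x = new_x  # synchronous update of all nodes
    return x

# Hypothetical noisy measurements on a path of four nodes.
relative = [(0, 1, 1.1), (1, 2, 0.9), (2, 3, 1.0)]
absolute = [0.2, 0.9, 2.1, 2.8]
print(distributed_gradient(4, relative, absolute))
```

The step size must stay below 2 divided by the largest eigenvalue of the Hessian (identity plus graph Laplacian under these assumptions); 0.1 is safe for this small path but would need adjusting for graphs with high-degree nodes.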
Distributed estimation from relative measurements of heterogeneous and uncertain quality
This paper studies the problem of estimation from relative measurements in a
graph, in which a vector indexed over the nodes has to be reconstructed from
pairwise measurements of differences between its components associated to nodes
connected by an edge. In order to model heterogeneity and uncertainty of the
measurements, we assume them to be affected by additive noise distributed
according to a Gaussian mixture. In this original setup, we formulate the
problem of computing the Maximum-Likelihood (ML) estimates and we design two
novel algorithms, based on Least Squares regression and
Expectation-Maximization (EM). The first algorithm (LS-EM) is centralized and
performs the estimation from relative measurements, the soft classification of
the measurements, and the estimation of the noise parameters. The second
algorithm (Distributed LS-EM) is distributed and performs estimation and soft
classification of the measurements, but requires the knowledge of the noise
parameters. We provide rigorous proofs of convergence of both algorithms and we
present numerical experiments to evaluate and compare their performance with
classical solutions. The experiments show the robustness of the proposed
methods against different kinds of noise and, for the Distributed LS-EM,
against errors in the knowledge of noise parameters. Comment: Submitted to IEEE transaction
Local Difference Measures between Complex Networks for Dynamical System Model Evaluation
Acknowledgments: We thank Reik V. Donner for inspiring suggestions that initialized the work presented herein. Jan H. Feldhoff is credited for providing us with the STARS simulation data and for his contributions to fruitful discussions. Comments by the anonymous reviewers are gratefully acknowledged as they led to substantial improvements of the manuscript. Peer reviewed.