    A Weighted Correlation Index for Rankings with Ties

    Understanding the correlation between two different scores for the same set of items is a common problem in information retrieval, and the most commonly used statistic that quantifies this correlation is Kendall's $\tau$. However, the standard definition fails to capture that discordances between items with high rank are more important than those between items with low rank. Recently, a new measure of correlation based on average precision has been proposed to solve this problem, but like many alternative proposals in the literature it assumes that there are no ties in the scores. This is a major deficiency in a number of contexts, in particular when comparing centrality scores on large graphs, as the obvious baseline, indegree, has a very large number of ties in web and social graphs. We propose to extend Kendall's definition in a natural way to take into account weights in the presence of ties. We prove a number of interesting mathematical properties of our generalization and describe an $O(n \log n)$ algorithm for its computation. We also validate the usefulness of our weighted measure of correlation using experimental data.
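To make the idea of a weighted, tie-aware Kendall correlation concrete, here is a minimal sketch. It is not the paper's $O(n \log n)$ algorithm: it is a naive quadratic loop over all pairs, written under the assumption that the coefficient is a normalized sum of products of sign differences, each multiplied by a pair weight. The names `weighted_tau` and `hyperbolic` are hypothetical, and the hyperbolic weight assumes that items are indexed by decreasing importance (index 0 is the top item).

```python
import math
from itertools import combinations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def hyperbolic(i, j):
    """Illustrative pair weight (assumption): items are indexed by
    decreasing importance, so pairs near index 0 weigh more."""
    return 1.0 / (i + 1) + 1.0 / (j + 1)


def weighted_tau(r, s, weight=lambda i, j: 1.0):
    """Naive O(n^2) weighted Kendall-style correlation of score lists r, s.

    Tied pairs contribute 0 to the numerator, and the normalization
    shrinks accordingly. With the default constant weight this reduces
    to the classical tie-corrected coefficient (tau-b).
    """
    if len(r) != len(s):
        raise ValueError("score lists must have the same length")

    def inner(a, b):
        # Weighted sum of sign agreements over all unordered pairs.
        return sum(
            sign(a[i] - a[j]) * sign(b[i] - b[j]) * weight(i, j)
            for i, j in combinations(range(len(a)), 2)
        )

    denom = math.sqrt(inner(r, r) * inner(s, s))
    if denom == 0:
        raise ValueError("undefined when one score list is constant")
    return inner(r, s) / denom
```

For example, `weighted_tau([1, 1, 2], [1, 2, 3])` ignores the tied pair in the first list and divides by a correspondingly smaller normalization, while passing `weight=hyperbolic` makes disagreements among the first few items count more than those near the bottom.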

    A theory on power in networks

    The eigenvector centrality equation $\lambda x = A x$ is a successful compromise between simplicity and expressivity. It claims that central actors are those connected with central others. For at least 70 years, this equation has been explored in disparate contexts, including econometrics, sociometry, bibliometrics, Web information retrieval, and network science. We propose an equally elegant counterpart: the power equation $x = A x^{\div}$, where $x^{\div}$ is the vector whose entries are the reciprocals of those of $x$. It asserts that power is in the hands of those connected with powerless others. It is meaningful, for instance, in bargaining situations, where it is advantageous to be connected to those who have few options. We tell the parallel, mostly unexplored story of this intriguing equation.
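The two equations in this abstract can be illustrated with a short sketch. It assumes a small undirected graph given as a dense adjacency matrix (a list of lists), uses plain power iteration for the eigenvector equation, and only checks a candidate solution of the power equation rather than solving it. The function names `eigenvector_centrality` and `power_residual` are hypothetical and are not taken from the paper.

```python
def matvec(adj, x):
    """Multiply the square matrix adj (list of rows) by the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in adj]


def eigenvector_centrality(adj, iters=1000, tol=1e-12):
    """Power iteration for lambda * x = A x, scaled so that max(x) == 1.

    Assumes a connected, non-bipartite graph so the iteration converges.
    Returns the pair (lambda estimate, centrality vector).
    """
    x = [1.0] * len(adj)
    lam = 0.0
    for _ in range(iters):
        y = matvec(adj, x)
        lam = max(y)
        if lam == 0:
            raise ValueError("graph has no edges")
        y = [v / lam for v in y]
        converged = max(abs(a - b) for a, b in zip(x, y)) < tol
        x = y
        if converged:
            break
    return lam, x


def power_residual(adj, x):
    """Largest violation of the power equation x = A x^div, where
    x^div holds the entrywise reciprocals of x (x must be positive)."""
    recip = [1.0 / v for v in x]
    return max(abs(a - b) for a, b in zip(x, matvec(adj, recip)))


# A triangle (nodes 0, 1, 2) with a pendant node 3 attached to node 2.
paw = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
lam, centrality = eigenvector_centrality(paw)

# On a plain triangle each node has two neighbours, so x_i = sqrt(2)
# satisfies x_i = 2 / x_i, which is the power equation for that graph.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
triangle_power = [2 ** 0.5] * 3
```

In the first example node 2, which is linked to every other node, receives the largest eigenvector centrality. The triangle example only verifies a known solution of the power equation. Computing solutions on general graphs needs more care than this sketch provides.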

    COMPARING THE EFFECTIVENESS OF RANK CORRELATION STATISTICS

    Rank correlation is a fundamental tool to express dependence in cases in which the data are arranged in order. There are, by contrast, circumstances where the ordinal association is of a nonlinear type. In this paper we investigate the effectiveness of several measures of rank correlation. These measures have been divided into three classes: conventional rank correlations, weighted rank correlations, and correlations of scores. Our findings suggest that none is systematically better than the others in all circumstances. However, a simply weighted version of the Kendall rank correlation coefficient provides plausible answers to many special situations where intercategory distances could not be considered on the same basis.
    Keywords: Ordinal Data, Nonlinear Association, Weighted Rank Correlation

    Ranking Portfolio Performance: An Application of a Joint Means and Variances Equality Test

    We propose a new procedure to rank portfolio performance. Given a set of N portfolios, we use statistical tests of dominance which produce direct mean-variance comparisons between any two portfolios in the set. These tests yield an N×N matrix of pairwise comparisons. A ranking function maps the elements of the comparison matrix into a numerical ranking. To illustrate the procedure we use a set of 133 mutual funds, including the S&P500 index and the CRSP equal- and value-weighted indexes. We explore the empirical and theoretical relationships between our ranking procedure and the Treynor, Sharpe and Jensen performance measures. In general, the new procedure's ranking is relatively robust, does not allow for gaming and can be performed with small samples.

    AN EXHAUSTIVE COEFFICIENT OF RANK CORRELATION

    Rank association is a fundamental tool for expressing dependence in cases in which data are arranged in order. Measures of rank correlation have been accumulated in several contexts for more than a century and we were able to cite more than thirty of these coefficients, from simple ones to relatively complicated definitions invoking one or more systems of weights. However, only a few of these can actually be considered to be admissible substitutes for Pearson’s correlation. The main drawback with the vast majority of coefficients is their “resistance-to-change”, which appears to be of limited value for the purposes of rank comparisons that are intrinsically robust. In this article, a new nonparametric correlation coefficient is defined that is based on the principle of maximization of a ratio of two ranks. In comparing it with existing rank correlations, it was found to have extremely high sensitivity to permutation patterns. We have illustrated the potential improvement that our index can provide in economic contexts by comparing published results with those obtained through the use of this new index. The success that we have had suggests that our index may have important applications wherever the discriminatory power of the rank correlation coefficient should be particularly strong.
    Keywords: Ordinal data, Nonparametric agreement, Economic applications

    Structure of Heterogeneous Networks

    Heterogeneous networks play a key role in the evolution of communities and the decisions individuals make. These networks link different types of entities, for example, people and the events they attend. Network analysis algorithms usually project such networks onto simple graphs composed of entities of a single type. In the process, they conflate relations between entities of different types and lose important structural information. We develop a mathematical framework that can be used to compactly represent and analyze heterogeneous networks that combine multiple entity and link types. We generalize Bonacich centrality, which measures connectivity between nodes by the number of paths between them, to heterogeneous networks and use this measure to study network structure. Specifically, we extend the popular modularity-maximization method for community detection to use this centrality metric. We also rank nodes based on their connectivity to other nodes. One advantage of this centrality metric is that it has a tunable parameter we can use to set the length scale of interactions. Studying how rankings change with this parameter allows us to identify important nodes in the network. We apply the proposed method to analyze the structure of several heterogeneous networks. We show that exploiting additional sources of evidence corresponding to links between, as well as among, different entity types yields new insights into network structure.