193,698 research outputs found

    Learning Relatedness Measures for Entity Linking

    Entity Linking is the task of detecting, in text documents, relevant mentions of entities of a given knowledge base. To this end, entity-linking algorithms use several signals and features extracted from the input text or from the knowledge base. The most important of such features is entity relatedness. Indeed, we argue that these algorithms benefit from maximizing the relatedness among the relevant entities selected for annotation, since this minimizes errors in disambiguating entity-linking. The definition of an effective relatedness function is thus a crucial point in any entity-linking algorithm. In this paper we address the problem of learning high-quality entity relatedness functions. First, we formalize the problem of learning entity relatedness as a learning-to-rank problem. We propose a methodology to create reference datasets on the basis of manually annotated data. Finally, we show that our machine-learned entity relatedness function performs better than other relatedness functions previously proposed, and, more importantly, improves the overall performance of different state-of-the-art entity-linking algorithms.

    Factorizing lexical relatedness

    Graphics for relatedness research

    Studies of relatedness have been crucial in molecular ecology over the last decades. Good evidence of this is the fact that studies of population structure, evolution of social behaviours, genetic diversity and quantitative genetics all involve relatedness research. The main aim of this article is to review the most common graphical methods used in allele sharing studies for detecting and identifying family relationships. Both IBS and IBD based allele sharing studies are considered. Furthermore, we propose two additional graphical methods from the field of compositional data analysis: the ternary diagram and scatterplots of isometric log-ratios of IBS and IBD probabilities. We illustrate all graphical tools with genetic data from the HGDP-CEPH diversity panel, using mainly 377 microsatellites genotyped for 25 individuals from the Maya population of this panel. We enhance all graphics with convex hulls obtained by simulation and use these to confirm the documented relationships. The proposed compositional graphics are shown to be useful in relatedness research, as they also single out the most prominent related pairs. The ternary diagram is advocated for its ability to display all three allele sharing probabilities simultaneously. The log-ratio plots are advocated as an attempt to overcome the problems with the Euclidean distance interpretation in the classical graphics. Peer Reviewed. Postprint (published version).
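The isometric log-ratio idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a three-part composition of allele sharing probabilities (sharing 0, 1 or 2 alleles) and one common choice of orthonormal basis; the function name `ilr3` and the basis are assumptions made for illustration, not necessarily the ones the authors use.

```python
import math

def ilr3(p0, p1, p2):
    """Isometric log-ratio coordinates of a 3-part composition.

    Uses one standard orthonormal basis (an assumption, other bases exist):
      z1 = sqrt(1/2) * ln(p0 / p1)
      z2 = sqrt(2/3) * ln(sqrt(p0 * p1) / p2)
    """
    if min(p0, p1, p2) <= 0:
        raise ValueError("all parts must be strictly positive")
    z1 = math.sqrt(1 / 2) * math.log(p0 / p1)
    z2 = math.sqrt(2 / 3) * math.log(math.sqrt(p0 * p1) / p2)
    return z1, z2

# Hypothetical sharing probabilities for one pair of individuals.
z1, z2 = ilr3(0.5, 0.25, 0.25)
```

Because the coordinates depend only on ratios of the parts, rescaling all three probabilities by the same factor leaves them unchanged, and Euclidean distances between the resulting points are meaningful in a way they are not for the raw probabilities.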

    War and Relatedness

    We develop a theory of interstate conflict in which the degree of genealogical relatedness between populations has a positive effect on their conflict propensities because more closely related populations, on average, tend to interact more and develop more disputes over sets of common issues. We examine the empirical relationship between the occurrence of interstate conflicts and the degree of relatedness between countries, showing that populations that are genetically closer are more prone to go to war with each other, even after controlling for a wide set of measures of geographic distance and other factors that affect conflict, including measures of trade and democracy.

    War and relatedness

    We examine the empirical relationship between the occurrence of inter-state conflicts and the degree of relatedness between countries, measured by genetic distance. We find that populations that are genetically closer are more prone to go to war with each other, even after controlling for numerous measures of geographic distance and other factors that affect conflict, including measures of trade and democracy. These findings are consistent with a framework in which conflict over rival and excludable goods (such as territory and resources) is more likely among populations that share more similar preferences, and inherit such preferences with variation from their ancestors.

    Revealed Relatedness: Mapping Industry Space

    In this paper we measure technological relatedness between industries using a dataset on product portfolios of plants. For this purpose we first develop a general methodology to extract data on co-occurrences of classes (e.g. industries) in a single entity (e.g. a plant) to construct estimates of the relatedness between the classes. The core assumption, in line with the concept of economies of scope, is that if two products are produced in the same plant, this is an indication of relatedness between the industries the two products are a part of. Unlike earlier methods, we arrive at a Revealed Relatedness (RR) index that can be interpreted on a ratio scale, allows for the use of indirect (i.e. not directly observed) information on industry relatedness, and conceptualizes relatedness as being asymmetric or directed. Direction of relatedness provides information on, for example, the most likely direction of spillovers between two classes. We also graph the RR matrices using methods borrowed from social network analysis. The result is a visualization of the “industry space” and how that changes over time with structural transformation of the economy. In order to test the validity of the framework, the industry space is used to plot structural transformation paths of regions. It is shown that the RR matrix indeed has significant explanatory power for the composition and change of a region's portfolio of manufacturing industries, in spite of the fact that regional information played no role in its derivation. This confirms the quality of our RR estimates.
    Keywords: technological relatedness, industry relations, industry space, revealed relatedness
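The co-occurrence idea described in this abstract can be sketched roughly as follows. This is not the authors' RR index: it only counts how often two classes appear in the same entity and derives a simple directed conditional share, which is one naive way to obtain an asymmetric measure. The function names and the toy plant portfolios are hypothetical.

```python
from collections import Counter
from itertools import permutations

def cooccurrence_counts(portfolios):
    """Count entities per class and entities per ordered pair of classes."""
    pair_counts = Counter()
    class_counts = Counter()
    for classes in portfolios:
        unique = set(classes)  # ignore repeated classes within one entity
        class_counts.update(unique)
        pair_counts.update(permutations(sorted(unique), 2))
    return pair_counts, class_counts

def conditional_share(pair_counts, class_counts, i, j):
    """Share of entities active in class i that are also active in class j."""
    if class_counts[i] == 0:
        return 0.0
    return pair_counts[(i, j)] / class_counts[i]

# Toy data: each inner list is the set of industries observed in one plant.
plants = [
    ["steel", "autos"],
    ["steel"],
    ["autos", "tyres"],
    ["steel", "autos", "tyres"],
]
pairs, totals = cooccurrence_counts(plants)
tyres_to_autos = conditional_share(pairs, totals, "tyres", "autos")
autos_to_tyres = conditional_share(pairs, totals, "autos", "tyres")
```

In this toy data every plant making tyres also makes autos, while only some auto plants make tyres, so the two directed shares differ. That asymmetry is the kind of directional information the abstract refers to, although the actual RR index is constructed differently.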