Ranking Triples using Entity Links in a Large Web Crawl - The Chicory Triple Scorer at WSDM Cup 2017

Authors: Wouter Alink, Roberto Cornacchia, Arjen P. de Vries, Frank Dorssers
Publication date: 22/12/2017

This paper describes the participation of team Chicory in the Triple Ranking Challenge of the WSDM Cup 2017. Our approach uses a large collection of entity-tagged web data to estimate the correctness of the relevance relation expressed by the triples, in combination with a baseline approach using Wikipedia abstracts following [1]. Relevance estimations are drawn from ClueWeb12 annotated by Google's entity linker, available publicly as the FACC1 dataset. Our implementation is automatically generated from a so-called 'search strategy' that specifies declaratively how the input data are combined into a final ranking of triples.

Comment: Triple Scorer at WSDM Cup 2017, see arXiv:1712.0808

arXiv.org e-Print Archive

Overview of the Triple Scoring Task at the WSDM Cup 2017

Authors: Hannah Bast, Björn Buchhold, Elmar Haussmann
Publication date: 21/12/2017

This paper provides an overview of the triple scoring task at the WSDM Cup 2017, including a description of the task and the dataset, an overview of the participating teams and their results, and a brief account of the methods employed. In a nutshell, the task was to compute relevance scores for knowledge-base triples from relations where such scores make sense. Due to the way the ground truth was constructed, scores were required to be integers in the range 0..7. For example, reasonable scores for the triples "Tim Burton profession Director" and "Tim Burton profession Actor" would be 7 and 2, respectively, because Tim Burton is well known as a director but acted only in a few lesser-known movies. The triple scoring task attracted considerable interest, with 52 initial registrations and 21 teams that submitted a valid run before the deadline. The winning team achieved an accuracy of 87%; that is, for that fraction of the triples in the test set (which was revealed only after the deadline), the predicted score differed from the ground-truth score by at most 2. The best result for the average difference from the test set scores was 1.50.
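The two measures mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes that "accuracy" is the fraction of triples whose predicted score is within 2 of the ground-truth score, as stated above, and that the "average difference" is the mean absolute difference between predicted and ground-truth scores. The function names and the toy scores are hypothetical and are not taken from the paper or the task's official evaluation code.

```python
from typing import Sequence


def accuracy_within(predicted: Sequence[int], truth: Sequence[int], tolerance: int = 2) -> float:
    """Fraction of triples whose predicted score differs from the
    ground-truth score by at most `tolerance` (assumed definition)."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("score lists must be non-empty and of equal length")
    hits = sum(1 for p, t in zip(predicted, truth) if abs(p - t) <= tolerance)
    return hits / len(truth)


def average_score_difference(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Mean absolute difference between predicted and ground-truth scores
    (assumed reading of 'average difference')."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical toy data: integer scores in the range 0..7 for four triples.
truth_scores = [7, 2, 5, 0]
predicted_scores = [6, 5, 5, 1]

print(accuracy_within(predicted_scores, truth_scores))           # 0.75
print(average_score_difference(predicted_scores, truth_scores))  # 1.25
```

In the toy data, the absolute differences are 1, 3, 0 and 1, so three of the four predictions fall within the tolerance of 2 and the mean absolute difference is 5/4.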

arXiv.org e-Print Archive