research

Using Explicit Semantic Analysis for Cross-Lingual Link Discovery

Abstract

This paper explores how to automatically generate cross language links between resources in large document collections. The paper presents new methods for Cross Lingual Link Discovery(CLLD) based on Explicit Semantic Analysis (ESA). The methods are applicable to any multilingual document collection. In this report, we present their comparative study on the Wikipedia corpus and provide new insights into the evaluation of link discovery systems. In particular, we measure the agreement of human annotators in linking articles in different language versions of Wikipedia, and compare it to the results achieved by the presented methods

    Similar works