Towards a Universal Wordnet by Learning from Combined Evidence
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database in which words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.
RankMerging: A supervised learning-to-rank framework to predict links in large social networks
Uncovering unknown or missing links in social networks is a difficult task
because of their sparsity and because links may represent different types of
relationships, characterized by different structural patterns. In this paper,
we define a simple yet efficient supervised learning-to-rank framework, called
RankMerging, which aims at combining information provided by various
unsupervised rankings. We illustrate our method on three different kinds of
social networks and show that it substantially improves the performance of
unsupervised ranking metrics. We also compare it to other combination
strategies based on standard methods. Finally, we explore various aspects of
RankMerging, such as feature selection and parameter estimation, and discuss its
area of relevance: the prediction of an adjustable number of links on large
networks.
Comment: 43 pages, published in Machine Learning Journal