
    Building Semantic Corpus from WordNet

    We propose a novel methodology for extracting semantic similarity knowledge from semi-structured sources such as WordNet. Unlike existing approaches that explore only the structured information (e.g., the hypernym relationship in WordNet), we present a framework that allows us to utilize all available information, including natural language descriptions. Our approach constructs a semantic corpus, represented as a graph whose numerically weighted edges model the relationships between phrases. The data in the semantic corpus can be used to measure the similarity between phrases, to measure the similarity between documents, or to perform semantic search over a set of documents that uses the meaning of words and phrases (i.e., search that is not keyword-based).
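
    The abstract describes the semantic corpus as a graph with numeric relatedness weights between phrases, derived from both WordNet's hypernym structure and its natural-language glosses. The following minimal sketch illustrates that general idea only; it is not the paper's implementation, and the NLTK interface, the equal weighting of structured and gloss evidence, and the toy word list are all assumptions.

        # Minimal sketch (not the paper's method): build a small weighted graph
        # over phrases from WordNet, combining structured links (hypernym paths)
        # with gloss text, then use it to score phrase relatedness.
        # Assumes NLTK with the WordNet corpus installed (nltk.download('wordnet')).
        from collections import defaultdict
        from nltk.corpus import wordnet as wn

        def build_graph(words):
            """Map each (phrase, phrase) pair to a numeric relatedness weight."""
            graph = defaultdict(float)
            synsets = {w: wn.synsets(w) for w in words}
            for a in words:
                for b in words:
                    if a >= b:
                        continue
                    best = 0.0
                    for sa in synsets[a]:
                        for sb in synsets[b]:
                            # Structured evidence: path similarity over the hypernym hierarchy.
                            structured = sa.path_similarity(sb) or 0.0
                            # Unstructured evidence: overlap between the glosses
                            # (natural-language definitions) of the two senses.
                            ga, gb = set(sa.definition().split()), set(sb.definition().split())
                            gloss = len(ga & gb) / max(len(ga | gb), 1)
                            # Equal weighting of the two evidence types is an assumption.
                            best = max(best, 0.5 * structured + 0.5 * gloss)
                    graph[(a, b)] = best
            return graph

        if __name__ == "__main__":
            g = build_graph(["car", "automobile", "banana"])
            print(g[("automobile", "car")])   # high weight: near-synonyms
            print(g[("banana", "car")])       # low weight: unrelated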

    Building a wordnet for Turkish

    This paper summarizes the development process of a wordnet for Turkish as part of the Balkanet project. After discussing the basic methodological issues that had to be resolved during the course of the project, the paper presents the main steps of the construction process in chronological order. Two applications using the Turkish wordnet are summarized, and links to resources for wordnet builders are provided at the end of the paper.

    Towards a Universal Wordnet by Learning from Combined Evidence

    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.
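
    The abstract describes linking words of many languages to WordNet senses by combining evidence from resources such as bilingual dictionaries and scoring candidate links over a graph. The sketch below only illustrates the evidence-combination idea in the simplest form; the toy dictionary, the scoring rule, and the candidate_links helper are illustrative assumptions, not the authors' system.

        # Minimal sketch (not the authors' system): score candidate links between
        # foreign-language words and WordNet synsets using a bilingual dictionary.
        # A synset accumulates more weight when several independent translations
        # of the word point to it, mirroring the idea of combining evidence.
        from nltk.corpus import wordnet as wn

        # Toy bilingual dictionary (assumed data): German word -> English translations.
        translations = {
            "Bank": ["bank", "bench"],
            "Hund": ["dog", "hound"],
        }

        def candidate_links(word):
            """Return (synset, score) pairs for `word`, best-scored first."""
            scores = {}
            for t in translations[word]:
                for syn in wn.synsets(t):
                    scores[syn] = scores.get(syn, 0.0) + 1.0 / len(translations[word])
            return sorted(scores.items(), key=lambda kv: -kv[1])

        if __name__ == "__main__":
            for syn, score in candidate_links("Hund")[:3]:
                print(f"{score:.2f}  {syn.name()}  {syn.definition()}")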