387 research outputs found

    Using WordNet for Building WordNets

    Full text link
    This paper summarises a set of methodologies and techniques for the fast construction of multilingual WordNets. The English WordNet is used in this approach as a backbone for Catalan and Spanish WordNets and as a lexical knowledge resource for several subtasks.Comment: 8 pages, postscript file. In workshop on Usage of WordNet in NL

    Towards a Universal Wordnet by Learning from Combined Evidenc

    Get PDF
    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification

    Combining Multiple Methods for the Automatic Construction of Multilingual WordNets

    Full text link
    This paper explores the automatic construction of a multilingual Lexical Knowledge Base from preexisting lexical resources. First, a set of automatic and complementary techniques for linking Spanish words collected from monolingual and bilingual MRDs to English WordNet synsets are described. Second, we show how resulting data provided by each method is then combined to produce a preliminary version of a Spanish WordNet with an accuracy over 85%. The application of these combinations results on an increment of the extracted connexions of a 40% without losing accuracy. Both coarse-grained (class level) and fine-grained (synset assignment level) confidence ratios are used and evaluated. Finally, the results for the whole process are presented.Comment: 7 pages, 4 postscript figure

    Linking a domain thesaurus to WordNet and conversion to WordNet-LMF

    Get PDF
    We present a methodology to link domain thesauri to general-domain lexica. This is applied in the framework of the KYOTO project to link the Species2000 thesaurus to the synsets of the English WordNet. Moreover, we study the formalisation of this thesaurus according to the ISO LMF standard and its dialect WordNet-LMF. This conversion will allow Species2000 to communicate with the other resources available in the KYOTO architecture.Peer ReviewedPostprint (published version

    Towards a universal index of meaning

    Get PDF
    The Inter-Lingual-Index (ILI) in the EuroWordNet architecture is an initially unstructured fund of concepts which functions as the link between the various language wordnets.The ILI concepts originate from WordNet1.5, and have been restructured on the basis of aspects of the internal structure of Word-Net,links between WordNet and other resources,and multilingual mapping between the wordnets. This leads to a differentiation of the status of ILI concepts,a reduction of the Wordnet polysemy,and a greater connectivity between the wordnets. The restructured ILI represents the first step towards a standardized set of word meanings,is a working platform for further development and testing,and can be put to use in NLP tasks such as (multilingual)information retrieval
    • …
    corecore