Using WordNet for Building WordNets
This paper summarises a set of methodologies and techniques for the fast
construction of multilingual WordNets. The English WordNet is used in this
approach as a backbone for Catalan and Spanish WordNets and as a lexical
knowledge resource for several subtasks. Comment: 8 pages, postscript file. In workshop on Usage of WordNet in NL
Building a free French wordnet from multilingual resources
This paper describes the automatic construction of a freely available wordnet for French (WOLF), based on Princeton WordNet (PWN) and built from various multilingual resources. Polysemous words were handled with an approach in which a parallel corpus for five languages was word-aligned and the extracted multilingual lexicon was disambiguated with the existing wordnets for these languages. For monosemous words, a bilingual approach sufficed to acquire equivalents; the bilingual lexicons were extracted from Wikipedia and thesauri. The results obtained from each resource were merged and ranked according to the number of resources yielding the same literal. The merged wordnet was evaluated automatically against the French WordNet (FREWN), and a sample of the generated synsets was also evaluated manually. The precision figures indicate that the approach is very promising, and applications that use the created wordnet are already planned.
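The merge-and-rank step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each resource contributes (synset ID, literal) pairs and that a literal's rank is simply the count of distinct resources proposing it. The function name, the resource names, and the synset ID `s1` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def merge_and_rank(candidates_by_resource):
    """Merge (synset_id, literal) pairs and rank literals by resource support.

    candidates_by_resource maps a resource name to an iterable of
    (synset_id, literal) pairs. This input shape is an assumption.
    """
    # Record which resources proposed each (synset, literal) pair.
    support = defaultdict(set)
    for resource, pairs in candidates_by_resource.items():
        for synset_id, literal in pairs:
            support[(synset_id, literal)].add(resource)

    # Group literals by synset together with their support count.
    ranked = defaultdict(list)
    for (synset_id, literal), resources in support.items():
        ranked[synset_id].append((literal, len(resources)))

    # Highest support first; ties broken alphabetically for determinism.
    for synset_id in ranked:
        ranked[synset_id].sort(key=lambda item: (-item[1], item[0]))
    return dict(ranked)


# Hypothetical input: three resources proposing French literals for one synset.
candidates = {
    "parallel_corpus": [("s1", "chien"), ("s1", "clebs")],
    "wikipedia": [("s1", "chien")],
    "thesaurus": [("s1", "chien"), ("s1", "cabot")],
}
ranking = merge_and_rank(candidates)
```

Counting distinct resources rather than raw occurrences keeps a single noisy resource from outvoting the others.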
Pattern-Based Acquisition of Scientific Entities from Scholarly Article Titles
We describe a rule-based approach for the automatic acquisition of salient scientific entities from Computational Linguistics (CL) scholarly article titles. Two observations motivated the approach: (i) authors note salient aspects of an article's contribution in its title; and (ii) the salient terms follow pattern regularities that can be expressed as a set of rules. Only those lexico-syntactic patterns were selected that were easily recognizable, occurred frequently, and positionally indicated a scientific entity type. The rules were developed on a collection of 50,237 CL titles covering all articles in the ACL Anthology. In total, 19,799 research problems, 18,111 solutions, 20,033 resources, 1,059 languages, 6,878 tools, and 21,687 methods were extracted at an average precision of 75%.
Creating a Hungarian Noun WordNet Ontology with Automatic Methods (Magyar főnévi WordNet-ontológia létrehozása automatikus módszerekkel)
This article presents the methods and latest results of ongoing work aimed at creating a Hungarian noun WordNet database. We present 9 different computational methods whose goal is the automated assignment of Hungarian nouns to the synsets of the English WordNet, version 1.6. The Hungarian nouns come from the vocabulary of an electronic Hungarian-English bilingual dictionary. To support the heuristic assignments, we extracted structural and semantic information from the machine-readable material of the monolingual Hungarian explanatory dictionary (Értelmező Kéziszótár), in addition to the bilingual dictionary. We estimated the accuracy of the results of the different processes using a manually disambiguated gold-standard set, and then produced the noun database from combinations of the validated result sets that exceeded various accuracy levels.
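The evaluation and combination step described in this abstract can be sketched roughly as follows. This is a simplified illustration and not the authors' procedure: it assumes each method outputs (Hungarian noun, synset ID) pairs, that the gold standard maps manually judged pairs to correct or incorrect, and that combining means taking the union of outputs from methods whose estimated precision reaches a chosen threshold. All names and example data are hypothetical.

```python
def estimate_precision(proposed, gold):
    """Estimate precision from the proposed pairs that were manually judged.

    gold maps (noun, synset_id) pairs to True (correct) or False (incorrect).
    Pairs absent from gold are ignored because they were never judged.
    """
    judged = [pair for pair in proposed if pair in gold]
    if not judged:
        return 0.0
    return sum(gold[pair] for pair in judged) / len(judged)


def combine_methods(method_outputs, gold, threshold):
    """Union the outputs of every method whose precision meets the threshold."""
    accepted = set()
    for pairs in method_outputs.values():
        if estimate_precision(pairs, gold) >= threshold:
            accepted.update(pairs)
    return accepted


# Hypothetical gold standard and outputs of two hypothetical methods.
gold = {
    ("kutya", "dog.n.01"): True,
    ("kutya", "hound.n.01"): False,
    ("macska", "cat.n.01"): True,
}
method_outputs = {
    "monosemous": {("kutya", "dog.n.01"), ("macska", "cat.n.01")},
    "variant": {("kutya", "dog.n.01"), ("kutya", "hound.n.01")},
}
database = combine_methods(method_outputs, gold, threshold=0.8)
```

Raising or lowering `threshold` trades coverage against expected accuracy, which is one way to read the "various accuracy levels" in the abstract.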