Več glav več ve (Many Heads Know More)

Abstract

This paper presents a project to clean the automatically generated semantic lexicon sloWNet. Errors introduced by incorrect automatic disambiguation of polysemous words were removed with sloWCrowd, a tool that collects multiple answers for each problematic literal from a broad pool of volunteer users. The task is designed as a web game in which users compete to collect the most points, that is, to contribute the most correct answers. Because the players are not trained lexicographers, their answers are not necessarily reliable. The tool therefore measures each user's accuracy and relies on the majority answer for each question. As a result, neither the occasional incorrect answers of otherwise reliable users nor any of the answers of unreliable users affect the final decision on whether a literal is deleted from the lexicon.
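The combination of per-user accuracy and majority voting summarised above can be illustrated with a minimal Python sketch. It is not the sloWCrowd implementation: the use of questions with known answers (`gold_answers`) to estimate accuracy, the `min_accuracy` threshold of 0.7, the `"keep"`/`"delete"` answer labels, and all function and user names are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def user_accuracy(gold_answers, responses):
    """Estimate each user's accuracy on questions with a known answer.

    gold_answers: dict mapping literal -> correct answer (assumed to exist)
    responses: iterable of (user, literal, answer) tuples
    """
    correct = Counter()
    total = Counter()
    for user, literal, answer in responses:
        if literal in gold_answers:
            total[user] += 1
            correct[user] += int(answer == gold_answers[literal])
    return {user: correct[user] / total[user] for user in total}


def majority_decisions(responses, accuracy, min_accuracy=0.7):
    """Majority answer per literal, counting only sufficiently accurate users.

    Users without an accuracy estimate are treated as unreliable.
    Literals whose vote ends in a tie get no decision.
    """
    votes = defaultdict(Counter)
    for user, literal, answer in responses:
        if accuracy.get(user, 0.0) >= min_accuracy:
            votes[literal][answer] += 1

    decisions = {}
    for literal, counts in votes.items():
        (top_answer, top_count), *rest = counts.most_common(2)
        if rest and rest[0][1] == top_count:
            continue  # tie: leave the literal undecided
        decisions[literal] = top_answer
    return decisions


# Hypothetical data: "bor" answers both gold questions incorrectly.
gold = {"gold1": "keep", "gold2": "delete"}
responses = [
    ("ana", "gold1", "keep"), ("ana", "gold2", "delete"),
    ("bor", "gold1", "delete"), ("bor", "gold2", "keep"),
    ("cene", "gold1", "keep"), ("cene", "gold2", "delete"),
    ("dana", "gold1", "keep"), ("dana", "gold2", "delete"),
    ("ana", "lit1", "delete"), ("bor", "lit1", "keep"),
    ("cene", "lit1", "delete"), ("dana", "lit1", "keep"),
    ("ana", "lit2", "keep"), ("cene", "lit2", "delete"),
]

accuracy = user_accuracy(gold, responses)
decisions = majority_decisions(responses, accuracy)
```

In this toy data, the vote of the unreliable user `bor` on `lit1` is ignored, so the single wrong answer from the otherwise reliable `dana` is outvoted, while `lit2` stays undecided because its two counted votes disagree.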

    Similar works