Word Sense Discrimination: A Gangplank Algorithm

Abstract

In this paper we present an unsupervised, graph-based clustering approach to Word Sense Discrimination, that is, to identifying and discriminating the different senses that a term can take within a text. Given a set of sentences, we derive a word co-occurrence graph and define on it a distance between nodes based on the Jaccard index. This distance is then used to cluster the neighbour nodes of ambiguous terms using the concept of “gangplanks”: edges that separate denser regions (“islands”) in the graph. The proposed approach has been evaluated on a real data set composed of tweets, showing promising performance in Word Sense Discrimination.
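The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact distance formula or the criterion that identifies a gangplank, so the details here are assumptions made only for illustration: the Jaccard index is computed on closed neighbourhoods with the ambiguous term excluded, and an edge is treated as a gangplank when its distance exceeds a fixed `threshold`. The function names, the threshold value and the toy sentences are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def cooccurrence_graph(sentences):
    """Map each word to the set of words it shares a sentence with."""
    graph = {}
    for sentence in sentences:
        words = set(sentence.lower().split())
        for a, b in combinations(words, 2):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def jaccard_distance(graph, u, v, exclude=frozenset()):
    """1 - Jaccard index of the closed neighbourhoods of u and v.

    Assumption: closed neighbourhoods (each node counts as its own
    neighbour), with the words in `exclude` ignored.
    """
    nu = (graph.get(u, set()) | {u}) - set(exclude)
    nv = (graph.get(v, set()) | {v}) - set(exclude)
    return 1.0 - len(nu & nv) / len(nu | nv)


def sense_clusters(graph, target, threshold=0.6):
    """Group the neighbours of `target` into candidate senses ("islands").

    Assumption: an edge between two neighbours is treated as a gangplank,
    and therefore not crossed, when its distance exceeds `threshold`.
    This is a simple stand-in, not the criterion used in the paper.
    """
    neighbours = set(graph.get(target, ()))
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        island, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in island:
                continue
            island.add(word)
            for other in graph[word] & neighbours:
                close = jaccard_distance(graph, word, other, {target}) <= threshold
                if other not in island and close:
                    stack.append(other)
        seen |= island
        clusters.append(island)
    return clusters


# Toy, pre-tokenised "tweets" (hypothetical data, for illustration only).
sentences = [
    "bank river water",
    "bank river shore",
    "bank money loan",
    "bank money account",
    "river money",  # the only link between the two candidate senses
]
graph = cooccurrence_graph(sentences)
clusters = sense_clusters(graph, "bank")
print(clusters)
```

In this toy input the edge between "river" and "money" is the only connection between the two groups of neighbours of "bank". Its distance is above the assumed threshold, so it is not crossed and the neighbours fall into two islands, one per candidate sense.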
