
    One Sense per Context Cluster: Improving Word Sense Disambiguation Using Web-Scale Phrase Clustering

    The performance of word sense disambiguation is still limited by lexical context matching because of data sparsity. In this paper we present a simple but effective method that incorporates web-scale phrase clustering results for context matching. This method is able to capture some semantic relations that are not in WordNet. Without using any additional labeled data, this new approach obtained 2.11% to 6.92% higher accuracy than a typical supervised classifier

    Approche supervisée à base de cellules LSTM bidirectionnelles pour la désambiguïsation lexicale (LSTM Based Supervised Approach for Word Sense Disambiguation)

    In word sense disambiguation, neural networks are still rarely used, and their use is very recent. This direction is nonetheless very promising: the first such systems have consistently ranked at the top of evaluation campaigns, and there still appears to be considerable room for improvement. In this paper we present a new neural network architecture for word sense disambiguation. Our system is less complex to train than existing neural systems, and it obtains state-of-the-art results on most English evaluation tasks. The focus is on the reproducibility of our system and results, through the use of a freely accessible word embeddings model, training corpora, and evaluation corpora. Keywords: word sense disambiguation, supervised approach, LSTM, neural network

    Embeddings for word sense disambiguation: an evaluation study

    Recent years have seen a dramatic growth in the popularity of word embeddings mainly owing to their ability to capture semantic information from massive amounts of textual content. As a result, many tasks in Natural Language Processing have tried to take advantage of the potential of these distributional models. In this work, we study how word embeddings can be used in Word Sense Disambiguation, one of the oldest tasks in Natural Language Processing and Artificial Intelligence. We propose different methods through which word embeddings can be leveraged in a state-of-the-art supervised WSD system architecture, and perform a deep analysis of how different parameters affect performance. We show how a WSD system that makes use of word embeddings alone, if designed properly, can provide significant performance improvement over a state-of-the-art WSD system that incorporates several standard WSD features

    Huge automatically extracted training sets for multilingual Word Sense Disambiguation

    We release to the community six large-scale sense-annotated datasets in multiple languages to pave the way for supervised multilingual Word Sense Disambiguation. Our datasets cover all the nouns in the English WordNet and their translations in other languages, for a total of millions of sense-tagged sentences. Experiments prove that these corpora can be effectively used as training sets for supervised WSD systems, surpassing the state of the art for low-resource languages and providing competitive results for English, where manually annotated training sets are accessible. The data is available at trainomatic.org