668 research outputs found

    Learning to Learn to Disambiguate: Meta-Learning for Few-Shot Word Sense Disambiguation

    Get PDF
    The success of deep learning methods hinges on the availability of large training datasets annotated for the task of interest. In contrast to human intelligence, these methods lack versatility and struggle to learn and adapt quickly to new tasks, where labeled data is scarce. Meta-learning aims to solve this problem by training a model on a large number of few-shot tasks, with an objective to learn new tasks quickly from a small number of examples. In this paper, we propose a meta-learning framework for few-shot word sense disambiguation (WSD), where the goal is to learn to disambiguate unseen words from only a few labeled instances. Meta-learning approaches have so far been typically tested in an NN-way, KK-shot classification setting where each task has NN classes with KK examples per class. Owing to its nature, WSD deviates from this controlled setup and requires the models to handle a large number of highly unbalanced classes. We extend several popular meta-learning approaches to this scenario, and analyze their strengths and weaknesses in this new challenging setting.Comment: Added additional experiment

    Generationary or “How We Went beyond Word Sense Inventories and Learned to Gloss”

    Get PDF
    Mainstream computational lexical semantics embraces the assumption that word senses can be represented as discrete items of a predefined inventory. In this paper we show this needs not be the case, and propose a unified model that is able to produce contextually appropriate definitions. In our model, Generationary, we employ a novel span-based encoding scheme which we use to fine-tune an English pre-trained Encoder-Decoder system to generate glosses. We show that, even though we drop the need of choosing from a predefined sense inventory, our model can be employed effectively: not only does Generationary outperform previous approaches in the generative task of Definition Modeling in many settings, but it also matches or surpasses the state of the art in discriminative tasks such as Word Sense Disambiguation and Word-inContext. Finally, we show that Generationary benefits from training on data from multiple inventories, with strong gains on various zeroshot benchmarks, including a novel dataset of definitions for free adjective-noun phrases. The software and reproduction materials are available at http://generationary.org

    Delving into the uncharted territories of Word Sense Disambiguation

    Get PDF
    The automatic disambiguation of word senses, i.e. Word Sense Disambiguation, is a long-standing task in the field of Natural Language Processing; an AI-complete problem that took its first steps more than half a century ago, and which, to date, has apparently attained human-like performances on standard evaluation benchmarks. Unfortunately, the steady evolution that the task experienced over time in terms of sheer performance has not been followed hand in hand by adequate theoretical support, nor by careful error analysis. Furthermore, we believe that the lack of an exhaustive bird’s eye view which accounts for the sort of high-end and unrealistic computational architectures that systems will soon need in order to further refine their performances could lead the field to a dead angle in a few years. In essence, taking advantage of the current moment of great accomplishments and renewed interest in the task, we argue that Word Sense Disambiguation is mature enough for researchers to really observe the extent of the results hitherto obtained, evaluate what is actually missing, and answer the much sought for question: “are current state-of-the-art systems really able to effectively solve lexical ambiguity?” Driven by the desire to become both architects and participants in this period of pondering, we have identified a few macro-areas representatives of the challenges of automatic disambiguation. From this point of view, in this thesis, we propose experimental solutions and empirical tools so as to bring to the attention of the Word Sense Disambiguation community unusual and unexplored points of view. We hope these will represent a new perspective through which to best observe the current state of disambiguation, as well as to foresee future paths for the task to evolve on. Specifically, 1q) prompted by the growing concern about the rise in performance being closely linked to the demand for more and more unrealistic computational architectures in all areas of application of Deep Learning related techniques, we 1a) provide evidence for the undisclosed potential of approaches based on knowledge-bases, via the exploitation of syntagmatic information. Moreover, 2q) driven by the dissatisfaction with the use of cognitively-inaccurate, finite inventories of word senses in Word Sense Disambiguation, we 2a) introduce an approach based on Definition Modeling paradigms to generate contextual definitions for target words and phrases, hence going beyond the limits set by specific lexical-semantic inventories. Finally, 3q) moved by the desire to analyze the real implications beyond the idea of “machines performing disambiguation on par with their human counterparts” we 3a) put forward a detailed analysis of the shared errors affecting current state-of-the-art systems based on diverse approaches for Word Sense Disambiguation, and highlight, by means of a novel evaluation dataset tailored to represent common and critical issues shared by all systems, performances way lower than those usually reported in the current literature

    Word sense extension

    Full text link
    Humans often make creative use of words to express novel senses. A long-standing effort in natural language processing has been focusing on word sense disambiguation (WSD), but little has been explored about how the sense inventory of a word may be extended toward novel meanings. We present a paradigm of word sense extension (WSE) that enables words to spawn new senses toward novel context. We develop a framework that simulates novel word sense extension by first partitioning a polysemous word type into two pseudo-tokens that mark its different senses, and then inferring whether the meaning of a pseudo-token can be extended to convey the sense denoted by the token partitioned from the same word type. Our framework combines cognitive models of chaining with a learning scheme that transforms a language model embedding space to support various types of word sense extension. We evaluate our framework against several competitive baselines and show that it is superior in predicting plausible novel senses for over 7,500 English words. Furthermore, we show that our WSE framework improves performance over a range of transformer-based WSD models in predicting rare word senses with few or zero mentions in the training data
    • …
    corecore