    Substitution-based Semantic Change Detection using Contextual Embeddings

    Measuring semantic change has thus far remained a task where methods using contextual embeddings have struggled to improve upon simpler techniques relying only on static word vectors. Moreover, many of the previously proposed approaches suffer from downsides related to scalability and ease of interpretation. We present a simplified approach to measuring semantic change using contextual embeddings, relying only on the most probable substitutes for masked terms. Not only is this approach directly interpretable, it is also far more efficient in terms of storage, achieves superior average performance across the most frequently cited datasets for this task, and allows for more nuanced investigation of change than is possible with static word vectors.
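
    The core idea can be illustrated with a short sketch: mask each occurrence of a target word, collect the most probable substitutes from a masked language model, and compare the substitute distributions obtained from two time periods. The snippet below is a minimal illustration assuming the Hugging Face transformers fill-mask pipeline; the model choice, top_k, and the Jensen-Shannon comparison are illustrative assumptions, not necessarily the paper's exact setup.

# Minimal sketch: substitute distributions from a masked LM, compared across
# two corpora. Model name, top_k, and the JS comparison are illustrative.
from collections import Counter

from scipy.spatial.distance import jensenshannon
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def substitute_distribution(sentences, target, top_k=10):
    """Aggregate the most probable substitutes for `target` over many sentences."""
    counts = Counter()
    for sentence in sentences:
        masked = sentence.replace(target, fill.tokenizer.mask_token, 1)
        if fill.tokenizer.mask_token not in masked:
            continue  # target did not occur in this sentence
        for prediction in fill(masked, top_k=top_k):
            counts[prediction["token_str"].strip()] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}

def change_score(dist_old, dist_new):
    """Jensen-Shannon distance between substitute distributions of two periods."""
    vocab = sorted(set(dist_old) | set(dist_new))
    p = [dist_old.get(word, 0.0) for word in vocab]
    q = [dist_new.get(word, 0.0) for word in vocab]
    return jensenshannon(p, q)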

    Investigations into the value of labeled and unlabeled data in biomedical entity recognition and word sense disambiguation

    Human annotations, especially in highly technical domains, are expensive and time-consuming to gather, and can also be erroneous. As a result, we never have sufficiently accurate data to train and evaluate supervised methods. In this thesis, we address this problem by taking a semi-supervised approach to biomedical named entity recognition (NER), and by proposing an inventory-independent evaluation framework for supervised and unsupervised word sense disambiguation. Our contributions are as follows: We introduce a novel graph-based semi-supervised approach to NER and exploit pre-trained contextualized word embeddings in several biomedical NER tasks. We propose a new evaluation framework for word sense disambiguation that permits a fair comparison between supervised methods trained on different sense inventories as well as unsupervised methods without a fixed sense inventory.
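
    As a rough illustration of the graph-based semi-supervised idea, the sketch below propagates a few gold entity labels to unlabeled tokens over a k-nearest-neighbour graph built from contextual token embeddings, using scikit-learn's LabelSpreading; the graph construction and hyperparameters are illustrative assumptions, not the thesis's exact method.

# Minimal sketch: propagate sparse entity labels over a kNN graph built from
# contextual token embeddings. Hyperparameters are illustrative.
import numpy as np
from sklearn.semi_supervised import LabelSpreading

def propagate_labels(token_embeddings, labels, n_neighbors=7):
    """
    token_embeddings: (n_tokens, dim) array of contextual token embeddings.
    labels: length n_tokens array of integer entity labels, -1 for unlabeled tokens.
    Returns an induced label for every token.
    """
    model = LabelSpreading(kernel="knn", n_neighbors=n_neighbors, max_iter=50)
    model.fit(token_embeddings, labels)
    return model.transduction_

# Toy example: two labeled tokens and three unlabeled ones.
X = np.random.RandomState(0).randn(5, 768)
y = np.array([1, 0, -1, -1, -1])
print(propagate_labels(X, y, n_neighbors=2))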

    GeneSis: A Generative Approach to Substitutes in Context

    The lexical substitution task aims at generating a list of suitable replacements for a target word in context, ideally keeping the meaning of the modified text unchanged. While its usage has increased in recent years, the paucity of annotated data prevents the fine-tuning of neural models on the task, hindering the full fruition of recently introduced powerful architectures such as language models. Furthermore, lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary, making it impossible to credit appropriate, but out-of-vocabulary, substitutes. To address these issues, we propose GENESIS (Generating Substitutes in contexts), the first generative approach to lexical substitution. Thanks to a seq2seq model, we generate substitutes for a word according to the context it appears in, attaining state-of-the-art results on different benchmarks. Moreover, our approach allows silver data to be produced for further improving the performance of lexical substitution systems. Along with an extensive analysis of GENESIS results, we also present a human evaluation of the generated substitutes in order to assess their quality. We release the fine-tuned models, the generated datasets and the code to reproduce the experiments at https://github.com/SapienzaNLP/genesis
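
    In practice, a generative substitution system of this kind can be driven through a standard seq2seq interface. The sketch below assumes a BART-style checkpoint already fine-tuned for lexical substitution; the model path and the way the target is marked in the input are placeholders, not the released GeneSis configuration.

# Minimal sketch: generate substitutes with a seq2seq model fine-tuned for the
# task. The checkpoint path and target-marking convention are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_PATH = "path/to/finetuned-substitution-model"  # placeholder, not the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_PATH)

def generate_substitutes(context, target, num_candidates=10):
    """Return a ranked, de-duplicated list of substitutes for `target` in `context`."""
    source = context.replace(target, f"<t> {target} </t>", 1)
    inputs = tokenizer(source, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        num_beams=num_candidates,
        num_return_sequences=num_candidates,
        max_new_tokens=8,
    )
    seen, ranked = set(), []
    for candidate in tokenizer.batch_decode(outputs, skip_special_tokens=True):
        candidate = candidate.strip()
        if candidate and candidate.lower() != target.lower() and candidate not in seen:
            seen.add(candidate)
            ranked.append(candidate)
    return ranked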

    Unsupervised POS induction with word embeddings

    Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored. In this paper, we show that embeddings can likewise add value to the problem of unsupervised POS induction. In two representative models of POS induction, we replace multinomial distributions over the vocabulary with multivariate Gaussian distributions over word embeddings and observe consistent improvements in eight languages. We also analyze the effect of various choices while inducing word embeddings on "downstream" POS induction results.
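
    The modelling change can be sketched with an off-the-shelf HMM whose emission distributions are multivariate Gaussians over pre-computed word embeddings. The example below uses hmmlearn's GaussianHMM as a stand-in for the paper's models; the number of induced tags and the covariance type are illustrative assumptions.

# Minimal sketch: an HMM with multivariate Gaussian emissions over word
# embeddings, as a stand-in for multinomial emissions over the vocabulary.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def induce_pos(sentence_embeddings, n_tags=12):
    """
    sentence_embeddings: list of (sentence_length, dim) arrays of word embeddings.
    Fits the HMM on all sentences and returns one induced tag id per token.
    """
    X = np.vstack(sentence_embeddings)
    lengths = [len(s) for s in sentence_embeddings]
    hmm = GaussianHMM(n_components=n_tags, covariance_type="diag", n_iter=20)
    hmm.fit(X, lengths)
    return [hmm.predict(s) for s in sentence_embeddings]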

    Word meaning in context: a probabilistic model and its application to question answering

    The need for assessing similarity in meaning is central to most language technology applications. Distributional methods are robust, unsupervised methods which achieve high performance on this task. These methods measure the similarity of word types solely based on patterns of word occurrence in large corpora, following the intuition that similar words occur in similar contexts. As most Natural Language Processing (NLP) applications deal with disambiguated words, that is, words occurring in context rather than word types, the question of adapting distributional methods to compute sense-specific or context-sensitive similarities has gained increasing attention in recent work. This thesis focuses on the development and application of distributional methods for context-sensitive similarity. The contribution is twofold: the main part of the thesis proposes and tests a new framework for computing similarity in context, while the second part investigates the application of distributional paraphrasing to the task of question answering.