249 research outputs found

    Investigations into the value of labeled and unlabeled data in biomedical entity recognition and word sense disambiguation

    Human annotations, especially in highly technical domains, are expensive and time-consuming to gather, and can also be erroneous. As a result, we never have sufficiently accurate data to train and evaluate supervised methods. In this thesis, we address this problem by taking a semi-supervised approach to biomedical named entity recognition (NER), and by proposing an inventory-independent evaluation framework for supervised and unsupervised word sense disambiguation. Our contributions are as follows: We introduce a novel graph-based semi-supervised approach to NER and exploit pre-trained contextualized word embeddings in several biomedical NER tasks. We propose a new evaluation framework for word sense disambiguation that permits a fair comparison between supervised methods trained on different sense inventories as well as unsupervised methods without a fixed sense inventory.

    LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

    Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information, simply as a product of self-supervision. Prior work has shown that these contextual representations can be used to accurately represent large sense inventories as sense embeddings, to the extent that a distance-based solution to Word Sense Disambiguation (WSD) tasks outperforms models trained specifically for the task. Still, there remains much to understand on how to use these Neural Language Models (NLMs) to produce sense embeddings that can better harness each NLM's meaning representation abilities. In this work we introduce a more principled approach to leverage information from all layers of NLMs, informed by a probing analysis on 14 NLM variants. We also emphasize the versatility of these sense embeddings in contrast to task-specific models, applying them on several sense-related tasks, besides WSD, while demonstrating improved performance using our proposed approach over prior work focused on sense embeddings. Finally, we discuss unexpected findings regarding layer and model performance variations, and potential applications for downstream tasks. Comment: Accepted to Artificial Intelligence Journal (AIJ).

    A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings

    First Online: 10 September 2018. Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning’s frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym–associate pairs. This database, publicly available at www.blairarmstrong.net/homonymnorms/, constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data.

    WRITTEN CORRECTIVE FEEDBACK: EFFECTS OF FOCUSED AND UNFOCUSED GRAMMAR CORRECTION ON THE CASE ACQUISITION IN L2 GERMAN

    Thirty-three students of fourth-semester German at the University of Kansas participated in a study investigating whether focused written corrective feedback (WCF) promoted the acquisition of German case morphology over the course of a semester. Participants received teacher WCF on five two-draft essay assignments under three treatment conditions: group (1) received focused WCF on German case errors; group (2) received unfocused WCF on a variety of German grammar errors; and group (3) did not receive WCF on specific grammar errors. Combining quantitative and qualitative analyses, the study found that the focused group improved significantly in the accuracy of case forms, while the unfocused and control groups did not make any apparent progress. The results indicate that focused WCF was effective in improving case accuracy in subjects' writing in a German as a foreign language (GFL) context. WCF did not negatively affect writing fluency or students' attitude toward writing.

    From archive to corpus: transcription and annotation in the creation of signed language corpora

    PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200

    New frontiers in supervised word sense disambiguation: building multilingual resources and neural models on a large scale

    Word Sense Disambiguation is a long-standing task in Natural Language Processing (NLP), lying at the core of human language understanding. While it has already been studied from many different angles over the years, ranging from knowledge-based systems to semi-supervised and fully supervised models, the field seems to be slowing down with respect to other NLP tasks, e.g., part-of-speech tagging and dependency parsing. Despite the organization of several international competitions aimed at evaluating Word Sense Disambiguation systems, the evaluation of automatic systems has been problematic, mainly due to the lack of a reliable evaluation framework for performing a direct quantitative comparison. To this end we develop a unified evaluation framework and analyze the performance of various Word Sense Disambiguation systems in a fair setup. The results show that supervised systems clearly outperform knowledge-based models. Among the supervised systems, a linear classifier trained on conventional local features still proves to be a hard baseline to beat. Nonetheless, recent approaches exploiting neural networks on unlabeled corpora achieve promising results, surpassing this hard baseline in most test sets. Even though supervised systems tend to perform best in terms of accuracy, they often lose ground to more flexible knowledge-based solutions, which do not require training for every disambiguation target. To bridge this gap we adopt a different perspective and rely on sequence learning to frame the disambiguation problem: we propose and study in depth a series of end-to-end neural architectures directly tailored to the task, from bidirectional Long Short-Term Memory to encoder-decoder models. Our extensive evaluation over standard benchmarks and in multiple languages shows that sequence learning enables more versatile all-words models that consistently lead to state-of-the-art results, even against models trained with engineered features.
However, supervised systems need annotated training corpora, and the few available to date are of limited size: this is mainly due to the expensive and time-consuming process of annotating a wide variety of word senses at a reasonably high scale, i.e., the so-called knowledge acquisition bottleneck. To address this issue, we also present different strategies to automatically acquire high-quality sense-annotated data in multiple languages, without any manual effort. We assess the quality of the sense annotations both intrinsically and extrinsically, achieving competitive results on multiple tasks.

    Intonation in Language Acquisition - Evidence from German

    This dissertation studies the role of intonation in language acquisition. After a general introduction about the phonetic and phonological aspects of intonation and its different forms and functions within language, two different models of language acquisition and the role of intonation within these two models will be presented. Following this, I will present and discuss empirical data on the question of whether young German-learning children use intonation in order to acquire language. Two comprehension studies will be presented. Here, I concentrate on the question of whether children understand the referential function of intonation and whether they can use this knowledge in order to learn new words. Additionally, I will present empirical evidence that focuses on the question of whether children use intonation in resolving participant roles in complex syntactic constructions as well as in resolving syntactic ambiguities. Finally, I will present two production studies that investigate the prosodic realization of target referents that have different informational statuses within a discourse, by both young children and parents talking to their children. Overall, the data from these studies suggest that language-learning children do use the intonational form of an utterance from early on in order to understand another's intention. Young language-learning children do understand that a certain intonational form conveys a function. Additionally, the studies presented in this thesis suggest that children also use intonation in order to convey their own communicative intentions. Thus, intonation is an important instrument for young children's language acquisition, as they use the information that is provided by intonation not only to learn words and to combine them into syntactic constructions, but also for the understanding of paralinguistic properties of language.
The findings of the studies presented in this thesis are discussed with regard to different theories of language acquisition. Additionally, I will give insight into the understanding of the development of young children's use of intonation.

    Harnessing sense-level information for semantically augmented knowledge extraction

    Nowadays, building accurate computational models for the semantics of language lies at the very core of Natural Language Processing and Artificial Intelligence. A first and foremost step in this respect consists in moving from word-based to sense-based approaches, in which operating explicitly at the level of word senses enables a model to produce more accurate and unambiguous results. At the same time, word senses create a bridge towards structured lexico-semantic resources, where the vast amount of available machine-readable information can help overcome the shortage of annotated data in many languages and domains of knowledge. This latter phenomenon, known as the knowledge acquisition bottleneck, is a crucial problem that hampers the development of large-scale, data-driven approaches for many Natural Language Processing tasks, especially when lexical semantics is directly involved. One of these tasks is Information Extraction, where an effective model has to cope with data sparsity, as well as with lexical ambiguity that can arise at the level of both arguments and relational phrases. Even in more recent Information Extraction approaches where semantics is implicitly modeled, these issues have not yet been addressed in their entirety. On the other hand, however, having access to explicit sense-level information is a very demanding task on its own, which can rarely be performed with high accuracy on a large scale. With this in mind, in this thesis we will tackle a two-fold objective: our first focus will be on studying fully automatic approaches to obtain high-quality sense-level information from textual corpora; then, we will investigate in depth where and how such sense-level information has the potential to enhance the extraction of knowledge from open text.
In the first part of this work, we will explore three different disambiguation scenarios (semi-structured text, parallel text, and definitional text) and devise automatic disambiguation strategies that are not only capable of scaling to different corpus sizes and different languages, but that actually take advantage of a multilingual and/or heterogeneous setting to improve and refine their performance. As a result, we will obtain three sense-annotated resources that, when tested experimentally with a baseline system in a series of downstream semantic tasks (i.e. Word Sense Disambiguation, Entity Linking, Semantic Similarity), show very competitive performance on standard benchmarks against both manual and semi-automatic competitors. In the second part we will instead focus on Information Extraction, with an emphasis on Open Information Extraction (OIE), where issues like sparsity and lexical ambiguity are especially critical, and study how best to exploit sense-level information within the extraction process. We will start by showing that enforcing a deeper semantic analysis in a definitional setting enables a full-fledged extraction pipeline to compete with state-of-the-art approaches based on much larger (but noisier) data. We will then demonstrate how working at the sense level at the end of an extraction pipeline is also beneficial: indeed, by leveraging sense-based techniques, very heterogeneous OIE-derived data can be aligned semantically, and unified with respect to a common sense inventory. Finally, we will briefly shift the focus to the more constrained setting of hypernym discovery, and study a sense-aware supervised framework for the task that is robust and effective, even when trained on heterogeneous OIE-derived hypernymic knowledge.