
    Analysis of Tweets for Social Media Health Applications

    Get PDF
    Social networking sites like Twitter give people a platform to connect with each other, to discuss and share information and news, or to entertain themselves. As the number of users continues to grow, the data these users generate has grown explosively. Such a vast data source offers researchers a way to study and monitor public health. Accurately analyzing tweets is difficult, mainly because of their short length, inventive spellings, and creative language expressions. Rather than focusing at the topic level, identifying tweets that mention personal health experiences is more helpful to researchers, governments, and other organizations. Another important limitation of current systems for social media health applications is the use of a disease-specific model and dataset to study a particular disease. Identifying adverse drug reactions is an important part of the drug development process; detecting and extracting adverse drug mentions in tweets can supplement the list of adverse drug reactions obtained from drug trials and can help improve the drugs. This thesis addresses these two challenges and proposes three systems: a generalizable system to identify personal health experience mentions across different disease domains, a system for automatic classification of adverse effect mentions in tweets, and a system to extract adverse drug mentions from tweets. The proposed systems use transfer learning from language models to achieve notable scores on the Social Media Mining for Health Applications (SMM4H) 2019 shared tasks (Weissenbacher et al. 2019).
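    A minimal sketch of the transfer-learning setup this abstract describes: fine-tuning a pretrained language model as a tweet classifier. The model name, label scheme, and example tweets are illustrative assumptions, not the thesis's actual configuration.

    ```python
    # Sketch: transfer learning from a pretrained language model for
    # tweet classification (e.g., adverse-effect mentions). Model name,
    # labels, and example tweets are illustrative, not from the thesis.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2  # 0: no mention, 1: adverse-effect mention
    )

    tweets = ["this new med gave me a killer headache", "lovely weather today"]
    labels = torch.tensor([1, 0])

    # One fine-tuning step: the pretrained encoder is updated together
    # with the freshly initialized classification head.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    ```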

    Bi-Encoders based Species Normalization -- Pairwise Sentence Learning to Rank

    Full text link
    Motivation: Biomedical named-entity normalization involves linking biomedical entities to distinct database identifiers in order to facilitate data integration across various fields of biology. Existing systems for biomedical named-entity normalization rely heavily on dictionaries, manually created rules, and high-quality representative features such as lexical or morphological characteristics. Recent research has investigated neural network-based models to reduce this dependence on dictionaries, hand-crafted rules, and features. Despite these advancements, the performance of such models is still limited by the lack of sufficiently large training datasets: they tend to overfit small training corpora and generalize poorly to previously unseen entities, necessitating the redesign of rules and features. Contribution: We present a novel deep learning approach for named-entity normalization, treating it as a pairwise learning-to-rank problem. Our method uses the widely used information retrieval algorithm Best Match 25 (BM25) to generate candidate concepts, followed by Bidirectional Encoder Representations from Transformers (BERT) to re-rank the candidate list. Notably, our approach eliminates the need for feature engineering or rule creation. We conduct experiments on species entity types and evaluate our method against state-of-the-art techniques on the LINNAEUS and S800 biomedical corpora. Our proposed approach surpasses existing methods in linking entities to the NCBI taxonomy. To the best of our knowledge, there is no prior neural network-based approach for species normalization in the literature.
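    A compact sketch of the two-stage pipeline: BM25 generates candidate concepts, and a BERT-style model re-ranks them. The libraries (rank_bm25, sentence-transformers), the pretrained cross-encoder, and the toy taxonomy entries are assumptions for illustration, not the paper's exact setup.

    ```python
    # Stage 1: BM25 retrieves candidate concepts for a mention.
    # Stage 2: a BERT-style cross-encoder re-ranks the candidates.
    from rank_bm25 import BM25Okapi
    from sentence_transformers import CrossEncoder

    concepts = ["Homo sapiens", "Mus musculus", "Rattus norvegicus"]
    bm25 = BM25Okapi([c.lower().split() for c in concepts])

    mention = "house mouse"
    candidates = bm25.get_top_n(mention.lower().split(), concepts, n=2)

    # Score (mention, candidate) pairs with a pretrained cross-encoder;
    # the model name is an illustrative choice, not the paper's.
    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = reranker.predict([(mention, c) for c in candidates])
    best = candidates[scores.argmax()]
    print(best)
    ```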

    Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP

    Get PDF
    Transfer learning, particularly approaches that combine multi-task learning with pre-trained contextualized embeddings and fine-tuning, has advanced the field of Natural Language Processing tremendously in recent years. In this paper we present MaChAmp, a toolkit for easy fine-tuning of contextualized embeddings in multi-task settings. The benefits of MaChAmp are its flexible configuration options and its support for a variety of natural language processing tasks in a uniform toolkit, from text classification and sequence labeling to dependency parsing, masked language modeling, and text generation.
    Comment: https://machamp-nlp.github.io
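    To make the multi-task setting concrete, here is a conceptual sketch of what such a toolkit automates: a single pretrained encoder shared across tasks, with one lightweight head per task. This is not MaChAmp's actual API; class names and head dimensions are illustrative.

    ```python
    # Conceptual sketch of multi-task fine-tuning: one shared pretrained
    # encoder, separate task-specific heads. Not MaChAmp's real API.
    import torch.nn as nn
    from transformers import AutoModel

    class MultiTaskModel(nn.Module):
        def __init__(self, encoder_name="bert-base-multilingual-cased"):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(encoder_name)
            hidden = self.encoder.config.hidden_size
            # One head per task; both heads share the encoder's parameters.
            self.classifier = nn.Linear(hidden, 2)   # text classification
            self.tagger = nn.Linear(hidden, 17)      # sequence labeling (e.g., UPOS)

        def forward(self, task, **batch):
            states = self.encoder(**batch).last_hidden_state
            if task == "classification":
                return self.classifier(states[:, 0])  # [CLS] token
            return self.tagger(states)                # per-token logits
    ```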

    Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-10)

    Full text link

    A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

    Full text link
    Large language models (LLMs) are a special class of pretrained language models obtained by scaling up model size, pretraining corpus, and computation. Because of their large size and pretraining on large volumes of text data, LLMs exhibit special abilities that allow them to achieve remarkable performance on many natural language processing tasks without any task-specific training. The era of LLMs started with OpenAI's GPT-3 model, and the popularity of LLMs has grown rapidly since the introduction of models like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT-4, as GPT-3 family large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey that summarizes the recent research progress in multiple dimensions and can guide the research community with insightful future research directions. We start the survey with foundation concepts like transformers, transfer learning, self-supervised learning, pretrained language models, and large language models. We then present a brief overview of GLLMs and discuss their performance in various downstream tasks, specific domains, and multiple languages. We also discuss the data labelling and data augmentation abilities of GLLMs, the robustness of GLLMs, the effectiveness of GLLMs as evaluators, and finally conclude with multiple insightful future research directions. In summary, this comprehensive survey will serve as a good resource for both academics and industry practitioners to stay updated with the latest research on GPT-3 family large language models.
    Comment: Preprint under review, 58 pages
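    As an illustration of the "no task-specific training" ability the survey highlights, the sketch below solves a sentiment classification task purely by prompting a GLLM. The client usage follows the standard OpenAI Python library, but the model name, prompt, and example are assumptions, and an API key is presumed to be configured in the environment.

    ```python
    # Zero-shot classification by prompting alone: no fine-tuning, no
    # labeled training data. Model name and prompt are illustrative.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    review = "The battery dies after an hour, very disappointing."
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Classify the sentiment of this review as "
                       f"'positive' or 'negative':\n{review}",
        }],
    )
    print(response.choices[0].message.content)  # expected: negative
    ```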

    Sentiment polarity shifters : creating lexical resources through manual annotation and bootstrapped machine learning

    Get PDF
    Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like "alleviate" and "abandon" affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focused almost exclusively on a small handful of closed-class negation words, such as "not", "no" and "without". A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them.

    We seek to remedy this lack of shifter resources. Our most central step towards this is the creation of a large lexicon of polarity shifters that covers verbs, nouns and adjectives. To reduce the prohibitive cost of such a large annotation task, we develop a bootstrapping approach that combines automatic classification with human verification (see the sketch after this entry). This ensures the high quality of our lexicon while reducing annotation cost by over 70%. In designing the bootstrap classifier we develop a variety of features which use both existing semantic resources and linguistically informed text patterns. In addition, we investigate how knowledge about polarity shifters might be shared across different parts of speech, highlighting both the potential and limitations of such an approach.

    The applicability of our bootstrapping approach extends beyond the creation of a single resource. We show how it can further be used to introduce polarity shifter resources for other languages. Through the example case of German we show that all our features are transferable to other languages. Keeping in mind the requirements of under-resourced languages, we also explore how well a classifier would do when relying only on data-driven, but not resource-driven, features. We also introduce ways to use cross-lingual information, leveraging the shifter resources we previously created for other languages.

    Apart from the general question of which words can be polarity shifters, we also explore a number of other factors. One of these is the matter of shifting direction, which indicates whether a shifter affects positive polarities, negative polarities, or can shift in either direction. Using a supervised classifier we add shifting direction information to our bootstrapped lexicon. For other aspects of polarity shifting, manual annotation is preferable to automatic classification. Not every word that can cause polarity shifting does so in every one of its word senses. As word sense disambiguation technology is not robust enough to allow the automatic handling of such nuances, we manually create a complete sense-level annotation of verbal polarity shifters.

    To verify the usefulness of the lexica we create, we provide an extrinsic evaluation in which we apply them to a sentiment analysis task. In this task the different lexica are not only compared amongst each other, but also against a state-of-the-art compositional polarity neural network classifier that has been shown to implicitly learn the negating effect of negation words from a training corpus. We find that the same is not true for the far more lexically diverse polarity shifters; instead, the explicit knowledge provided by our shifter lexica brings clear gains in performance.
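    A hedged sketch of the bootstrapping idea referenced above: an automatic classifier ranks candidate words, and only the most promising ones are sent for human verification before being added to the lexicon. The feature function, classifier choice, and verify() helper are hypothetical stand-ins, not the thesis's actual components.

    ```python
    # Bootstrapping loop: automatic classification + human verification.
    # featurize() and verify() are hypothetical stand-ins.
    from sklearn.linear_model import LogisticRegression

    def verify(word):
        """Hypothetical human-verification step (e.g., an annotation UI)."""
        return input(f"Is '{word}' a polarity shifter? [y/n] ") == "y"

    def bootstrap(seed_lexicon, candidates, featurize, rounds=3, top_k=50):
        lexicon = dict(seed_lexicon)  # word -> True (shifter) / False
        for _ in range(rounds):
            clf = LogisticRegression(max_iter=1000)
            words = list(lexicon)
            clf.fit([featurize(w) for w in words], [lexicon[w] for w in words])
            # Rank unlabeled candidates by predicted shifter probability ...
            pool = [w for w in candidates if w not in lexicon]
            pool.sort(key=lambda w: clf.predict_proba([featurize(w)])[0][1],
                      reverse=True)
            # ... and let a human verify only the most promising ones.
            for word in pool[:top_k]:
                lexicon[word] = verify(word)
        return {w for w, is_shifter in lexicon.items() if is_shifter}
    ```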

    Incremental Coreference Resolution for German

    Full text link
    The main contributions of this thesis are as follows:

    1. We introduce a general model for coreference and explore its application to German.
       • The model features an incremental discourse processing algorithm, which allows it to coherently address issues caused by the underspecification of mentions, an especially pressing problem for certain German pronouns (a simplified sketch of the incremental scheme follows this entry).
       • We introduce novel features relevant to the resolution of German pronouns. A subset of these features is made accessible through the incremental architecture of the discourse processing model.
       • In evaluation, we show that the coreference model combined with our features provides new state-of-the-art results for coreference and pronoun resolution for German.

    2. We elaborate on the evaluation of coreference and pronoun resolution.
       • We discuss evaluation from the view of prospective downstream applications that benefit from coreference resolution as a preprocessing component. Addressing the shortcomings of the general evaluation framework in this regard, we introduce an alternative framework, the Application Related Coreference Scores (ARCS).
       • The ARCS framework enables a thorough comparison of different system outputs and the quantification of their similarities and differences beyond the common coreference evaluation. We demonstrate how the framework is applied to state-of-the-art coreference systems. This provides a method to track specific differences in system outputs, which assists researchers in comparing their approaches to related work in detail.

    3. We explore semantics for pronoun resolution.
       • Within the introduced coreference model, we explore distributional approaches to estimate the compatibility of an antecedent candidate and the occurrence context of a pronoun. To this end, we compare a state-of-the-art word embedding approach to syntactic co-occurrence profiles.
       • In comparison to related work, we extend the notion of context and thereby increase the applicability of our approach. We find that a combination of both compatibility models, coupled with the coreference model, provides a large potential for improving pronoun resolution performance.

    We make available all our resources, including a web demo of the system, at: http://pub.cl.uzh.ch/purl/coreference-resolutio
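    A simplified sketch of the incremental scheme from contribution 1: mentions are processed one at a time, and each is either merged into an existing partial entity or opens a new one, so earlier decisions are available as context for later ones. The compatible() scorer is a hypothetical placeholder for the thesis's feature-based model.

    ```python
    # Incremental coreference resolution in one left-to-right pass.
    # compatible() is a hypothetical stand-in for the real scorer.
    from dataclasses import dataclass, field

    @dataclass
    class Entity:
        mentions: list = field(default_factory=list)

    def compatible(entity, mention):
        """Toy scorer; the real model uses rich morphosyntactic and
        discourse features (and distributional compatibility)."""
        return 1.0 if mention.lower() in {m.lower() for m in entity.mentions} else 0.0

    def resolve(mentions, threshold=0.5):
        entities = []
        for mention in mentions:  # strictly incremental, one pass
            scores = [(compatible(e, mention), e) for e in entities]
            score, best = max(scores, default=(0.0, None), key=lambda p: p[0])
            if best is not None and score >= threshold:
                best.mentions.append(mention)       # attach to existing entity
            else:
                entities.append(Entity([mention]))  # introduce a new entity
        return entities
    ```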

    UZH in BioNLP 2013

    Full text link