4 research outputs found

    UW-BHI at MEDIQA 2019: An Analysis of Representation Methods for Medical Natural Language Inference

    Recent advances in distributed language modeling have led to large performance increases on a variety of natural language processing (NLP) tasks. However, it is not well understood how these methods may be augmented by knowledge-based approaches. This paper compares the performance and internal representations of an Enhanced Sequential Inference Model (ESIM) across three experimental conditions defined by the representation method: Bidirectional Encoder Representations from Transformers (BERT), Embeddings of Semantic Predications (ESP), or Cui2Vec. The methods were evaluated on the Medical Natural Language Inference (MedNLI) subtask of the MEDIQA 2019 shared task. This task relied heavily on semantic understanding and thus served as a suitable evaluation set for the comparison of these representation methods.
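
    As an illustration of the comparison described above, the sketch below (not code from the paper) shows how contextual BERT token representations could be extracted and handed to an ESIM-style inference model in place of conventional word embeddings; the checkpoint name and example sentences are assumptions, and the ESP or Cui2Vec conditions would instead look up static pre-trained concept vectors at this step.

```python
# Minimal sketch: contextual token representations for an ESIM-style NLI model.
# The checkpoint and sentences are illustrative, not those used in the paper.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

premise = "The patient has a history of type 2 diabetes."
hypothesis = "The patient has an endocrine disorder."

with torch.no_grad():
    # Encode premise and hypothesis separately; ESIM consumes the two
    # token-embedding sequences and aligns them with soft attention.
    prem = encoder(**tokenizer(premise, return_tensors="pt")).last_hidden_state
    hyp = encoder(**tokenizer(hypothesis, return_tensors="pt")).last_hidden_state

# prem and hyp (shape: 1 x sequence_length x hidden_size) would replace the
# word-embedding inputs of the ESIM encoder; ESP or Cui2Vec vectors would be
# substituted here for the other experimental conditions.
```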

    Probing Pre-Trained Language Models for Disease Knowledge

    Pre-trained language models such as ClinicalBERT have achieved impressive results on tasks such as medical Natural Language Inference. At first glance, this may suggest that these models are able to perform medical reasoning tasks, such as mapping symptoms to diseases. However, we find that standard benchmarks such as MedNLI contain relatively few examples that require such forms of reasoning. To better understand the medical reasoning capabilities of existing language models, in this paper we introduce DisKnE, a new benchmark for Disease Knowledge Evaluation. To construct this benchmark, we annotated each positive MedNLI example with the types of medical reasoning that are needed. We then created negative examples by corrupting these positive examples in an adversarial way. Furthermore, we define training-test splits per disease, ensuring that no knowledge about test diseases can be learned from the training data, and we canonicalize the formulation of the hypotheses to avoid the presence of artefacts. This leads to a number of binary classification problems, one for each type of reasoning and each disease. When analysing pre-trained models for the clinical/biomedical domain on the proposed benchmark, we find that their performance drops considerably. Comment: Accepted to Findings of ACL 2021.
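
    To make the per-disease evaluation protocol concrete, the following sketch (written against an assumed example layout, not the released DisKnE format) shows how training-test splits can be built so that no example involving a held-out disease is seen during training, giving one binary classification problem per disease.

```python
# Minimal sketch of per-disease train/test splits; the field layout of the
# examples is an assumption for illustration, not the DisKnE release format.
from typing import Dict, List, Tuple

# (premise, hypothesis, label, disease) -- label 1 = positive, 0 = corrupted negative.
Example = Tuple[str, str, int, str]

def per_disease_splits(examples: List[Example]) -> Dict[str, Dict[str, List[Example]]]:
    """Hold out all examples of each disease in turn, training on the rest."""
    splits: Dict[str, Dict[str, List[Example]]] = {}
    for disease in {ex[3] for ex in examples}:
        splits[disease] = {
            "train": [ex for ex in examples if ex[3] != disease],
            "test": [ex for ex in examples if ex[3] == disease],
        }
    return splits
```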

    Probing pre-trained language models for disease knowledge


    Interpreting patient case descriptions with biomedical language models

    The advent of pre-trained language models (LMs) has enabled unprecedented advances in the Natural Language Processing (NLP) field. In this respect, various specialised LMs for the biomedical domain have been introduced, and similar to their general purpose counterparts, these models have achieved state-of-the-art results in many biomedical NLP tasks. Accordingly, it can be assumed that they can perform medical reasoning. However, given the challenging nature of the biomedical domain and the scarcity of labelled data, it is still not fully understood what type of knowledge these models encapsulate and how they can be enhanced further. This research seeks to address these questions, with a focus on the task of interpreting patient case descriptions, which provides the means to investigate a model’s ability to perform medical reasoning. In general, this task is concerned with inferring a diagnosis or recommending a treatment from a text fragment describing a set of symptoms accompanied by other information. Therefore, we started by probing pre-trained language models. For this purpose, we constructed a benchmark that is derived from an existing dataset (MedNLI). Following that, to improve the performance of LMs, we used a distant supervision strategy to identify cases that are similar to a given one. We then showed that using such similar cases can lead to better results than other strategies for augmenting the input to the LM. As a final contribution, we studied the possibility of fine-tuning biomedical LMs on PubMed abstracts that correspond to case reports. In particular, we proposed a self-supervision task which mimics the downstream tasks of inferring diagnoses and recommending treatments. The findings in this thesis indicate that the performance of the considered biomedical LMs can be improved by using methods that go beyond relying on additional manually annotated datasets.
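
    The retrieval-based augmentation described above can be illustrated with a minimal sketch; TF-IDF similarity stands in here for the thesis's distant-supervision retrieval, and the case texts, function name and prompt format are invented for illustration rather than taken from the thesis.

```python
# Minimal sketch: retrieve the most similar case description and prepend it to
# the input before passing the augmented text to a biomedical LM.
# The case base and prompt format are illustrative assumptions.
from typing import List

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

case_base = [
    "Fever, productive cough and lobar consolidation; treated for pneumonia.",
    "Polyuria, polydipsia and elevated HbA1c; diagnosed with type 2 diabetes.",
]

def retrieve_similar(query: str, cases: List[str]) -> str:
    """Return the stored case most similar to the query under TF-IDF cosine similarity."""
    matrix = TfidfVectorizer().fit_transform(cases + [query])
    sims = cosine_similarity(matrix[-1], matrix[:-1])[0]
    return cases[int(sims.argmax())]

query = "Patient reports excessive thirst, frequent urination and blurred vision."
augmented_input = f"Similar case: {retrieve_similar(query, case_base)}\nCase: {query}"
# augmented_input would then be fed to the fine-tuned biomedical language model.
```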