5,028 research outputs found

    Proceedings of the 28th International Conference on Computational Linguistics

    Get PDF
    Named entity recognition (NER) is frequently addressed as a sequence classification task with each input consisting of one sentence of text. It is nevertheless clear that useful information for NER is often found also elsewhere in text. Recent self-attention models like BERT can both capture long-distance relationships in input and represent inputs consisting of several sentences. This creates opportunities for adding cross-sentence information in natural language processing tasks. This paper presents a systematic study exploring the use of cross-sentence information for NER using BERT models in five languages. We find that adding context as additional sentences to BERT input systematically increases NER performance. Multiple sentences in input samples allows us to study the predictions of the sentences in different contexts. We propose a straightforward method, Contextual Majority Voting (CMV), to combine these different predictions and demonstrate this to further increase NER performance. Evaluation on established datasets, including the CoNLL’02 and CoNLL’03 NER benchmarks, demonstrates that our proposed approach can improve on the state-of-the-art NER results on English, Dutch, and Finnish, achieves the best reported BERT-based results on German, and is on par with other BERT-based approaches in Spanish. We release all methods implemented in this work under open licenses.</p

    Cross-sentence contexts in Named Entity Recognition with BERT

    Get PDF
    Named entity recognition (NER) is a task under the broader scope of Natural Language Processing (NLP). The computational task of NER is often cast as a sequence classification task where the goal is to label each word (or token) in the input sequence with a class from a predefined set of classes. The development of deep transfer learning methodologies in recent years has greatly influenced both NLP and NER. There have been improvements in the performance of NER models but at the same time the use of cross-sentence context, the sentences around the sentence of interest, has diminished in NER methods. Many of the current methods use inputs that consist of only one sentence of text at a time. It is nevertheless clear that useful information for NER is often found also elsewhere in text. Recent self-attention models like BERT can both capture long-distance relationships in input and represent inputs consisting of several sentences. This creates opportunities for making use of cross-sentence information in NLP tasks. This thesis presents a systematic study exploring the use of cross-sentence information for NER using BERT models in five languages. The study shows that adding context as additional sentences to BERT input systematically increases NER performance. Adding multiple sentences in input samples also allows the study of predictions for the sentences in different contexts. A straightforward method of Contextual Majority Voting (CMV) is proposed to combine these different predictions. The study demonstrates that using CMV increases NER performance even further. Evaluation of the proposed methods on established datasets, including the Conference on Computational Natural Language Learning CoNLL'02 and CoNLL'03 NER benchmarks, demonstrates that the proposed approach can improve on the state-of-the-art NER results for English, Dutch, and Finnish, achieves the best reported BERT-based results for German, and is on par with other BERT-based approaches for Spanish. The methods implemented for this work are published under open licenses

    Semantics-driven Abstractive Document Summarization

    Get PDF
    The evolution of the Web over the last three decades has led to a deluge of scientific and news articles on the Internet. Harnessing these publications in different fields of study is critical to effective end user information consumption. Similarly, in the domain of healthcare, one of the key challenges with the adoption of Electronic Health Records (EHRs) for clinical practice has been the tremendous amount of clinical notes generated that can be summarized without which clinical decision making and communication will be inefficient and costly. In spite of the rapid advances in information retrieval and deep learning techniques towards abstractive document summarization, the results of these efforts continue to resemble extractive summaries, achieving promising results predominantly on lexical metrics but performing poorly on semantic metrics. Thus, abstractive summarization that is driven by intrinsic and extrinsic semantics of documents is not adequately explored. Resources that can be used for generating semantics-driven abstractive summaries include: • Abstracts of multiple scientific articles published in a given technical field of study to generate an abstractive summary for topically-related abstracts within the field, thus reducing the load of having to read semantically duplicate abstracts on a given topic. • Citation contexts from different authoritative papers citing a reference paper can be used to generate utility-oriented abstractive summary for a scientific article. • Biomedical articles and the named entities characterizing the biomedical articles along with background knowledge bases to generate entity and fact-aware abstractive summaries. • Clinical notes of patients and clinical knowledge bases for abstractive clinical text summarization using knowledge-driven multi-objective optimization. In this dissertation, we develop semantics-driven abstractive models based on intra- document and inter-document semantic analyses along with facts of named entities retrieved from domain-specific knowledge bases to produce summaries. Concretely, we propose a sequence of frameworks leveraging semantics at various granularity (e.g., word, sentence, document, topic, citations, and named entities) levels, by utilizing external resources. The proposed frameworks have been applied to a range of tasks including 1. Abstractive summarization of topic-centric multi-document scientific articles and news articles. 2. Abstractive summarization of scientific articles using crowd-sourced citation contexts. 3. Abstractive summarization of biomedical articles clustered based on entity-relatedness. 4. Abstractive summarization of clinical notes of patients with heart failure and Chest X-Rays recordings. The proposed approaches achieve impressive performance in terms of preserving semantics in abstractive summarization while paraphrasing. For summarization of topic-centric multiple scientific/news articles, we propose a three-stage approach where abstracts of scientific articles or news articles are clustered based on their topical similarity determined from topics generated using Latent Dirichlet Allocation (LDA), followed by extractive phase and abstractive phase. Then, in the next stage, we focus on abstractive summarization of biomedical literature where we leverage named entities in biomedical articles to 1) cluster related articles; and 2) leverage the named entities towards guiding abstractive summarization. Finally, in the last stage, we turn to external resources such as citation contexts pointing to a scientific article to generate a comprehensive and utility-centric abstractive summary of a scientific article, domain-specific knowledge bases to fill gaps in information about entities in a biomedical article to summarize and clinical notes to guide abstractive summarization of clinical text. Thus, the bottom-up progression of exploring semantics towards abstractive summarization in this dissertation starts with (i) Semantic Analysis of Latent Topics; builds on (ii) Internal and External Knowledge-I (gleaned from abstracts and Citation Contexts); and extends it to make it comprehensive using (iii) Internal and External Knowledge-II (Named Entities and Knowledge Bases)

    Leveraging Context Patterns for Medical Entity Classification

    Get PDF
    The ability of patients to understand health-related text is important for optimal health outcomes. A system that can automatically annotate medical entities could help patients better understand health-related text. Such a system would also accelerate manual data annotation for this low-resource domain as well as assist in down- stream medical NLP tasks such as finding textual similarity, identifying conflicting medical advice, and aspect-based sentiment analysis. In this work, we investigate a state-of-the-art entity set expansion model, BootstrapNet, for the task of medical entity classification on a new dataset of medical advice text. We also propose EP SBERT, a simple model that utilizes Sentence-BERT embeddings of entities and context patterns to more effectively capture the semantics of the entities. Our experiments show that EP SBERT significantly outperforms a random classifier baseline, outperforms the more complex BootstrapNet by 5.2 F1 points, and achieves a 5-fold cross validated weighted F1 score of 0.835. Further experiments show that EP SBERT achieves a weighted F1 score of 0.870 when we remove a peripheral class whose inclusion is nonessential to the problem formulation, and a weighted F1 score of 0.949 when using top-2 evaluation. This makes us confident that EP SBERT can be useful when building human-in-the-loop data annotation tools. Finally, we perform an extensive error analysis of EP SBERT, identifying two core challenges and future work. Our code will be made available at https://github.com/garrettjohnston99/EP-SBERT

    Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge

    Full text link
    Although named entity recognition (NER) helps us to extract domain-specific entities from text (e.g., artists in the music domain), it is costly to create a large amount of training data or a structured knowledge base to perform accurate NER in the target domain. Here, we propose self-adaptive NER, which retrieves external knowledge from unstructured text to learn the usages of entities that have not been learned well. To retrieve useful knowledge for NER, we design an effective two-stage model that retrieves unstructured knowledge using uncertain entities as queries. Our model predicts the entities in the input and then finds those of which the prediction is not confident. Then, it retrieves knowledge by using these uncertain entities as queries and concatenates the retrieved text to the original input to revise the prediction. Experiments on CrossNER datasets demonstrated that our model outperforms strong baselines by 2.35 points in F1 metric.Comment: EACL2023 (long

    Comparative Analysis of Contextual Relation Extraction based on Deep Learning Models

    Full text link
    Contextual Relation Extraction (CRE) is mainly used for constructing a knowledge graph with a help of ontology. It performs various tasks such as semantic search, query answering, and textual entailment. Relation extraction identifies the entities from raw texts and the relations among them. An efficient and accurate CRE system is essential for creating domain knowledge in the biomedical industry. Existing Machine Learning and Natural Language Processing (NLP) techniques are not suitable to predict complex relations from sentences that consist of more than two relations and unspecified entities efficiently. In this work, deep learning techniques have been used to identify the appropriate semantic relation based on the context from multiple sentences. Even though various machine learning models have been used for relation extraction, they provide better results only for binary relations, i.e., relations occurred exactly between the two entities in a sentence. Machine learning models are not suited for complex sentences that consist of the words that have various meanings. To address these issues, hybrid deep learning models have been used to extract the relations from complex sentence effectively. This paper explores the analysis of various deep learning models that are used for relation extraction.Comment: This Paper Presented in the International Conference on FOSS Approaches towards Computational Intelligence and Language TTechnolog on February 2023, Thiruvananthapura
    • …
    corecore