4 research outputs found

    An Empirical Study of Span Modeling in Science NER

    Get PDF

    Win-Win Cooperation: Bundling Sequence and Span Models for Named Entity Recognition

    Full text link
    For Named Entity Recognition (NER), sequence labeling-based and span-based paradigms are quite different. Previous research has demonstrated that the two paradigms have clear complementary advantages, but few models have attempted to leverage these advantages in a single NER model as far as we know. In our previous work, we proposed a paradigm known as Bundling Learning (BL) to address the above problem. The BL paradigm bundles the two NER paradigms, enabling NER models to jointly tune their parameters by weighted summing each paradigm's training loss. However, three critical issues remain unresolved: When does BL work? Why does BL work? Can BL enhance the existing state-of-the-art (SOTA) NER models? To address the first two issues, we implement three NER models, involving a sequence labeling-based model--SeqNER, a span-based NER model--SpanNER, and BL-NER that bundles SeqNER and SpanNER together. We draw two conclusions regarding the two issues based on the experimental results on eleven NER datasets from five domains. We then apply BL to five existing SOTA NER models to investigate the third issue, consisting of three sequence labeling-based models and two span-based models. Experimental results indicate that BL consistently enhances their performance, suggesting that it is possible to construct a new SOTA NER system by incorporating BL into the current SOTA system. Moreover, we find that BL reduces both entity boundary and type prediction errors. In addition, we compare two commonly used labeling tagging methods as well as three types of span semantic representations

    Efficient Neural Methods for Coreference Resolution

    Get PDF
    Coreference resolution is a core task in natural language processing and in creating language technologies. Neural methods and models for automatically resolving references have emerged and developed over the last several years. This progress is largely marked by continuous improvements on a single dataset and metric. In this thesis, the assumptions that underlie these improvements are shown to be unrealistic for real-world use due to the computational and data tradeoffs made to achieve apparently high performance. The thesis outlines and proposes solutions to three issues. First, to address the growing memory requirements and restrictions on input document length, a novel, constant memory neural model for coreference resolution is proposed and shown to attain performance comparable to contemporary models. Second, to address the failure of these models to generalize across datasets, continued training is evaluated and shown to be successful for transferring coreference resolution models between domains and languages. Finally, to combat the gains obtained via the use of increasingly large pretrained language models, multitask model pruning can be applied to maintain a single (small) model for multiple datasets. These methods reduce the computational cost of running a model and the annotation cost of creating a model for any arbitrary dataset. As real-world applications continue to demand resolution of coreference, methods that reduce the technical cost of training new models and making predictions are greatly desired, which this thesis addresses
    corecore