Lexical patterns, features and knowledge resources for coreference resolution in clinical notes
Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general-purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download.
Coreference resolution in clinical discharge summaries, progress notes, surgical and pathology reports: a unified lexical approach
We developed a lexical rule-based system that uses a unified approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA) provided for the fifth i2b2/VA shared task. Taking the unweighted mean of four coreference metrics, validation of the system against the i2b2/VA corpus attained an overall F-score of 87.7% across all mention classes, with a maximum of 93.1% for coreference of persons and a minimum of 77.2% for coreference of tests. For the ODIE corpus the overall F-score across all mention classes was 79.4%, with a maximum of 82.0% for coreference of persons and a minimum of 13.1% for coreference of diagnostic reagents. For the ODIE corpus our results are comparable to the mean reported inter-annotator agreement with the gold standard. We discuss the four categories of errors we identified, and how these might be addressed. The system uses a number of reusable modules and techniques that may be of benefit to the research community.
End-to-end Neural Coreference Resolution
We introduce the first end-to-end coreference resolution model and show that
it significantly outperforms all previous work without using a syntactic parser
or hand-engineered mention detector. The key idea is to directly consider all
spans in a document as potential mentions and learn distributions over possible
antecedents for each. The model computes span embeddings that combine
context-dependent boundary representations with a head-finding attention
mechanism. It is trained to maximize the marginal likelihood of gold antecedent
spans from coreference clusters and is factored to enable aggressive pruning of
potential mentions. Experiments demonstrate state-of-the-art performance, with
a gain of 1.5 F1 on the OntoNotes benchmark and by 3.1 F1 using a 5-model
ensemble, despite the fact that this is the first approach to be successfully
trained with no external resources.
Comment: Accepted to EMNLP 201
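As a rough illustration of the key idea in the abstract above (treat every span as a candidate mention, then score a distribution over possible antecedents for each), the following minimal Python sketch enumerates spans up to a maximum width and computes the log marginal likelihood of a span's gold antecedents. The function names, the width limit, and the convention that a dummy "no antecedent" option has score 0 are assumptions made for this sketch; the span embeddings, attention mechanism, and learned scoring function of the actual model are not reproduced here, and the pairwise scores are simply passed in as numbers.

```python
import math


def enumerate_spans(tokens, max_width):
    """Return every (start, end) span, end inclusive, of at most max_width tokens."""
    n = len(tokens)
    return [(s, e) for s in range(n) for e in range(s, min(s + max_width, n))]


def _logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_marginal_likelihood(antecedent_scores, gold_antecedents):
    """Log-probability mass assigned to the gold antecedents of one span.

    antecedent_scores: hypothetical pairwise scores, one per candidate antecedent.
    gold_antecedents: set of indices into antecedent_scores; an empty set means
    the span has no antecedent, so the dummy option is the gold choice.
    """
    # Index 0 is the dummy "no antecedent" option, fixed at score 0 (assumption).
    scores = [0.0] + list(antecedent_scores)
    gold = [j + 1 for j in gold_antecedents] or [0]
    # Softmax mass on the gold options, computed in log space.
    return _logsumexp([scores[k] for k in gold]) - _logsumexp(scores)
```

Training would then sum the negative of this quantity over all spans that survive pruning; summing over every gold antecedent in the cluster, rather than picking one, is what makes the objective a marginal likelihood.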
USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding
This paper describes the University of Sheffield's entry in the 2011 TAC KBP
entity linking and slot filling tasks. We chose to participate in the
monolingual entity linking task, the monolingual slot filling task and the
temporal slot filling tasks. We set out to build a framework for
experimentation with knowledge base population. This framework was created, and
applied to multiple KBP tasks. We demonstrated that our proposed framework is
effective and suitable for collaborative development efforts, as well as useful
in a teaching environment. Finally, we present results that, while very modest,
provide improvements an order of magnitude greater than our 2010 attempt.
Comment: Proc. Text Analysis Conference (2011