Lexical patterns, features and knowledge resources for coreference resolution in clinical notes
Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general-purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download.
A Benchmark of Rule-Based and Neural Coreference Resolution in Dutch Novels and News
We evaluate a rule-based (Lee et al., 2013) and neural (Lee et al., 2018)
coreference system on Dutch datasets of two domains: literary novels and
news/Wikipedia text. The results provide insight into the relative strengths of
data-driven and knowledge-driven systems, as well as the influence of domain,
document length, and annotation schemes. The neural system performs best on
news/Wikipedia text, while the rule-based system performs best on literature.
The neural system shows weaknesses with limited training data and long
documents, while the rule-based system is affected by annotation differences.
The code and models used in this paper are available at
https://github.com/andreasvc/crac2020
Comment: Accepted for CRAC 2020 @ COLIN