Real Anaphora Resolution is Hard
We introduce a system for anaphora resolution for German that uses various resources in order to develop a real system as opposed to systems based on idealized assumptions, e.g. the use of true mentions only or perfect parse trees and perfect morphology. The components that we use to replace such idealizations comprise a full-fledged morphology, a Wikipedia-based named entity recognition, a rule-based dependency parser and a German wordnet. We show that under these conditions coreference resolution is (at least for German) still far from being perfect.
Lexical patterns, features and knowledge resources for coreference resolution in clinical notes
Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general-purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download.
Universal Dependencies Parsing for Colloquial Singaporean English
Singlish can be interesting to the ACL community both linguistically as a major creole based on English, and computationally for information extraction and sentiment analysis of regional social media. We investigate dependency parsing of Singlish by constructing a dependency treebank under the Universal Dependencies scheme, and then training a neural network model by integrating English syntactic knowledge into a state-of-the-art parser trained on the Singlish treebank. Results show that English knowledge can lead to a 25% relative error reduction, resulting in a parser with 84.47% accuracy. To the best of our knowledge, we are the first to use neural stacking to improve cross-lingual dependency parsing on low-resource languages. We make both our annotation and parser available for further research. Comment: Accepted by ACL 201
Anaphora Resolution and Text Retrieval
Empirical approaches based on qualitative or quantitative methods of corpus linguistics have become a central paradigm within linguistics. The series takes account of this fact and provides a platform for approaches within synchronous linguistics as well as interdisciplinary works with a linguistic focus which devise new ways of working empirically and develop new data-based methods and theoretical models for empirical linguistic analyses.
An evaluation of syntactic simplification rules for people with autism
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR) at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2014). Syntactically complex sentences constitute an obstacle for some people with Autistic Spectrum Disorders. This paper evaluates a set of simplification rules specifically designed for tackling complex and compound sentences. In total, 127 different rules were developed for the rewriting of complex sentences and 56 for the rewriting of compound sentences. The evaluation assessed the accuracy of these rules individually and revealed that fully automatic conversion of these sentences into a more accessible form is not very reliable. EC FP7-ICT-2011-