Location of Repository

Domain Adaptation of Rule-based Annotators for Named-Entity Recognition Tasks

By Laura Chiticariu, Rajasekar Krishnamurthy, Yunyao Li, Frederick Reiss and Shivakumar Vaithyanathan

Abstract

Named-entity recognition (NER) is an important task required in a wide variety of applications. While rule-based systems are appealing due to their well-known “explainability,” most, if not all, state-of-the-art results for NER tasks are based on machine learning techniques. Motivated by these results, we explore the following natural question in this paper: Are rule-based systems still a viable approach to named-entity recognition? Specifically, we have designed and implemented a high-level language NERL on top of SystemT, a general-purpose algebraic information extraction system. NERL is tuned to the needs of NER tasks and simplifies the process of building, understanding, and customizing complex rule-based named-entity annotators. We show that these customized annotators match or outperform the best published results achieved with machine learning techniques. These results confirm that we can reap the benefits of rule-based extractors ’ explainability without sacrificing accuracy. We conclude by discussing lessons learned while building and customizing complex rule-based annotators and outlining several research directions towards facilitating rule development.

Year: 2010
OAI identifier: oai:CiteSeerX.psu:10.1.1.173.697
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.cse.ucsc.edu/%7Elau... (external link)
  • Suggested articles


    To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.