Named-entity recognition (NER) is an important task required in a wide variety of applications. While rule-based systems are appealing due to their well-known “explainability,” most, if not all, state-of-the-art results for NER tasks are based on machine learning techniques. Motivated by these results, we explore the following natural question in this paper: Are rule-based systems still a viable approach to named-entity recognition? Specifically, we have designed and implemented a high-level language NERL on top of SystemT, a general-purpose algebraic information extraction system. NERL is tuned to the needs of NER tasks and simplifies the process of building, understanding, and customizing complex rule-based named-entity annotators. We show that these customized annotators match or outperform the best published results achieved with machine learning techniques. These results confirm that we can reap the benefits of rule-based extractors ’ explainability without sacrificing accuracy. We conclude by discussing lessons learned while building and customizing complex rule-based annotators and outlining several research directions towards facilitating rule development.
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.