Back to Patterns: Efficient Japanese Morphological Analysis with Feature-Sequence Trie

Yoshinaga, Naoki

Authors: Naoki Yoshinaga
Publication date: 30 May 2023

Abstract

Accurate neural models are much less efficient than non-neural models and are useless for processing billions of social media posts or handling user queries in real time with a limited budget. This study revisits the fastest pattern-based NLP methods to make them as accurate as possible, thus yielding a strikingly simple yet surprisingly accurate morphological analyzer for Japanese. The proposed method induces reliable patterns from a morphological dictionary and annotated data. Experimental results on two standard datasets confirm that the method exhibits comparable accuracy to learning-based baselines, while boasting a remarkable throughput of over 1,000,000 sentences per second on a single modern CPU. The source code is available at https://www.tkl.iis.u-tokyo.ac.jp/~ynaga/jagger/

Comment: 9 pages, 1 figure, 10 tables. Accepted by ACL 2023 (main conference).

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2305.19045

Last updated on 02/06/2023