A corpus-based learning method of compound noun indexing rules for Korean

Authors: Kim JH, Kwak BK, Lee G, Lee JH, Lee S
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

In Korean information retrieval, compound nouns play an important role in improving precision in search experiments. There are two major approaches to compound noun indexing in Korean: statistical and linguistic. Each method, however, has its own shortcomings, such as limitations when indexing diverse types of compound nouns, over-generation of compound nouns, and data sparseness in training. In this paper, we propose a corpus-based learning method, which can index diverse types of compound nouns using rules automatically extracted from a large corpus. The automatic learning method is more portable and requires less human effort, although it exhibits a performance level similar to the manual-linguistic approach. We also present a new filtering method to solve the problems of compound noun over-generation and data sparseness.

Pohang University of Science and Technology (POSTECH)