3 research outputs found

Sentential Paraphrase Generation for Agglutinative Languages Using SVM with a String Kernel

Author: Choi Ho-Jin
Gweon Gahgene
Heo Jeong
Park Hancheol
Ryu Pum-Mo
Publication venue: Department of Linguistics, Faculty of Arts, Chulalongkorn University
Publication date: 01/01/2014

Waseda University Repository

Enlarging Paraphrase Collections through Generalization and Instantiation

Author: Fujita Atsushi
Isabelle Pierre
Kuhn Roland
Publication venue: Association for Computational Linguistics (ACL)

Institutional Repositories DataBase (IRDB)

Enlarging paraphrase collections through generalization and instantiation

Author: Fujita Atsushi
Isabelle Pierre
Kuhn Roland

This paper presents a paraphrase acquisition method that uncovers and exploits generalities underlying paraphrases: paraphrase patterns are first induced and then used to collect novel instances. Unlike existing methods, ours uses both bilingual parallel and monolingual corpora. While the former are regarded as a source of high-quality seed paraphrases, the latter are searched for paraphrases that match patterns learned from the seed paraphrases. We show how one can use monolingual corpora, which are far more numerous and larger than bilingual corpora, to obtain paraphrases that rival in quality those derived directly from bilingual corpora. In our experiments, the number of paraphrase pairs obtained in this way from monolingual corpora was a large multiple of the number of seed paraphrases. Human evaluation through a paraphrase substitution test demonstrated that the newly acquired paraphrase pairs are of reasonable quality. Remaining noise can be further reduced by filtering seed paraphrases.

Peer reviewed: Yes
NRC publication: Ye

NRC Publications Archive
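
The abstract above describes a two-step idea: induce patterns from seed paraphrases (generalization), then search a monolingual corpus for new pairs that match them (instantiation). The following is only a toy sketch of that idea, not the authors' actual method. The slot marker `<X>`, the function names `induce_pattern` and `instantiate`, the single-shared-word rule, and the example phrases are all assumptions made for illustration.

```python
import re

SLOT = "<X>"  # hypothetical slot marker, not taken from the paper


def induce_pattern(left, right):
    """Generalize a seed pair by replacing its single shared word with a slot.

    Returns None when the pair does not share exactly one word, or when
    either side would consist of the slot alone.
    """
    lw, rw = left.split(), right.split()
    shared = set(lw) & set(rw)
    if len(shared) != 1 or len(lw) < 2 or len(rw) < 2:
        return None
    word = next(iter(shared))
    if lw.count(word) != 1 or rw.count(word) != 1:
        return None

    def generalize(tokens):
        return " ".join(SLOT if t == word else t for t in tokens)

    return generalize(lw), generalize(rw)


def _to_regex(side):
    # The slot becomes a capturing group for one word; other tokens match literally.
    parts = [r"(\w+)" if t == SLOT else re.escape(t) for t in side.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b")


def _fill(side, filler):
    return " ".join(filler if t == SLOT else t for t in side.split())


def instantiate(pattern, corpus):
    """Collect new pairs whose slot filler occurs with BOTH sides of the pattern."""
    left, right = pattern
    left_re, right_re = _to_regex(left), _to_regex(right)
    left_fillers, right_fillers = set(), set()
    for sentence in corpus:
        left_fillers.update(left_re.findall(sentence))
        right_fillers.update(right_re.findall(sentence))
    return sorted(
        (_fill(left, f), _fill(right, f)) for f in left_fillers & right_fillers
    )


# Invented example data, for illustration only.
seed = ("amendment of regulation", "amending regulation")
corpus = [
    "The amendment of contract was approved.",
    "They are amending contract terms.",
    "The amendment of law took years.",
]
pattern = induce_pattern(*seed)
print(pattern)                       # ('amendment of <X>', 'amending <X>')
print(instantiate(pattern, corpus))  # [('amendment of contract', 'amending contract')]
```

In this sketch a filler is accepted only if it is seen with both sides of the pattern in the monolingual text, which is one simple way to limit noise; the filtering and scoring used in the paper itself are not described in this listing.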