Pattern Matching and Discourse Processing in Information Extraction from
  Japanese Text

Eriguchi, Y.; Hara, M.; Kitani, T.

Authors: Y. Eriguchi, M. Hara, T. Kitani
Publication date: 1 January 1994

Abstract

Information extraction is the task of automatically picking up information of interest from an unconstrained text. Information of interest is usually extracted in two steps. First, sentence-level processing locates relevant pieces of information scattered throughout the text; second, discourse processing merges coreferential information to generate the output. In the first step, pieces of information are identified locally, without recognizing any relationships among them. A keyword search or simple pattern search can achieve this purpose. The second step requires deeper knowledge in order to understand relationships among separately identified pieces of information. Previous information extraction systems focused on the first step, partly because they were not required to link each piece of information with other pieces. To link the extracted pieces of information and map them onto a structured output format, complex discourse processing is essential. This paper reports on a Japanese information extraction system that merges information using a pattern matcher and discourse processor. Evaluation results show a high level of system performance that approaches human performance.

Comment: See http://www.jair.org/ for any accompanying file
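
The two-step flow described in the abstract can be illustrated with a toy sketch. This is not the authors' system: the English sample sentences, the `MENTION` pattern, the `ALIASES` table, and the function names `locate` and `merge` are all hypothetical, and the alias lookup stands in for real discourse processing only to show how separately found mentions end up in one output record.

```python
import re

# Hypothetical sample text; the paper's system works on Japanese text.
SENTENCES = [
    "Acme Corp announced a joint venture with Bolt Inc.",
    "Acme will hold 60 percent of the shares.",
]

# Step 1 pattern (assumed): longer names listed first so they win the match.
MENTION = re.compile(r"Acme Corp|Bolt Inc|Acme")

# Assumed alias table standing in for coreference knowledge.
ALIASES = {"Acme": "Acme Corp"}


def locate(sentences):
    """Step 1: find mentions sentence by sentence, with no links among them."""
    hits = []
    for index, sentence in enumerate(sentences):
        for match in MENTION.finditer(sentence):
            hits.append((index, match.group(0)))
    return hits


def merge(hits, aliases):
    """Step 2: fold coreferential mentions into one record per entity."""
    entities = {}
    for index, name in hits:
        canonical = aliases.get(name, name)
        entities.setdefault(canonical, []).append(index)
    return entities


if __name__ == "__main__":
    print(merge(locate(SENTENCES), ALIASES))
```

Here `locate` returns isolated (sentence index, mention) pairs, and `merge` groups them by entity, so the short form "Acme" in the second sentence is attached to the same record as "Acme Corp" in the first.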
