2,154 research outputs found
Using Decision Trees for Coreference Resolution
This paper describes RESOLVE, a system that uses decision trees to learn how
to classify coreferent phrases in the domain of business joint ventures. An
experiment is presented in which the performance of RESOLVE is compared to the
performance of a manually engineered set of rules for the same task. The
results show that decision trees achieve higher performance than the rules in
two of three evaluation metrics developed for the coreference task. In addition
to achieving better performance than the rules, RESOLVE provides a framework
that facilitates the exploration of the types of knowledge that are useful for
solving the coreference problem.Comment: 6 pages; LaTeX source; 1 uuencoded compressed EPS file (separate);
uses ijcai95.sty, named.bst, epsf.tex; to appear in Proc. IJCAI '9
Pattern Matching and Discourse Processing in Information Extraction from Japanese Text
Information extraction is the task of automatically picking up information of
interest from an unconstrained text. Information of interest is usually
extracted in two steps. First, sentence level processing locates relevant
pieces of information scattered throughout the text; second, discourse
processing merges coreferential information to generate the output. In the
first step, pieces of information are locally identified without recognizing
any relationships among them. A key word search or simple pattern search can
achieve this purpose. The second step requires deeper knowledge in order to
understand relationships among separately identified pieces of information.
Previous information extraction systems focused on the first step, partly
because they were not required to link up each piece of information with other
pieces. To link the extracted pieces of information and map them onto a
structured output format, complex discourse processing is essential. This paper
reports on a Japanese information extraction system that merges information
using a pattern matcher and discourse processor. Evaluation results show a high
level of system performance which approaches human performance.Comment: See http://www.jair.org/ for any accompanying file
Evaluating Web Search Result Summaries
The aim of our research is to produce and assess short summaries to aid users’ relevance judgements, for example for a search engine result page. In this paper we present our new metric for measuring summary quality based on representativeness and judgeability, and compare the summary quality of our system to that of Google. We discuss the basis for constructing our evaluation methodology in contrast to previous relevant open evaluations, arguing that the elements which make up an evaluation methodology: the tasks, data and metrics, are interdependent and the way in which they are combined is critical to the effectiveness of the methodology. The paper discusses the relationship between these three factors as implemented in our own work, as well as in SUMMAC/MUC/DUC
Recommended from our members
Lexical patterns, features and knowledge resources for coreference resolution in clinical notes
Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general- purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general- purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download
Recommended from our members
Coreference resolution in clinical discharge summaries, progress notes, surgical and pathology reports: a unified lexical approach
We developed a lexical rule-based system that uses a unified approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA) provided for the fifth i2b2/VA shared task. Taking the unweighted mean between 4 coreference metrics, validation of the system against the i2b2/VA corpus attained an overall F-score of 87.7% across all mention classes, with a maximum of 93.1% for coreference of persons, and a minimum of 77.2% for coreference of tests. For the ODIE corpus the overall F-score across all mention classes was 79.4%, with a maximum of 82.0% for coreference of persons and a minimum of 13.1% for coreference of diagnostic reagents. For the ODIE corpus our results are comparable to the mean reported inter-annotator agreement with the gold standard. We discuss the four categories of errors we identified, and how these might be addressed. The system uses a number of reusable modules and techniques that may be of benefit to the research community
- …