
Using the Annotated Bibliography as a Resource for Indicative Summarization

Author: Kan Min-Yen
Klavans Judith L.
McKeown Kathleen R.
Publication date: 01/01/2002

We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well-covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus, and overview our document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus, methods of collection, and show its use in finding the distribution of types of information included in indicative summaries and their relative ordering within the summaries.

Comment: 8 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

Resources for Evaluation of Summarization Techniques

Author: Kan Min-Yen
Klavans Judith L.
Lee Susan
McKeown Kathleen R.
Publication date: 01/01/1998

We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of the corpora, methods used in the collection of user judgments, and an overview of the application of the corpora to evaluating the component system. Finally, we discuss the problems and issues with construction of the test set which apply broadly to the construction of evaluation resources for language technologies.

Comment: LaTeX source, 5 pages, US Letter, uses lrec98.st

arXiv.org e-Print Archive

CiteSeerX


Fathering in Joint Custody Families: A Study of Divorced and Remarried Fathers

Author: Simring A. Sue Klavans
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1984

This research explored the fathering experience of 44 divorced and remarried fathers with legal joint custody and at least one child under the age of 16. The fathers filled out a questionnaire and were interviewed about the frequency of their participation in various child care activities, their satisfaction during their participation in these activities, and their perceived influence on their child's growth and development. Three fathering measures were derived from the questionnaire. The father's perception of the relationship with the mother (coparenting relationship) was correlated with the fathering measures to determine if the amount of interaction between coparents and the amount of support or conflict in their relationship was associated with high or low scores on the fathering measures. Results indicate that the sample fathers have maintained an active and involved relationship with their children which did not diminish upon remarriage. They are satisfied with the time spent with their child, and feel influential in their child's growth and development. The quality of the relationship between coparents varied from highly supportive relationships to highly conflictual and antagonistic ones. In general, the amount of support or conflict within the coparental relationship, and the frequency of the coparental interaction, was not associated with any of the indicators of a father's involvement with his child. Fathers were able to sustain an involvement with their children without support from their former wives and within conflictual circumstances. Joint custody was considered to be the context within which fathers were able to negotiate a positive relationship with their child. Most fathers were strongly in favor of using the legal supports that are part of a joint custody agreement as a means of insuring both parents' attachment to their child after divorce. 
Joint custody appears to be an appropriate and desirable child care alternative in more kinds of divorced families than is currently accepted or encouraged. However, far more support from the legal and social systems is needed to help fathers continue to fulfill their responsibilities and obligations as parents after separation, divorce and remarriage.

Columbia University Academic Commons

Dynamic Studies of the Scientific Strengths of Nations Using a Highly Detailed Model of Science

Author: Boyack Kevin W.
Klavans Richard
Publication venue: Georgia Institute of Technology
Publication date: 02/10/2009

Atlanta Conference on Science and Innovation Policy 2009. This presentation was part of the session: Methods, Measures, and Dat

Scholarly Materials And Research @ Georgia Tech


Evaluation of the DEFINDER System for Fully Automatic Glossary Construction

Author: Klavans Judith L.
Muresan Smaranda
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2001

In this paper we present a quantitative and qualitative evaluation of DEFINDER, a rule-based system that mines consumer-oriented full text articles in order to extract definitions and the terms they define. The quantitative evaluation shows that in terms of precision and recall as measured against human performance, DEFINDER obtained 87% and 75% respectively, thereby revealing the incompleteness of existing resources and the ability of DEFINDER to address these gaps. Our basis for comparison is definitions from on-line dictionaries, including the UMLS Metathesaurus. Qualitative evaluation shows that the definitions extracted by our system are ranked higher in terms of user-centered criteria of usability and readability than are definitions from on-line specialized dictionaries. The output of DEFINDER can be used to enhance these dictionaries. DEFINDER output is being incorporated in a system to clarify technical terms for non-specialist users in understandable non-technical language.

Columbia University Academic Commons

PubMed Central


Evaluation of DEFINDER: A System to Mine Definitions from Consumer-oriented Medical Text

Author: Klavans Judith L.
Muresan Smaranda
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2001

In this paper we present DEFINDER, a rule-based system that mines consumer-oriented full text articles in order to extract definitions and the terms they define. This research is part of the Digital Library Project at Columbia University, entitled PERSIVAL (PErsonalized Retrieval and Summarization of Image, Video and Language resources). One goal of the project is to present information to patients in language they can understand. A key component of this stage is to provide accurate and readable lay definitions for technical terms, which may be present in articles of intermediate complexity. The focus of this short paper is on quantitative and qualitative evaluation of the DEFINDER system. Our basis for comparison was definitions from Unified Medical Language System (UMLS), On-line Medical Dictionary (OMD) and Glossary of Popular and Technical Medical Terms (GPTMT). Quantitative evaluations show that DEFINDER obtained 87% precision and 75% recall and reveal the incompleteness of existing resources and the ability of DEFINDER to address gaps. Qualitative evaluation shows that the definitions extracted by our system are ranked higher in terms of user-based criteria of usability and readability than definitions from on-line specialized dictionaries. Thus the output of DEFINDER can be used to enhance existing specialized dictionaries, and also as a key feature in summarizing technical articles for non-specialist users.

Columbia University Academic Commons


A method for automatically building and evaluating dictionary resources

Author: Klavans Judith L.
Muresan Smaranda
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2002

This paper describes a method toward automatically building dictionaries from text. We present DEFINDER, a rule-based system for extraction of definitions from on-line consumer-oriented medical articles. We provide an extensive evaluation on three dimensions: i) performance of the definition extraction technique in terms of precision and recall, ii) quality of the built dictionary as judged both by specialists and lay users, iii) coverage of existing on-line dictionaries. The corpus we used for the study is publicly available. A major contribution of the paper is the range of quantitative and qualitative evaluation methods.

Columbia University Academic Commons


Tackling the Internet Glossary Glut: Automatic Extraction and Evaluation of Genus Phrases

Author: Klavans Judith L.
Passonneau Rebecca
Popper Samuel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2003

This paper addresses the problem of developing methods to be used in the identification and extraction of meaningful semantic components from large online glossaries. We present two sets of results. First, we report on the algorithm, ParseGloss, which was used to analyze definitions, and extract the main concept, or genus phrase. We ran the system on over 12,000 online glossary entries. Second, we present a method to evaluate our results, using human judgments on a collection of definitions from six different sources. This paper discusses our approach to the evaluation process, since the creation of a standard for evaluation is in itself a contribution to the field. The methods we have developed have required addressing the significant challenges of abstracting a single gold standard from multiple naive, human judgments on a highly subjective task. Once the method for creating the standard was developed, we then established the gold standard data. We report on our performance in running ParseGloss over this controlled collection of definitions. Our first set of results presents precision and recall on system performance. Our second results are presented in terms of techniques for determining agreement between human subjects. Success in the ParseGloss algorithm will contribute to the automatic creation of ontologies.

Columbia University Academic Commons


GIST-IT: Summarizing Email Using Linguistic Knowledge and Machine

Author: Klavans Judith L.
Muresan Smaranda
Tzoukermann Evelyne
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2001

We present a system for the automatic extraction of salient information from email messages, thus providing the gist of their meaning. Dealing with email raises several challenges that we address in this paper: heterogeneous data in terms of length and topic. Our method combines shallow linguistic processing with machine learning to extract phrasal units that are representative of email content. The GIST-IT application is fully implemented and embedded in an active mailbox platform. Evaluation was performed over three machine learning paradigms.

Columbia University Academic Commons