    ANNOTATING A CORPUS OF BIOMEDICAL RESEARCH TEXTS: TWO MODELS OF RHETORICAL ANALYSIS

    Recent advances in the biomedical sciences have led to an enormous increase in the amount of research literature being published, most of it in electronic form; researchers are finding it difficult to keep up to date with all of the new developments in their fields. As a result, there is a need to develop automated Text Mining tools to filter and organize data in a way that is useful to researchers. Human-annotated data are often used as the ‘gold standard’ to train such systems via machine learning methods. This thesis reports on a project in which three annotators applied two Models of rhetoric (argument) to a corpus of on-line biomedical research texts. How authors structure their argumentation and which rhetorical strategies they employ are key to how researchers present their experimental results; thus rhetorical analysis of a text could allow for the extraction of information that is pertinent to a particular researcher’s purpose. The first Model stems from previous work in Computational Linguistics; it focuses on differentiating ‘new’ from ‘old’ information, and results from the analysis of those results. The second Model is based on Toulmin’s argument structure (1958/2003); its main focus is to identify ‘Claims’ being made by the authors, but it also differentiates between internal and external evidence, as well as categories of explanation and implications of the current experiment. In order to properly train automated systems, and as a gauge of the shared understanding of the argument scheme being applied, inter-annotator agreement should be relatively high. The results of this study show complete (three-way) inter-annotator agreement on an average of 60.5% of the 400 sentences in the final corpus under Model 1, and 39.3% under Model 2. The inter-annotator variation is analysed in detail to examine all of the factors involved; these include particular Model categories, individual annotator preferences, errors, and the corpus data itself. In order to reduce this inter-annotator variation, revisions to both Models are suggested; it is also recommended that, in future, biomedical domain experts, possibly in tandem with experts in rhetoric, be used as annotators.
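
    The agreement figure reported above is the share of sentences that all three annotators placed in the same category. A minimal sketch of that computation follows; the function name and example labels are illustrative placeholders and are not taken from the thesis.

        # Complete (three-way) agreement: the proportion of sentences for which
        # all three annotators chose the same category. The labels below are
        # illustrative only, not the actual categories of Model 1 or Model 2.

        def complete_agreement(annotations):
            """annotations: one (label_a, label_b, label_c) triple per sentence."""
            if not annotations:
                return 0.0
            agreed = sum(1 for a, b, c in annotations if a == b == c)
            return agreed / len(annotations)

        sample = [
            ("Claim", "Claim", "Claim"),
            ("Evidence", "Evidence", "Claim"),
            ("Claim", "Claim", "Claim"),
            ("Implication", "Implication", "Implication"),
        ]
        print(complete_agreement(sample))  # 0.75: all three agree on 3 of 4 sentences

    Chance-corrected measures such as Fleiss' kappa are often reported alongside raw percentage agreement, since raw percentages do not account for agreement expected by chance.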

    Persuasive and adaptive tutorial dialogues for a medical diagnosis tutoring system

    The objective of this thesis is to address a key problem in the development of an intelligent tutoring system, namely the implementation of the verbal exchange (a dialogue) that takes place between a student and the system. Here we consider TeachMed, a medical diagnosis tutoring system that teaches students to diagnose clinical problems; however, the approaches presented could also fit other tutoring systems. In such a system, a dialogue must be implemented that determines when and how pedagogic aid is provided to the student, that is, what to say to her, in what circumstances, and how to say it. Finite state machines and automated planning systems are so far the two most common approaches for implementing tutoring dialogues in intelligent tutoring systems. In the former approach, finite state machines of dialogues are manually designed and hard-coded in intelligent tutoring systems. This is a straightforward but very time-consuming approach. Furthermore, any change or extension to the hard-coded finite state machines is very difficult, as it requires reprogramming the system. On the other hand, automated planning has long been presented as a promising technique for automatic dialogue generation. However, in existing approaches, the requirement for the system to persuade the student is not formally acknowledged. Moreover, current dialogue planning approaches are not able to reason on uncertainties about the student's knowledge. This thesis presents two approaches for generating more effective tutorial dialogues. The first approach describes an argumentation framework for implementing persuasive tutoring dialogues. In this approach the entire interaction between the student and the tutoring system is seen as argumentation. The tutoring system and the student can settle conflicts arising during their argumentation by accepting, challenging, or questioning each other's arguments or withdrawing their own arguments. Pedagogic strategies guide the tutoring system by selecting arguments aimed at convincing the student. The second approach presents a non-deterministic planning technique which models the dialogue generation problem as one of planning with incomplete knowledge and sensing. This approach takes into account incomplete information about a particular fact of the student's knowledge by creating conditional branches in a dialogue plan, such that each branch represents an adaptation of the dialogue plan with respect to a particular state of the student's knowledge or belief concerning the desired fact. In order to find out the real state of the student's knowledge and to choose the right branch at execution time, the planner includes queries in the dialogue plan so that the tutoring system can question the student to gather the missing information. One contribution of this thesis is improving the quality of tutoring dialogues by engaging students in argumentative interactions and/or adapting the dialogues to the student's knowledge. Another is facilitating the design and implementation of tutoring dialogues by turning to automatically generated dialogues as opposed to manually generated ones.
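
    The conditional-branching idea can be pictured as a dialogue plan containing a sensing query whose observed answer selects the branch to execute. The sketch below only illustrates that idea under assumed names; the question text, the probed fact, and the tutoring actions are invented and do not come from TeachMed or the thesis.

        # Sketch of a conditional dialogue plan: a sensing query reveals the
        # student's knowledge of a fact, and the observed answer selects which
        # branch of the plan to execute. All names and strings are invented.

        from dataclasses import dataclass

        @dataclass
        class Branch:
            condition: str        # answer that selects this branch
            actions: list         # tutoring utterances to deliver

        @dataclass
        class SensingStep:
            query: str            # question that probes the student's knowledge
            branches: list        # possible continuations of the dialogue

        def execute(step, ask):
            """Ask the sensing query, then follow the branch matching the answer."""
            answer = ask(step.query)
            for branch in step.branches:
                if branch.condition == answer:
                    for utterance in branch.actions:
                        print(utterance)
                    return
            print("Let's review this concept together.")  # fallback remediation

        plan = SensingStep(
            query="Do you know which test confirms this hypothesis? (yes/no)",
            branches=[
                Branch("yes", ["Good. Order the test and interpret the result."]),
                Branch("no", ["Recall the differential diagnosis.",
                              "Which finding would rule the hypothesis in or out?"]),
            ],
        )

        # execute(plan, ask=input)  # at run time, the student's answer picks the branch

    A full planner would generate such branching structures automatically rather than by hand; the point of the sketch is only how a sensing step lets one plan adapt to different states of the student's knowledge.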