    Annotating topics, stance, argumentativeness and claims in Dutch social media comments : a pilot study

    One of the major challenges currently facing the field of argumentation mining is the lack of consensus on how to analyse argumentative user-generated texts such as online comments. The theoretical motivations underlying the annotation guidelines used to generate labelled corpora rarely include motivation for the use of a particular theoretical basis. This pilot study reports on the annotation of a corpus of 100 Dutch user comments made in response to politically-themed news articles on Facebook. The annotation covers topic and aspect labelling, stance labelling, argumentativeness detection and claim identification. Our IAA study reports substantial agreement scores for argumentativeness detection (0.76 Fleiss’ kappa) and moderate agreement for claim labelling (0.45 Fleiss’ kappa). We provide a clear justification of the theories and definitions underlying the design of our guidelines. Our analysis of the annotations signals the importance of adjusting our guidelines to include allowances for missing context information and defining the concept of argumentativeness in connection with stance. Our annotated corpus and associated guidelines are made publicly available.
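The agreement scores quoted above are Fleiss' kappa values. As a rough illustration of how such a score is computed, here is a minimal sketch using the standard formula; the function name `fleiss_kappa` and the toy ratings are assumptions for illustration and are not taken from the study's data or code.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical helper)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Toy example (made-up data): 4 comments, 3 annotators, 2 labels
# (argumentative / non-argumentative).
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy_counts)
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance.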

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Automated analysis of Learner's Research Article writing and feedback generation through Machine Learning and Natural Language Processing

    Teaching academic writing in English to native and non-native speakers is a challenging task. Quite a variety of computer-aided instruction tools have arisen in the form of Automated Writing Evaluation (AWE) systems to help students in this regard. This thesis describes my contribution towards the implementation of the Research Writing Tutor (RWT), an AWE tool that aids students with academic research writing by analyzing a learner's text at the discourse level. It offers tailored feedback after analysis based on discipline-aware corpora. At the core of RWT lie two different computational models built using machine learning algorithms to identify the rhetorical structure of a text. RWT extends previous research on a similar AWE tool, the Intelligent Academic Discourse Evaluator (IADE) (Cotos, 2010), designed to analyze articles at the move level of discourse. As a result of the present research, RWT analyzes further at the level of discourse steps, which are the granular communicative functions that constitute a particular move. Based on features extracted from a corpus of expert-annotated research article introductions, the learning algorithm classifies each sentence of a document with a particular rhetorical move and a step. Currently, RWT analyzes the introduction section of a research article, but this work generalizes to handle the other sections of an article, including Methods, Results and Discussion/Conclusion. This research describes RWT's unique software architecture for analyzing academic writing. This architecture consists of a database schema, a specific choice of classification features, our computational model training procedure, our approach to testing for performance evaluation, and finally the method of applying the models to a learner's writing sample. Experiments were done on the annotated corpus data to study the relation among the features and the rhetorical structure within the documents.
Finally, I report the performance measures of our 23 computational models and their capability to identify rhetorical structure on user-submitted writing. The final move classifier was trained using a total of 5828 unigrams and 11630 trigrams and performed at a maximum accuracy of 72.65%. Similarly, the step classifier was trained using a total of 27689 unigrams and 27160 trigrams and performed at a maximum accuracy of 72.01%. The revised architecture presented also led to increased speed of both training (a 9x speedup) and real-time performance (a 2x speedup). These performance rates are sufficient for satisfactory usage of RWT in the classroom. The overall goal of RWT is to empower students to write better by helping them consider writing as a series of rhetorical strategies to convey a functional meaning. This research will enable RWT to be deployed broadly into a wider spectrum of classrooms.
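The abstract states that sentences are classified from unigram and trigram features. The following is only a minimal sketch of what extracting such features from one sentence might look like; the whitespace tokenization, lowercasing, and the names `ngrams` and `sentence_features` are assumptions for illustration and do not describe RWT's actual feature selection or classifier.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined into strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_features(sentence):
    """Count unigram and trigram occurrences in one sentence.

    Assumption: naive lowercasing and whitespace tokenization; a real
    system would likely use a proper tokenizer and a fixed vocabulary.
    """
    tokens = sentence.lower().split()
    features = Counter()
    for n in (1, 3):  # unigrams and trigrams, as named in the abstract
        features.update(ngrams(tokens, n))
    return features


# Made-up example sentence, not drawn from the thesis corpus.
example = sentence_features("This paper presents a method")
```

A feature dictionary of this kind could then be passed to any sentence-level classifier to predict a move or step label.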

    Validation of Score Meaning for the Next Generation of Assessments

    Despite developments in research and practice on using examinee response process data in assessment design, the use of such data in test validation is rare. Validation of Score Meaning in the Next Generation of Assessments Using Response Processes highlights the importance of validity evidence based on response processes and provides guidance to measurement researchers and practitioners in creating and using such evidence as a regular part of the assessment validation process. Response processes refer to approaches and behaviors of examinees when they interpret assessment situations and formulate and generate solutions as revealed through verbalizations, eye movements, response times, or computer clicks. Such response process data can provide information about the extent to which items and tasks engage examinees in the intended ways. With contributions from the top researchers in the field of assessment, this volume includes chapters that focus on methodological issues and on applications across multiple contexts of assessment interpretation and use. In Part I of this book, contributors discuss the framing of validity as an evidence-based argument for the interpretation of the meaning of test scores, the specifics of different methods of response process data collection and analysis, and the use of response process data relative to issues of validation as highlighted in the joint standards on testing. In Part II, chapter authors offer examples that illustrate the use of response process data in assessment validation. These cases are provided specifically to address issues related to the analysis and interpretation of performance on assessments of complex cognition, assessments designed to inform classroom learning and instruction, and assessments intended for students with varying cultural and linguistic backgrounds.