170 research outputs found
Recommended from our members
XIP Dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse
A key competency that we seek to build in learners is a critical mind, i.e. the ability to engage with the ideas in the literature, and to identify when significant claims are being made in articles. The ability to decode such moves in texts is essential, as is the ability to make such moves in one’s own writing. Computational techniques for extracting them are becoming available, using Natural Language Processing (NLP) tuned to recognize the rhetorical signals that authors use when making a significant scholarly move. After reviewing related NLP work, we introduce the Xerox Incremental Parser (XIP), note previous work to render its output, and then motivate the design of the XIP Dashboard, a set of visual analytics modules built on XIP output, using the LAK/EDM open dataset as a test corpus. We report preliminary user reactions to a paper prototype of such a novel dashboard, describe the visualizations implemented to date, and present user scenarios for learners, educators and researchers. We conclude with a summary of ongoing design refinements, potential platform integrations, and questions that need to be investigated through end-user evaluations.
Are You Being Rhetorical? A Description of Rhetorical Move Annotation Tools and Open Corpus of Sample Machine-Annotated Rhetorical Moves
Writing analytics has emerged as a sub-field of learning analytics, with applications including the provision of formative feedback to students in developing their writing capacities. Rhetorical markers in writing have become a key feature in this feedback, with a number of tools being developed across research and teaching contexts. However, there is no shared corpus of texts annotated by these tools, nor is it clear how the tool annotations compare. Thus, resources are scarce for comparing tools for both tool development and pedagogic purposes. In this paper, we conduct such a comparison and introduce a sample corpus of texts representative of the particular genres, a subset of which has been annotated using three rhetorical analysis tools (one of which has two versions). This paper aims to provide both a description of the tools and a shared dataset in order to support extensions of existing analyses and tool design in support of writing skill development. We intend the description of these tools, which share a focus on rhetorical structures, alongside the corpus, to be a preliminary step to enable further research, with regard to both tool development and tool interaction.
Mining arguments in scientific abstracts: Application to argumentative quality assessment
Argument mining consists in the automatic identification of argumentative structures in natural language, a task that has been recognized as particularly challenging in the scientific domain. In this work we propose SciARG, a new annotation scheme, and apply it to the identification of argumentative units and relations in abstracts in two scientific disciplines: computational linguistics and biomedicine, which allows us to assess the applicability of our scheme to different knowledge fields. We use our annotated corpus to train and evaluate argument mining models in various experimental settings, including single and multi-task learning. We investigate the possibility of leveraging existing annotations, including discourse relations and rhetorical roles of sentences, to improve the performance of argument mining models. In particular, we explore the potential offered by a sequential transfer-learning approach in which supplementary training tasks are used to fine-tune pre-trained parameter-rich language models. Finally, we analyze the practical usability of the automatically-extracted components and relations for the prediction of argumentative quality dimensions of scientific abstracts. Funding: Agencia Nacional de Investigación e Innovación; Ministerio de Economía, Industria y Competitividad (España).
Are You Being Rhetorical? An Open Corpus of Machine Annotated Rhetorical Moves
Writing analytics has emerged as a sub-field of learning analytics, with applications including the provision of formative feedback to students in developing their writing capacities. Rhetorical markers in writing have become a key feature in this feedback, with a number of tools being developed across research and teaching contexts. However, there is no shared corpus of texts annotated by these tools, nor is it clear how the tool annotations compare. Thus, resources are scarce for comparing tools for both tool development and pedagogic purposes. In this paper, we conduct such a comparison and introduce a sample corpus of texts representative of the particular genres, a subset of which has been annotated using three rhetorical analysis tools (one of which has two versions). This paper aims to provide both a description of the tools and a shared dataset in order to support extensions of existing analyses and tool design in support of writing skill development. We intend the description of these tools, which share a focus on rhetorical structures, alongside the corpus, to be a preliminary step to enable further research, with regard to both tool development and tool interaction.
Developing resources for sentiment analysis of informal Arabic text in social media
Natural Language Processing (NLP) applications such as text categorization, machine translation, and sentiment analysis need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed is that they are derived from informal Arabic that does not conform to grammatical or spelling standards. We refer to Arabic social media content of this sort as Dialectal Arabic (DA): informal Arabic originating from, and potentially mixing, a range of different individual dialects. The paper describes the process adopted for developing corpora and sentiment lexicons for sentiment analysis within different social media, and their resulting characteristics. In addition to providing useful NLP data sets for Dialectal Arabic, the work also contributes to understanding the approach to developing corpora and lexicons.
Argument mining: A machine learning perspective
Argument mining has recently become a hot topic, attracting the interest of several diverse research communities, ranging from artificial intelligence to computational linguistics, natural language processing, and the social and philosophical sciences. In this paper, we attempt to describe the problems and challenges of argument mining from a machine learning angle. In particular, we argue that machine learning techniques have so far been under-exploited, and that a more proper standardization of the problem, also with regard to the underlying argument model, could provide a crucial element to develop better systems.
Corpora for sentiment analysis of Arabic text in social media
Different Natural Language Processing (NLP) applications such as text categorization and machine translation need annotated corpora to check quality and performance. Similarly, sentiment analysis requires annotated corpora to test the performance of classifiers. Manual annotation performed by native speakers is used as a benchmark test to measure how accurate a classifier is. In this paper we summarise currently available Arabic corpora and describe work in progress to build, annotate, and use Arabic corpora consisting of Facebook (FB) posts. The distinctive nature of these corpora is that they are based on posts written in Dialectal Arabic (DA) not following specific grammatical or spelling standards. The corpora are annotated with five labels (positive, negative, dual, neutral, and spam). In addition to building the corpus, the paper illustrates how manual tagging can be used to extract opinionated words and phrases to be used in a lexicon-based classifier.
Learning Analytics for Academic Writing through Automatic Identification of Meta-discourse
Effective written communication is an essential skill which promotes educational success for undergraduates. Argumentation is a key requirement of successful writing, which is the most common genre that undergraduates have to write, particularly in the social sciences. Therefore, when assessing student writing, academic tutors look for students’ ability to present and pursue well-reasoned and strong arguments through scholarly argumentation, which is articulated by meta-discourse.
Today, there are some natural language processing systems which automatically detect authors’ rhetorical moves in scholarly texts. Hence, when assessing their students’ essays, educators could benefit from the available automated textual analysis which can detect meta-discourse. However, previous work has not shown whether these technologies can be used to analyse student writing reliably. The aim of this thesis therefore has been to understand how automated analysis of meta-discourse in student writing can be used to support tutors’ essay assessment practices. This thesis evaluates a particular language analysis tool, the Xerox Incremental Parser (XIP) as an exemplar of this type of automated technology.
The studies presented in this thesis investigate how tutors define the quality of undergraduate writing and suggest key elements that make for good quality student writing in the social sciences, where XIP seems to work best. This thesis also sets out the changes that need to be made to XIP and proposes ways in which its output can be delivered to tutors so that they can use it to give feedback on student essays.
The findings reported also show problems that academic tutors experience in essay assessment, which potentially could be solved by automated support. However, tutors have preconceptions about the use of automated support.
The study revealed that, to overcome these preconceptions, tutors want to be assured that they retain the ‘power’ themselves in any decision to use automated support.