XIP Dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse
A key competency that we seek to build in learners is a critical mind, i.e. the ability to engage with the ideas in the literature and to identify when significant claims are being made in articles. The ability to decode such moves in texts is essential, as is the ability to make such moves in one’s own writing. Computational techniques for extracting them are becoming available, using Natural Language Processing (NLP) tuned to recognize the rhetorical signals that authors use when making a significant scholarly move. After reviewing related NLP work, we introduce the Xerox Incremental Parser (XIP), note previous work to render its output, and then motivate the design of the XIP Dashboard, a set of visual analytics modules built on XIP output, using the LAK/EDM open dataset as a test corpus. We report preliminary user reactions to a paper prototype of such a novel dashboard, describe the visualizations implemented to date, and present user scenarios for learners, educators and researchers. We conclude with a summary of ongoing design refinements, potential platform integrations, and questions that need to be investigated through end-user evaluations.
A Framework for Annotating 'Related Works' to Support Feedback to Novice Writers
Understanding what is expected of academic writing can be difficult for novice writers, and recent years have seen several automated tools become available to support academic writing. Our work presents a framework for annotating features of the Related Work section of academic writing that supports writer feedback.
On the Development and Evaluation of a Brazilian Portuguese Discourse Parser
We present in this paper the development process and the evaluation procedure of a Brazilian Portuguese discourse parser called DiZer. Based on Rhetorical Structure Theory, DiZer is a symbolic cue phrase-based analyzer that makes use of discourse templates learned from a corpus of scientific texts to identify and build the discourse structure of texts. The evaluation of DiZer shows satisfactory results for scientific and news texts, even though it was not designed for the latter, which demonstrates DiZer's portability.
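As a rough illustration of what "cue phrase-based" analysis means, the following minimal sketch maps surface cue phrases to candidate discourse relations. It is not DiZer's implementation: the cue table, the relation labels, and the function name `find_relations` are all hypothetical, the cues are English rather than Brazilian Portuguese, and the discourse templates mentioned in the abstract are not modelled.

```python
import re

# Hypothetical cue phrase table: each cue suggests one candidate relation.
# The labels are illustrative, in the style of Rhetorical Structure Theory.
CUE_PHRASES = {
    "because": "CAUSE",
    "however": "CONTRAST",
    "for example": "ELABORATION",
    "in order to": "PURPOSE",
}


def find_relations(sentence):
    """Return (cue, relation) pairs in the order the cues occur in the sentence."""
    lowered = sentence.lower()
    hits = []
    for cue, relation in CUE_PHRASES.items():
        # Word boundaries prevent matches inside longer words.
        pattern = r"\b" + re.escape(cue) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((match.start(), cue, relation))
    # Sort by position so the output follows the text order.
    return [(cue, relation) for _, cue, relation in sorted(hits)]


example = "We stopped early because the data ran out; however, the results held."
relations = find_relations(example)
```

A real analyzer would also have to segment the text into discourse units and decide which units each relation connects; this sketch only shows the cue lookup step.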
Mining arguments in scientific abstracts: Application to argumentative quality assessment
Argument mining consists in the automatic identification of argumentative structures in natural language, a task that has been recognized as particularly challenging in the scientific domain. In this work we propose SciARG, a new annotation scheme, and apply it to the identification of argumentative units and relations in abstracts in two scientific disciplines: computational linguistics and biomedicine, which allows us to assess the applicability of our scheme to different knowledge fields. We use our annotated corpus to train and evaluate argument mining models in various experimental settings, including single and multi-task learning. We investigate the possibility of leveraging existing annotations, including discourse relations and rhetorical roles of sentences, to improve the performance of argument mining models. In particular, we explore the potential offered by a sequential transfer-learning approach in which supplementary training tasks are used to fine-tune pre-trained parameter-rich language models. Finally, we analyze the practical usability of the automatically extracted components and relations for the prediction of argumentative quality dimensions of scientific abstracts. Agencia Nacional de Investigación e Innovación; Ministerio de Economía, Industria y Competitividad (España)
Learning Analytics for Academic Writing through Automatic Identification of Meta-discourse
Effective written communication is an essential skill which promotes educational success for undergraduates. Argumentation is a key requirement of successful writing, which is the most common genre that undergraduates have to write, particularly in the social sciences. Therefore, when assessing student writing, academic tutors look for students’ ability to present and pursue well-reasoned and strong arguments through scholarly argumentation, which is articulated by meta-discourse.
Today, there are some natural language processing systems which automatically detect authors’ rhetorical moves in scholarly texts. Hence, when assessing their students’ essays, educators could benefit from the available automated textual analysis which can detect meta-discourse. However, previous work has not shown whether these technologies can be used to analyse student writing reliably. The aim of this thesis therefore has been to understand how automated analysis of meta-discourse in student writing can be used to support tutors’ essay assessment practices. This thesis evaluates a particular language analysis tool, the Xerox Incremental Parser (XIP) as an exemplar of this type of automated technology.
The studies presented in this thesis investigate how tutors define the quality of undergraduate writing and suggest key elements that make for good quality student writing in the social sciences, where XIP seems to work best. This thesis also sets out the changes that need to be made to XIP and proposes ways in which its output can be delivered to tutors so that they can use it to give feedback on student essays.
The findings reported also show problems that academic tutors experience in essay assessment, which potentially could be solved by automated support. However, tutors have preconceptions about the use of automated support.
The study revealed that, to overcome these preconceptions, tutors want to be assured that they retain the ‘power’ themselves in any decision to use automated support.
LitCrit: exploring intentions as a basis for automated feedback on Related Work.
Learning the skill of academic writing is critical for post-graduate (PG) students to be successful, yet many struggle to master the required standard. Feedback can play a formative role in developing these skills, but many students do not find the kinds of feedback available to them sufficiently helpful. As the Related Work section is known to be particularly difficult for PG students to master, it is the focus of this thesis.
To date, models of academic writing have been built on observational studies of academic articles. In contrast, we carry out a user study to explore what content experts look for in Related Work and how this differs from PG students. We claim that by understanding what experts look for in Related Work and what aspects PG students struggle with, a useful author intention model can be developed to support writing feedback for Related Work sections. Our work demonstrates reliable annotation of the model intentions. Extending existing algorithms designed to identify rhetorical intentions in academic writing, we build a supervised machine learning classifier, showing how features focused on Related Work sections improve recognition of content aspects. Carrying out a study to rate the quality of Related Work, we demonstrate that the model is a good proxy for predicting quality, validating the choice of intentions in our model. In addition to recognising author intentions, we automate the generation of feedback based on observations of intentions that are present and missing, taking into account areas that PG students struggle to recognise.
The thesis also contributes a new prototype writing analytic tool, called LitCrit, that supports visualising the intention narrative of Related Work and presents feedback. We claim this visualisation approach changes the PG student’s perception of Related Work, and demonstrate through a user study that it does draw attention to aspects previously missed, bringing PG student responses in line with experts. Finally, we explore the performance of our classifier, originally set within the Computational Linguistics discipline, when applied to Computer Graphics. This shows us that, while performance may be lower, there is scope for improvement when care is taken to understand those features which are discipline dependent. Also, while a discipline may have the same intentions present in a section, their structural presentation may differ, impacting feature choice.
Implementación de un software de apoyo a la escritura de resúmenes de textos científicos en español (Implementation of software to support writing abstracts of scientific texts in Spanish)
It has long been remarked that university students have serious problems with written expression. In various sources of information, such as scientific research articles, theses, and other academic and professional media, a range of writing errors can be seen. This situation is considered unacceptable in people with a high level of formal education, especially since all of them have already spent around eleven years in schooling, during which they passed various subjects related to the teaching of their mother tongue.
As a measure to address this problem, the aim is to promote the teaching of how to organize ideas. Several techniques help to organize ideas and prepare information before writing an essay, monograph, or scientific article. One of the most basic techniques is writing an abstract.
Writing the abstract of a scientific text is known to be a basic and fundamental technique for organizing ideas and preparing information in order to write more complex scientific texts correctly. For this reason, this final-year degree project presents the implementation of software to support the writing of abstracts of scientific texts in Spanish, which will help writers produce abstracts of their scientific texts with an appropriate structure.
To carry this out, a corpus of 44 abstracts of scientific texts in Spanish was first compiled, which serves for training and testing the AZEsp classifier model. In compiling the corpus, the optimal structure of a text was taken to be the presence of 6 categories: Context (Contexto), Gap (Brecha), Purpose (Propósito), Methodology (Metodología), Result (Resultado), and Conclusion (Conclusión).
Next, a set of 7 features (attributes) was determined, to be used to identify each of the categories. A series of algorithms was then implemented to extract the values of these attributes from each sentence of the abstracts so that the model could use them. Once obtained, these values were used to implement the AZEsp classifier model and to evaluate its performance using metrics such as Precision, Recall, and F-Measure.
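The evaluation metrics named above can be sketched as follows. This is a minimal illustration, not the thesis's evaluation code: the function name `per_category_scores` is hypothetical and the gold and predicted labels are made-up examples, not data from the AZEsp corpus. Only the six category names come from the abstract.

```python
from collections import Counter

# The six sentence categories named in the abstract (original Spanish labels).
CATEGORIES = ["Contexto", "Brecha", "Propósito", "Metodología", "Resultado", "Conclusión"]


def per_category_scores(gold, predicted):
    """Return {category: (precision, recall, f_measure)} for aligned label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for category g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true label but was missed
    scores = {}
    for c in CATEGORIES:
        predicted_total = tp[c] + fp[c]
        gold_total = tp[c] + fn[c]
        precision = tp[c] / predicted_total if predicted_total else 0.0
        recall = tp[c] / gold_total if gold_total else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f_measure)
    return scores


# Made-up sentence labels for illustration only.
gold = ["Contexto", "Brecha", "Propósito", "Metodología",
        "Resultado", "Conclusión", "Resultado", "Contexto"]
predicted = ["Contexto", "Propósito", "Propósito", "Metodología",
             "Resultado", "Resultado", "Resultado", "Contexto"]
scores = per_category_scores(gold, predicted)
```

Scoring each category separately shows which parts of the abstract structure a classifier confuses; a single overall figure would hide, for instance, that a category is never predicted at all.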
Finally, the SciEsp support environment was implemented; it uses the AZEsp classifier model to automatically classify the sentences of Spanish-language scientific abstracts entered by the user, following a predefined structure.
A series of experiments was carried out to evaluate the performance of the AZEsp classifier model. Various results were obtained; the most notable was that the model achieved a performance of 65.4%. This shows that the proposed software tool (SciEsp) is fit for use. In conclusion, university students will be able to use this tool to write their abstracts; they will be able to identify their errors and weaknesses in writing and to improve through self-directed learning.