34 research outputs found

    Using natural language processing to support peer‐feedback in the age of artificial intelligence: A cross‐disciplinary framework and a research agenda

    Advancements in artificial intelligence are rapidly increasing. The new-generation large language models, such as ChatGPT and GPT-4, bear the potential to transform educational approaches, such as peer-feedback. To investigate peer-feedback at the intersection of natural language processing (NLP) and educational research, this paper suggests a cross-disciplinary framework that aims to facilitate the development of NLP-based adaptive measures for supporting peer-feedback processes in digital learning environments. To conceptualize this process, we introduce a peer-feedback process model, which describes learners' activities and textual products. Further, we introduce a terminological and procedural scheme that facilitates systematically deriving measures to foster the peer-feedback process and how NLP may enhance the adaptivity of such learning support. Building on prior research on education and NLP, we apply this scheme to all learner activities of the peer-feedback process model to exemplify a range of NLP-based adaptive support measures. We also discuss the current challenges and suggest directions for future cross-disciplinary research on the effectiveness and other dimensions of NLP-based adaptive support for peer-feedback. Building on our suggested framework, future research and collaborations at the intersection of education and NLP can innovate peer-feedback in digital learning environments

    Multilingual Grammatical Error Detection And Its Applications to Prompt-Based Correction

    Grammatical Error Correction (GEC) and Grammatical Error Detection (GED) are two important tasks in the study of writing assistant technologies. Given an input sentence, the former aims to output a corrected version of the sentence, while the latter's goal is to indicate in which words of the sentence errors occur. Both tasks are relevant for real-world applications that help native speakers and language learners to write better. Naturally, these two areas have attracted the attention of the research community and have been studied in the context of modern neural networks. This work focuses on the study of multilingual GED models and how they can be used to improve GEC performed by large language models (LLMs). We study the difference in performance between GED models trained in a single language and models that undergo multilingual training. We expand the list of datasets used for multilingual GED to further experiment with cross-dataset and cross-lingual generalization of detection models. Our results go against previous findings and indicate that multilingual GED models are as good as monolingual ones when evaluated in the in-domain languages. Furthermore, multilingual models show better generalization to novel languages seen only at test time. Making use of the GED models we study, we propose two methods to improve corrections of prompt-based GEC using LLMs. The first method aims to mitigate overcorrection by using a detection model to determine if a sentence has any mistakes before feeding it to the LLM. The second method uses the sequence of GED tags to select the in-context examples provided in the prompt. We perform experiments in English, Czech, German and Russian, using Llama2 and GPT3.5. The results show that both methods increase the performance of prompt-based GEC and point to a promising direction of using GED models as part of the correction pipeline performed by LLMs

    Quantifying the impact of Twitter activity in political battlegrounds

    It may be challenging to determine the reach of the information, how well it corresponds with the domain design, and how to utilize it as a communication medium when utilizing social media platforms, notably Twitter, to engage the public in advocating a parliament act, or during a global health emergency. Chapter 3 offers a broad overview of how candidates running in the 2020 US Elections used Twitter as a communication tool to interact with voters. More precisely, it seeks to identify components related to internal collaboration and public participation (in terms of content and stance similarity among the candidates from the same political front and to the official Twitter accounts of their political parties). The 2020 US Presidential and Vice Presidential candidates from the two main political parties, the Republicans and Democrats, are our main subjects. Along with the content similarity, their tweets were assessed for social reach and stance similarity on 22 topics. This study complements previous research on efficiently using social media platforms for election campaigns. Chapter 4 empirically examines the online social associations of the top-10 COVID-19 resilient nations’ leaders and healthcare institutions based on the Bloomberg COVID-19 Resilience Ranking. In order to measure the strength of the online social association in terms of public engagement, sentiment strength, inclusivity and diversity, we used the attributes provided by Twitter Academic Research API, coupled with the tweets of leaders and healthcare organizations from these nations. Understanding how leaders and healthcare organizations may utilize Twitter to establish digital connections with the public during health emergencies is made more accessible by this study. The thesis has proposed methods for efficiently using Twitter in various domains, utilizing the implementations of various Language Models and several data mining and analytics techniques

    Human evaluation and statistical analyses on machine reading comprehension, question generation and open-domain dialogue

    Evaluation is a critical element in the development process of many natural language based systems. In this thesis, we present critical analyses of standard evaluation methodologies applied in the following Natural Language Processing (NLP) domains: machine reading comprehension (MRC), question generation (QG), and open-domain dialogue. Generally speaking, systems for tasks like MRC are usually evaluated by comparing the similarity between hand-crafted references and system-generated outputs using automatic evaluation metrics, and these metrics are mainly borrowed from other well-developed NLP tasks, such as machine translation and text summarization. Meanwhile, the evaluation of QG and dialogue is a known open problem, as such tasks do not have corresponding references for computing similarity, and human evaluation is indispensable when assessing the performance of systems for these tasks. However, human evaluation is unfortunately not always valid because: i) human evaluation may cost too much and be hard to deploy when experts are involved; ii) human assessors can lack reliability in the crowd-sourcing environment. To overcome the challenges from both automatic metrics and human evaluation, we first design specific crowdsourcing human evaluation methods for each of these three target tasks. We then show that these human evaluation methods are reproducible, highly reliable, easy to deploy, and cost-effective. Additionally, with the data collected from our experiments, we measure the accuracy of existing automatic metrics and analyse the potential limitations and disadvantages of the direct application of these metrics. Furthermore, in view of the specific features of different tasks, we provide detailed statistical analyses of the collected data to discover their underlying trends, and further give suggestions about directions for improving systems on different aspects

    Incremental Disfluency Detection for Spoken Learner English

    Dialogue-based computer-assisted language learning (CALL) concerns the application and analysis of automated systems that engage with a language learner through dialogue. Rooted in an interactionist perspective of second language acquisition, dialogue-based CALL systems assume the role of a speaking partner, providing learners the opportunity for spontaneous production of their second language. One area of interest for such systems is the implementation of corrective feedback. However, the feedback strategies employed by such systems remain fairly limited. In particular, there are currently no provisions for learners to initiate the correction of their own errors, despite this being the most frequently occurring and most preferred type of error correction in learner speech. To address this gap, this thesis proposes a framework for implementing such functionality, identifying incremental self-initiated self-repair (i.e. disfluency) detection as a key area for research. Taking an interdisciplinary approach to the exploration of this topic, this thesis outlines the steps taken to optimise an incremental disfluency detection model for use with spoken learner English. To begin, a linguistic comparative analysis of native and learner disfluency corpora explored the differences between the disfluency behaviour of native and learner speech, highlighting key features of learner speech not previously explored in disfluency detection model analysis. Following this, in order to identify a suitable baseline model for further experimentation, two state-of-the-art incremental self-repair detection models were trained and tested with a learner speech corpus. An error analysis of the models' outputs found an LSTM model using word embeddings and part-of-speech tags to be the most suitable for learner speech, thanks to its lower number of false positives triggered by learner errors in the corpus. Following this, several adaptations to the model were tested to improve performance. Namely, the inclusion of character embeddings, silence and laughter features, separating edit term detection from disfluency detection, lemmatization and the inclusion of learners' prior proficiency scores led to an improvement of over eight percent over the baseline. Findings from this thesis illustrate how the analysis of language characteristics specific to learner speech can positively inform model adaptation and provide a starting point for further investigation into the implementation of effective corrective feedback strategies in dialogue-based CALL systems

    Proceedings of the Fifth Italian Conference on Computational Linguistics CLiC-it 2018 : 10-12 December 2018, Torino

    On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after five years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges

    Information Access Using Neural Networks For Diverse Domains And Sources

    The ever-increasing volume of web-based documents poses a challenge in efficiently accessing specialized knowledge from domain-specific sources, requiring a profound understanding of the domain and substantial comprehension effort. Although natural language technologies, such as information retrieval and machine reading comprehension systems, offer rapid and accurate information retrieval, their performance in specific domains is hindered by training on general domain datasets. Creating domain-specific training datasets, while effective, is time-consuming, expensive, and heavily reliant on domain experts. This thesis presents a comprehensive exploration of efficient technologies to address the challenge of information access in specific domains, focusing on retrieval-based systems encompassing question answering and ranking. We begin with a comprehensive introduction to the information access system. We demonstrate the structure of an information access system through a typical open-domain question-answering task. We outline its two major components, the retrieval and reader models, and the design choices for each part. We focus mainly on three points: 1) the design choice for connecting the two components; 2) the trade-off associated with the retrieval model and the best frontier in practice; 3) a data augmentation method to adapt the reader model, trained initially on closed-domain datasets, to effectively answer questions in the retrieval-based setting. Subsequently, we discuss various methods enabling system adaptation to specific domains. Transfer learning techniques are presented, including generation as data augmentation, further pre-training, and progressive domain-clustered training. We also present a novel zero-shot re-ranking method inspired by the compression-based distance. We summarize the conclusions and findings gathered from the experiments. Moreover, the exploration extends to retrieval-based systems beyond textual corpora. We explore the search system for an e-commerce database, wherein natural language queries are combined with user preference data to facilitate the retrieval of relevant products. To address the challenges of the retrieval-based e-commerce ranking system, including noisy labels and cold start problems, we enhance model training through cascaded training and adversarial sample weighting. Another scenario we investigate is the search system in the math domain, characterized by the unique role of formulas and distinct features compared to textual searches. We tackle the math-related search problem by combining neural ranking models with structurally optimized algorithms. Finally, we summarize the research findings and future research directions