973 research outputs found

    Considerations when using an Automatic Grading System within Computer Science Modules

    Get PDF
    This paper investigates the effectiveness of automatic grading systems, with a focus on their use within Computer Science. Automatic grading systems have risen in popularity in recent years, with publications usually tied to a specific system. This paper discusses the factors that need to be considered when using automatic grading, regardless of which system is used, and makes recommendations for each factor. The discussion is based on the authors' experience of using an automatic grading system in a CS1 environment. The research identified many elements that should be considered when using these systems, including how the code will be tested, the need for plagiarism checks, and how marks are awarded. The findings suggest a lack of defined standards for using these systems. This analysis provides valuable insight into how these systems should be used and what standards should be built on.

    Thompson, A.; Mooney, A.; Noone, M.; Hegarty-Kelly, E. (2021). Considerations when using an Automatic Grading System within Computer Science Modules. In: 7th International Conference on Higher Education Advances (HEAd'21). Editorial Universitat Politècnica de València, 589-597. https://doi.org/10.4995/HEAd21.2021.13045
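
    As a minimal sketch of the unit-test-based grading the paper discusses — the exercise, test cases, and mark weights below are hypothetical, not taken from the paper or any particular grading system:

        # Minimal unit-test-style autograder sketch (illustrative only).

        def grade(submitted_fn, test_cases, marks_per_test=1.0):
            """Run a submitted function against test cases and award partial marks."""
            earned = 0.0
            for args, expected in test_cases:
                try:
                    if submitted_fn(*args) == expected:
                        earned += marks_per_test
                except Exception:
                    pass  # a crashing submission earns no marks for this test
            return earned, marks_per_test * len(test_cases)

        # Example: grading a hypothetical CS1 exercise add(a, b).
        def student_add(a, b):
            return a + b

        tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
        earned, total = grade(student_add, tests)
        print(f"Awarded {earned}/{total} marks")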

    Practical and Ethical Challenges of Large Language Models in Education: A Systematic Scoping Review

    Full text link
    Educational technology innovations leveraging large language models (LLMs) have shown the potential to automate the laborious process of generating and analysing textual content. While various innovations have been developed to automate a range of educational tasks (e.g., question generation, feedback provision, and essay grading), there are concerns regarding the practicality and ethicality of these innovations. Such concerns may hinder future research and the adoption of LLM-based innovations in authentic educational contexts. To address this, we conducted a systematic scoping review of 118 peer-reviewed papers published since 2017 to pinpoint the current state of research on using LLMs to automate and support educational tasks. The findings revealed 53 use cases for LLMs in automating education tasks, categorised into nine main categories: profiling/labelling, detection, grading, teaching support, prediction, knowledge representation, feedback, content generation, and recommendation. We also identified several practical and ethical challenges, including low technological readiness, lack of replicability and transparency, and insufficient privacy and beneficence considerations. The findings were summarised into three recommendations for future studies: updating existing innovations with state-of-the-art models (e.g., GPT-3/4), embracing the initiative of open-sourcing models/systems, and adopting a human-centred approach throughout the development process. As the intersection of AI and education continues to evolve, the findings of this study can serve as an essential reference point for researchers, allowing them to leverage the strengths, learn from the limitations, and uncover potential research opportunities enabled by ChatGPT and other generative AI models.
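
    As one concrete illustration of the "feedback" use-case category, the sketch below shows how an LLM might be prompted to comment on a student answer; call_llm is a stand-in for whichever model API is used (e.g., GPT-3/4) and is not a real library function:

        # Illustrative sketch of LLM-based feedback provision (one of the
        # nine use-case categories). call_llm is a placeholder, not an API.

        def call_llm(prompt: str) -> str:
            raise NotImplementedError("wire up to an actual model API, e.g. GPT-3/4")

        def generate_feedback(question: str, student_answer: str) -> str:
            # A narrow, structured prompt for short, constructive feedback.
            prompt = (
                "You are a teaching assistant. Give short, constructive feedback.\n"
                f"Question: {question}\n"
                f"Student answer: {student_answer}\n"
                "Feedback:"
            )
            return call_llm(prompt)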

    Making the Most of Repetitive Mistakes: An Investigation into Heuristics for Selecting and Applying Feedback to Programming Coursework

    Get PDF
    In the acquisition of software-development skills, feedback that pinpoints errors and explains means of improvement is important in achieving a good student learning experience. However, it is not feasible to manually provide timely, consistent, and helpful feedback for large or complex coursework tasks, and/or to large cohorts of students. While tools exist to provide feedback on student submissions, their automation is typically limited to reporting test pass or failure, or to generating feedback for very simple programming tasks. Anecdotal experience indicates that clusters of students tend to make similar mistakes and/or successes within their coursework. Do feedback comments applied to students' work support this claim and, if so, to what extent is this the case? How might this be exploited to improve the assessment process and the quality of feedback given to students? To help answer these questions, we have examined feedback given to coursework submissions for a UK Level 5 (university-level) data structures and algorithms course to determine heuristics used to trigger particular feedback comments that are common between submissions and cohorts. This paper reports our results and discusses how the identified heuristics may be used to promote timeliness and consistency of feedback without jeopardising quality.
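
    A sketch of the kind of heuristic-triggered feedback reuse the paper investigates; the trigger predicates and comment bank here are invented for illustration, not the heuristics identified in the study:

        # Hypothetical heuristic-triggered feedback bank: each heuristic is a
        # predicate over a submission's source text, paired with a reusable comment.

        HEURISTICS = [
            (lambda src: "while True" in src and "break" not in src,
             "Possible infinite loop: this while-loop has no break or exit condition."),
            (lambda src: ".equals(" not in src and "==" in src,
             "Comparing objects with == checks identity; consider an equality method."),
        ]

        def select_feedback(source_code: str) -> list[str]:
            """Return every banked comment whose trigger matches this submission."""
            return [comment for trigger, comment in HEURISTICS if trigger(source_code)]

        print(select_feedback("while True:\n    x = compute()"))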

    Reimagining Communication Studies in a Digital Age

    Get PDF
    The COVID-19 pandemic challenged society in many ways. In schools and universities, classrooms and campuses were vacated as teaching moved online. In this massive, forced leap of digitalization, the flexibility of digital solutions was harnessed on a large scale, allowing formal education to continue despite the lockdowns. However, while technology made remote teaching possible, the overall organization of courses remained stuck in a bureaucratic system not built on the flexible affordances of the digital age. This master's thesis presents an alternative way of organizing university studies through the presentation and testing of the Communication Studies Tracker (CST). The CST is designed to allow students to complete their obligatory communication studies at university without having to attend any specific language or communication courses. Instead, the system tracks and processes communicative tasks done by students until a sufficient amount of successful experience is gathered in the required languages and focus areas. To gauge the viability of this concept, a usability test of the CST was organized in which five students and five teachers tested and discussed the system and its underlying concept. The results show that participants felt the CST would provide increased flexibility and meaningfulness in communication studies. Further, participants felt the concept was viable and suitable for implementation at the University of Turku. The main challenges discovered related to support for weaker students and coordination of group work. The teacher testers also expressed concern about how a course-free system would be implemented by the university administration, especially concerning resource allocation.
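    A minimal sketch of the tracking logic the thesis describes: completed tasks accumulate credit per language and focus area until the student's requirements are met. The thresholds, field names, and class design below are invented for illustration, not the CST's actual implementation:

        # Minimal sketch of the Communication Studies Tracker idea: successful
        # tasks accumulate per (language, focus area) until thresholds are met.
        # All requirements here are hypothetical.

        from collections import defaultdict

        REQUIRED = {("English", "academic writing"): 3, ("Finnish", "presentation"): 2}

        class Tracker:
            def __init__(self):
                self.completed = defaultdict(int)

            def record_task(self, language: str, focus: str, passed: bool) -> None:
                if passed:  # only successful experiences count
                    self.completed[(language, focus)] += 1

            def requirements_met(self) -> bool:
                return all(self.completed[k] >= n for k, n in REQUIRED.items())

        t = Tracker()
        t.record_task("English", "academic writing", passed=True)
        print(t.requirements_met())  # False until all thresholds are reached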

    Exploring the Role of AI Assistants in Computer Science Education: Methods, Implications, and Instructor Perspectives

    Full text link
    The use of AI assistants, along with the challenges they present, has sparked significant debate within the computer science education community. While these tools demonstrate the potential to support students' learning and instructors' teaching, they also raise concerns about enabling unethical use by students. Previous research has suggested various strategies aimed at addressing these issues; however, these strategies concentrate on introductory programming courses and focus on one specific type of problem. The present research evaluated the performance of ChatGPT, a state-of-the-art AI assistant, at solving 187 problems spanning three distinct types, collected from six undergraduate computer science courses. The selected courses covered different topics and targeted different program levels. We then explored methods to modify these problems to adapt them to ChatGPT's capabilities in order to reduce potential misuse by students. Finally, we conducted semi-structured interviews with 11 computer science instructors to gather their opinions on our problem modification methods, understand their perspectives on the impact of AI assistants on computer science education, and learn their strategies for adapting their courses to leverage these AI capabilities for educational improvement. The results revealed issues ranging from academic fairness to long-term impact on students' mental models. From our results, we derived design implications and recommended tools to help instructors design and create future course material that can more effectively adapt to AI assistants' capabilities.

    Harnessing customization in Web Annotation: A Software Product Line approach

    Get PDF
    222 p.

    Web annotation helps mediate the reading and writing interaction by conveying information, adding comments, and inspiring conversations in web documents. It is used in areas such as the Social Sciences and Humanities, investigative journalism, the Biological Sciences, and Education, to mention a few. Annotation activities are heterogeneous: end users (students, journalists, data curators, researchers, etc.) have very different requirements for creating, modifying, and reusing annotations. This has resulted in a large number of web annotation tools and different ways of representing and storing web annotations. To facilitate reuse and interoperability, several attempts have been made over recent decades to standardise web annotations (e.g., Annotea or Open Annotation), culminating in the W3C annotation recommendations published in 2017. The W3C recommendations provide a framework for annotation representation (data model and vocabulary) and transport (protocol). However, there is still a gap in how annotation clients (tools and user interfaces) are developed, which leads developers to re-implement common functionality (i.e., highlighting, commenting, storing, etc.) to create their own customised annotation tool. This thesis aims to provide a reuse platform for the development of web annotation tools for reviewing. To this end, we have developed a software product line called WACline. WACline is a family of annotation products that lets developers create customised web annotation browser extensions, facilitating the reuse of core assets and their adaptation to a specific review context. It was created following a knowledge-accumulation process in which each annotation product learns from previously created annotation products. The result is a family of annotation clients that supports three review practices: data extraction for systematic literature reviews (Highlight&Go), review of student assignments in higher education (Mark&Go), and peer review for conferences and journals (Review&Go). For each review context, an evaluation with real stakeholders was carried out to validate the efficiency and effectiveness improvements brought by the customised annotation tools in practice.
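
    A toy illustration of the product-line idea: an annotation client assembled from a chosen subset of reusable features. The feature names and composition mechanism are generic placeholders, not WACline's actual core assets:

        # Toy software-product-line composition: derive an annotation client
        # from a selected subset of reusable features (names illustrative only).

        CORE_FEATURES = {
            "highlight": "render text highlights",
            "comment": "attach comments to highlights",
            "store": "persist annotations via the W3C annotation protocol",
            "export": "export annotations for data extraction",
        }

        def derive_product(name: str, selected: set[str]) -> dict:
            unknown = selected - CORE_FEATURES.keys()
            if unknown:
                raise ValueError(f"unknown features: {unknown}")
            return {"product": name, "features": sorted(selected)}

        # e.g. a review-oriented client reusing three of the core assets
        print(derive_product("Review&Go-like client", {"highlight", "comment", "store"}))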

    ALT-C 2010 - Conference Proceedings

    Get PDF

    ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing

    Full text link
    Given the rapid ascent of large language models (LLMs), we study the question: (how) can large language models help in reviewing scientific papers or proposals? We first conduct pilot studies in which we find that (i) GPT-4 outperforms other LLMs (Bard, Vicuna, Koala, Alpaca, LLaMa, Dolly, OpenAssistant, StableLM), and (ii) prompting with a specific question (e.g., to identify errors) outperforms prompting to simply write a review. With these insights, we study the use of LLMs (specifically, GPT-4) for three tasks: 1. Identifying errors: We construct 13 short computer science papers, each with a deliberately inserted error, and ask the LLM to check the correctness of these papers. We observe that the LLM finds errors in 7 of them, spanning both mathematical and conceptual errors. 2. Verifying checklists: We task the LLM with verifying 16 closed-ended checklist questions in the respective sections of 15 NeurIPS 2022 papers. We find that across 119 {checklist question, paper} pairs, the LLM had an 86.6% accuracy. 3. Choosing the "better" paper: We generate 10 pairs of abstracts, deliberately designing each pair so that one abstract is clearly superior to the other. The LLM, however, struggled to discern these relatively straightforward distinctions accurately, committing errors in its evaluations for 6 out of the 10 pairs. Based on these experiments, we think that LLMs have a promising use as reviewing assistants for specific reviewing tasks, but not (yet) for complete evaluations of papers or proposals.
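
    A sketch of the paper's observation that task-specific prompts (error finding, checklist verification) work better than asking for a full review; call_llm is a placeholder for an actual model API such as GPT-4, and the prompt wording is illustrative, not the study's exact prompts:

        # Sketch of task-specific review prompting (the strategy the study
        # found more effective than "write a review"). call_llm is a stub.

        def call_llm(prompt: str) -> str:
            raise NotImplementedError("connect to an actual LLM, e.g. GPT-4")

        def check_for_errors(paper_text: str) -> str:
            # Narrow, targeted question rather than a request for a full review.
            return call_llm(
                "Read the following paper and identify any mathematical or "
                f"conceptual errors, quoting the offending passage.\n\n{paper_text}"
            )

        def verify_checklist_item(section_text: str, question: str) -> str:
            return call_llm(
                "Answer yes/no with a one-sentence justification.\n"
                f"Checklist question: {question}\nSection: {section_text}"
            )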