11 research outputs found

    Feedback on teachers' text assessment: Does it foster assessment accuracy and motivation?

    Teachers' assessment of students' performance on complex tasks, such as writing, is important both for their teaching and for students' learning. Teachers must be able and motivated to assess texts correctly. According to theoretical assumptions, feedback can help promote the diagnostic competencies required to assess texts correctly, but, up until now, no empirical studies have examined the effects of accuracy feedback on teachers' assessments. We conducted an experimental study comparing the effects of two feedback interventions with a practice-only control group on teachers' assessment accuracy and motivation. Student teachers (n = 181) and experienced teachers (n = 114) assessed 10 students' texts in all groups. The feedback in both feedback groups showed the teachers a comparison between their own assessments and the correct assessments (expert ratings). We varied the feedback presentation between a single presentation after five texts and repeated presentations after each of the first five texts. We then measured situational interest in assessing further texts, as an indicator of motivation, and assessment accuracy on the next five texts. The results showed that feedback promoted situational interest but not assessment accuracy. We discuss why teachers found feedback interesting and under what circumstances training interventions could be useful. Peer reviewed.

    Is a Long Essay Always a Good Essay? The Effect of Text Length on Writing Assessment

    The assessment of text quality is a transdisciplinary issue concerning the research areas of educational assessment, language technology, and classroom instruction. Text length has been found to strongly influence human judgment of text quality. The question of whether text length is a construct-relevant aspect of writing competence or a source of judgment bias has been discussed controversially. This paper used both a correlational and an experimental approach to investigate this question. Secondary analyses were performed on a large-scale dataset with highly trained raters, showing an effect of text length beyond language proficiency. Furthermore, an experimental study found that pre-service teachers tended to undervalue text length when compared to professional ratings. The findings are discussed with respect to the role of training and context in writing assessment

    Don't Just Judge the Spelling! The Influence of Spelling on Assessing Second-Language Student Essays

    When judging subject-specific aspects of students' texts, teachers should assess various characteristics, e.g., spelling and content, independently of one another since these characteristics are indicators of different skills. Independent judgments enable teachers to adapt their classroom instruction according to students' skills. It is still unclear how well teachers meet this challenge and which intervention could be helpful to them. In Study 1, N = 51 pre-service teachers assessed four authentic English as a Second Language (ESL) essays with different overall text qualities and different qualities of spelling using holistic and analytic rating scales. Results showed a negative influence of the experimentally manipulated spelling errors on the judgment of almost all textual characteristics. In Study 2, an experimental prompt was used to reduce this judgment error. Participants who were made aware of the judgment error caused by spelling errors formed their judgments in a less biased way, indicating a reduction of bias. The determinants of the observed effects and their practical implications are discussed.

    Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays

    The potential application of generative artificial intelligence (AI) in schools and universities poses great challenges, especially for the assessment of students' texts. Previous research has shown that people generally have difficulty distinguishing AI-generated from human-written texts; however, the ability of teachers to identify an AI-generated text among student essays has not yet been investigated. Here we show in two experimental studies that novice (N = 89) and experienced teachers (N = 200) could not identify texts generated by ChatGPT among student-written texts. However, there are some indications that more experienced teachers made more differentiated and more accurate judgments. Furthermore, both groups were overconfident in their judgments. Effects of real and assumed source on quality assessment were heterogeneous. Our findings demonstrate that with relatively little prompting, current AI can generate texts that are not detectable for teachers, which poses a challenge to schools and universities in grading student essays. Our study provides empirical evidence for the current debate regarding exam strategies in schools and universities in light of the latest technological developments.

    DARIUS: A Comprehensive Learner Corpus for Argument Mining in German-Language Essays

    In this paper, we present the DARIUS (Digital Argumentation Instruction for Science) corpus for argumentation quality on 4589 essays written by 1839 German secondary school students. The corpus is annotated according to a fine-grained annotation scheme, ranging from a broader perspective like content zones, to more granular features like argumentation coverage/reach and argumentative discourse units like claims and warrants. The features have inter-annotator agreements up to 0.83 Krippendorff's α. The corpus and dataset are publicly available for further research in argument mining.

    EUDAQ - a data acquisition software framework for common beam telescopes

    EUDAQ is a generic data acquisition software developed for use in conjunction with common beam telescopes at charged particle beam lines. Providing high-precision reference tracks for performance studies of new sensors, beam telescopes are essential for the research and development towards future detectors for high-energy physics. As beam time is a highly limited resource, EUDAQ has been designed with reliability and ease-of-use in mind. It enables flexible integration of different independent devices under test via their specific data acquisition systems into a top-level framework. EUDAQ controls all components globally, handles the data flow centrally and synchronises and records the data streams. Over the past decade, EUDAQ has been deployed as part of a wide range of successful test beam campaigns and detector development applications

    The impact of frailty on ICU and 30-day mortality and the level of care in very elderly patients (≥ 80 years)

    Purpose: Very old critically ill patients are a rapidly expanding group in the ICU. Indications for admission, triage criteria and level of care are frequently discussed for such patients. However, most relevant outcome studies in this group frequently find an increased mortality and a reduced quality of life in survivors. The main objective was to study the impact of frailty compared with other variables with regard to short-term outcome in the very old ICU population. Methods: A transnational prospective cohort study from October 2016 to May 2017 with 30 days follow-up was set up by the European Society of Intensive Care Medicine. In total 311 ICUs from 21 European countries participated. The ICUs included the first consecutive 20 very old (≥ 80 years) patients admitted to the ICU within a 3-month inclusion period. Frailty, SOFA score and therapeutic procedures were registered, in addition to limitations of care. For measurement of frailty the Clinical Frailty Scale was used at ICU admission. The main outcomes were ICU and 30-day mortality and survival at 30 days. Results: A total of 5021 patients with a median age of 84 years (IQR 81–86 years) were included in the final analysis, 2404 (47.9%) were women. Admission was classified as acute in 4215 (83.9%) of the patients. Overall ICU and 30-day mortality rates were 22.1% and 32.6%. During ICU stay 23.8% of the patients did not receive specific ICU procedures: ventilation, vasoactive drugs or renal replacement therapy. Frailty (values ≥ 5) was found in 43.1% and was independently related to 30-day survival (HR 1.54; 95% CI 1.38–1.73) for frail versus non-frail. Conclusions: Among very old patients (≥ 80 years) admitted to the ICU, the consecutive classes in Clinical Frailty Scale were inversely associated with short-term survival. The scale had a very low number of missing data. These findings provide support to add frailty to the clinical assessment in this patient group.
Trial registration: ClinicalTrials.gov (ID: NCT03134807)

    Sepsis at ICU admission does not decrease 30-day survival in very old patients: a post-hoc analysis of the VIP1 multinational cohort study

    Background: The number of intensive care patients aged ≥ 80 years (Very old Intensive Care Patients; VIPs) is growing. VIPs have high mortality and morbidity and the benefits of ICU admission are frequently questioned. Sepsis incidence has risen in recent years and identification of outcomes is of considerable public importance. We aimed to determine whether VIPs admitted for sepsis had different outcomes than those admitted for other acute reasons and identify potential prognostic factors for 30-day survival. Results: This prospective study included VIPs with Sequential Organ Failure Assessment (SOFA) scores ≥ 2 acutely admitted to 307 ICUs in 21 European countries. Of 3869 acutely admitted VIPs, 493 (12.7%) [53.8% male, median age 83 (81-86) years] were admitted for sepsis. Sepsis was defined according to clinical criteria; suspected or demonstrated focus of infection and SOFA score ≥ 2 points. Compared to VIPs admitted for other acute reasons, VIPs admitted for sepsis were younger, had a higher SOFA score (9 vs. 7, p < 0.0001), required more vasoactive drugs [82.2% vs. 55.1%, p < 0.0001] and renal replacement therapies [17.4% vs. 9.9%; p < 0.0001], and had more life-sustaining treatment limitations [37.3% vs. 32.1%; p = 0.02]. Frailty was similar in both groups. Unadjusted 30-day survival was not significantly different between the two groups. After adjustment for age, gender, frailty, and SOFA score, sepsis had no impact on 30-day survival [HR 0.99 (95% CI 0.86-1.15), p = 0.917]. Inverse-probability weight (IPW)-adjusted survival curves for the first 30 days after ICU admission were similar for acute septic and non-septic patients [HR: 1.00 (95% CI 0.87-1.17), p = 0.95]. A matched-pair analysis in which patients with sepsis were matched with two control patients of the same gender with the same age, SOFA score, and level of frailty was also performed.
A Cox proportional hazard regression model stratified on the matched pairs showed that 30-day survival was similar in both groups [57.2% (95% CI 52.7-60.7) vs. 57.1% (95% CI 53.7-60.1), p = 0.85]. Conclusions: After adjusting for organ dysfunction, sepsis at admission was not independently associated with decreased 30-day survival in this multinational study of 3869 VIPs. Age, frailty, and SOFA score were independently associated with survival