427 research outputs found

    Investigating the effect of auxiliary objectives for the automated grading of learner english speech transcriptions

    Get PDF
    We address the task of automatically grading the language proficiency of spontaneous speech based on textual features from automatic speech recognition transcripts. Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language identification (L1). We encode the transcriptions with both bi-directional recurrent neural networks and with bi-directional representations from transformers, compare against a feature-rich baseline, and analyse performance at different proficiency levels and with transcriptions of varying error rates. Our best performance comes from a transformer encoder with L1 prediction as an auxiliary task. We discuss areas for improvement and potential applications for text-only speech scoring. (Cambridge Assessment)
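The multi-task training described above amounts to a shared encoder feeding one grading head plus several auxiliary heads, with the optimiser minimising a weighted sum of the per-task losses. A minimal sketch of that loss combination, with hypothetical task names and weights (not the paper's actual configuration):

```python
def multitask_loss(main_loss, aux_losses, aux_weights):
    """Weighted sum of the main grading loss and auxiliary-task losses.

    main_loss   -- scalar loss for the proficiency-grading head
    aux_losses  -- dict of scalar losses per auxiliary task
    aux_weights -- dict of weights controlling each auxiliary task's influence
    """
    return main_loss + sum(aux_weights[name] * loss
                           for name, loss in aux_losses.items())

# Hypothetical example: grading loss plus down-weighted auxiliary losses
# for L1 identification, language modeling, and morpho-syntactic labeling.
total = multitask_loss(
    main_loss=0.8,
    aux_losses={"l1_id": 0.5, "lm": 2.0, "morphosyntax": 0.4},
    aux_weights={"l1_id": 0.1, "lm": 0.05, "morphosyntax": 0.1},
)
```

In practice the weights trade off how strongly inductive transfer from each auxiliary task shapes the shared encoder; the paper's finding that L1 prediction helps most corresponds to that task's gradient being the most useful of the three.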

    Exploring learner perceptions of and interaction behaviors using the Research Writing Tutor for research article Introduction section draft analysis

    Get PDF
    The swiftly escalating popularity of automated writing evaluation (AWE) software in recent years has compelled much study into its potential for effective pedagogical use (Chen & Cheng, 2008; Cotos, 2011; Warschauer & Ware, 2006). Research on the effectiveness of AWE tools has concentrated primarily on determining learners' achieved output (Warschauer & Ware, 2006) and emphasized the attainment of linguistic goals (Escudier et al., 2011); however, in-process investigations of users' interactions with and perceptions of AWE tools remain sparse (Shute, 2008; Ware, 2011). This dissertation employed a mixed-methods approach to investigate how 11 graduate student language learners interacted with and perceived the Research Writing Tutor (RWT), a web-based AWE tool which provides discourse-oriented, discipline-specific feedback on users' section drafts of empirical research papers. A variety of data was collected and analyzed to capture a multidimensional depiction of learners' first-time interactions with the RWT; data comprised learners' pre-task demographic survey responses, screen recordings of students' interactions with the RWT, individual users' interactional reports archived in the RWT database, instructor and researcher observations of students' in-class RWT interactions, stimulated recall transcripts, and post-task survey responses. Descriptive statistics of the Likert-scale response data were calculated, and open-ended survey responses and stimulated recall transcripts were analyzed using open coding discourse analysis techniques or Systemic Functional Linguistic (SFL) appreciation resource analysis (Martin & Rose, 2003), prior to triangulating data for certain research questions. Results showed that participants found the RWT to be useful and held positive attitudes about the tool's future helpfulness, provided that issues in feedback accuracy were addressed.
However, the participants also reported wavering trust in the RWT and its automated feedback, seemingly originating from their observations of RWT feedback inaccuracies. Systematized observations of learners' actual and reported RWT interaction behaviors showed both unique and patterned behaviors and strategies for using the RWT for draft revision. The participants cited learner variables, such as technological background and comfort level using computers, personality, status as a non-native speaker of English, discipline of study, and preferences for certain forms of feedback, as impacting their experience with the RWT. Findings from this research may help enlighten potential pedagogical uses of AWE programs in the university writing classroom as well as help inform the design of AWE tasks and tools to facilitate individualized learning experiences for enhanced writing development.

    Oral language accuracy, corrective feedback and learner uptake in an online EFL course

    Get PDF
    The use of computer-mediated communication (CMC) technologies has broadened the scope of possibilities for language teaching and learning, while also leading teachers and researchers alike to pose a number of relevant questions. What is the best way to blend such technologies into teaching? What impact will CMC technologies have on learners' target language development? What role does teacher feedback play in exclusively online language learning settings? To answer these questions, a qualitative case study was carried out to identify the most common errors made by language learners, the correction strategies employed by teachers and, finally, learners' reactions to these corrections in synchronous interactions. The study's main findings, based on a mixed-methods and computer-mediated discourse analysis approach, are as follows: most learners make mistakes at a similar rate, the number of mistakes drops towards the end of the course, the teacher tends to provide explicit corrective feedback, and students' main strategy for amending their mistakes is to repeat the teacher's correction.

    Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme

    Get PDF
    Computational Linguistics; Germanic Languages; Artificial Intelligence (incl. Robotics); Computing Methodologies

    Modeling statistics ITAs’ speaking performances in a certification test

    Get PDF
    In light of the ever-increasing capability of computer technology and advancement in speech and natural language processing techniques, automated speech scoring of constructed responses is gaining popularity in many high-stakes assessment and low-stakes educational settings. Automated scoring is a highly interdisciplinary and complex subject, and much remains unknown about the strengths and weaknesses of automated speech scoring systems (Evanini & Zechner, 2020). Research in automated speech scoring has been centralized around a few proprietary systems owned by large testing companies. Consequently, existing systems only serve large-scale standardized assessment purposes. Application of automated scoring technologies in local assessment contexts is much desired but rarely realized because the systems' inner workings have remained unfamiliar to many language assessment professionals. Moreover, assumptions about the reliability of human scores, on which automated scoring systems are trained, are untenable in many local assessment situations, where a myriad of factors work together to co-determine the human scores. These factors may include the rating design, the test takers' abilities, and the raters' specific rating behaviors (e.g., severity/leniency, internal consistency, and application of the rating scale). In an attempt to apply automated scoring procedures to a local context, the primary purpose of this study is to develop and evaluate an appropriate automated speech scoring model for a local certification test of international teaching assistants (ITAs). To meet this goal, this study first implemented feature extraction and selection based on existing automated speech scoring technologies and the scoring rubric of the local speaking test.
Then, the reliability of the human ratings was investigated based on both Classical Test Theory (CTT) and Item Response Theory (IRT) frameworks, focusing on detecting potential rater effects that could negatively impact the quality of the human scores. Finally, by fitting and comparing a series of statistical modeling options, this study investigated the extent to which the association between the automatically extracted features and the human scores could be statistically modeled to offer a mechanism that reflects the multifaceted nature of the performance assessment in a unified statistical framework. The extensive search for speech and linguistic features, covering the sub-domains of fluency, pronunciation, rhythm, vocabulary, grammar, content, and discourse cohesion, revealed that a small set of useful variables could be identified. A large number of features could be effectively summarized as single latent factors that showed reasonably high associations with the human scores. Reliability analysis of human scoring indicated that both inter-rater reliability and intra-rater reliability were acceptable, and through a fine-grained IRT analysis, several raters who were prone to central tendency or randomness effects were identified. Model fit indices, model performance in prediction, and model diagnostics results indicated that the most appropriate approach to model the relationship between the features and the final human scores was a cumulative link model (CLM), whereas the most appropriate approach to model the relationship between the features and the ratings from the multiple raters was a cumulative link mixed model (CLMM). These models suggested that higher ability levels were significantly related to the lapse of time, faster speech with fewer disfluencies, more varied and sophisticated vocabulary, more complex syntactic structures, and fewer rater effects.
    Based on the model's prediction on unseen data, the rating-level CLMM achieved an accuracy of 0.64, a Pearson correlation of 0.58, and a quadratically-weighted kappa of 0.57, as compared to the human ratings on the 3-point scale. Results from this study could be used to inform the development, design, and implementation of a prototypical automated scoring system for prospective ITAs, as well as to provide empirical evidence for future scale development, rater training, and support for assessment-related instruction for the testing program and diagnostic feedback for the ITA test takers.
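A cumulative link (proportional-odds) model of the kind selected here maps a linear predictor of the extracted features onto ordered score categories through increasing cutpoints: P(Y <= j) = logistic(theta_j - x.beta), with category probabilities obtained as differences of adjacent cumulative probabilities. A minimal sketch of that computation in pure Python, using illustrative cutpoints rather than the study's fitted values:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def clm_category_probs(eta, thresholds):
    """Category probabilities under a cumulative logit (proportional-odds) model.

    eta        -- linear predictor x.beta for one response
    thresholds -- increasing cutpoints theta_1 < ... < theta_{K-1}
    Returns a list of K probabilities, one per ordinal score category.
    """
    # Cumulative probabilities P(Y <= j), with P(Y <= K) = 1 appended.
    cum = [logistic(t - eta) for t in thresholds] + [1.0]
    # Per-category probabilities are differences of adjacent cumulatives.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]
```

A higher linear predictor (stronger feature profile) shifts probability mass toward the top category, which is how the fitted CLM/CLMM translates fluency, vocabulary, and syntactic-complexity features into predicted scores on the 3-point scale; the mixed model additionally adds rater-specific random effects to eta.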

    Novel Datasets, User Interfaces and Learner Models to Improve Learner Engagement Prediction on Educational Videos

    Get PDF
    With the emergence of Open Educational Resources (OERs), educational content creation has rapidly scaled up, making a large collection of new materials available. Among these, we find educational videos, the most popular modality for transferring knowledge in the technology-enhanced learning paradigm. The rapid creation of learning resources opens up opportunities to facilitate sustainable education, as the potential increases to personalise and recommend specific materials that align with individual users' interests, goals, knowledge level, language and stylistic preferences. However, the quality and topical coverage of these materials can vary significantly, posing challenges in managing this large collection, including the risk of negative user experience and engagement with these materials. The scarcity of support resources such as public datasets is another challenge that slows down the development of tools in this research area. This thesis develops a set of novel tools that improve the recommendation of educational videos. Two novel datasets and an e-learning platform with a novel user interface are developed to support the offline and online testing of recommendation models for educational videos. Furthermore, a set of learner models that account for learner interests, knowledge, and the novelty and popularity of content is developed through this thesis. These models are then integrated into a single learner model that accounts for all of these factors simultaneously. The user studies conducted on the novel user interface show that the new interface encourages users to explore the topical content more rigorously before making relevance judgements about educational videos. Offline experiments on the newly constructed datasets show that the newly proposed learner models significantly outperform their relevant baselines.
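The integrated learner model described above combines several per-video signals into one engagement prediction. A minimal sketch of such a combination as a convex weighted sum, with hypothetical signal names and hand-set weights (the thesis's actual integrated model is learned from interaction data, not hand-weighted):

```python
def engagement_score(signals, weights):
    """Combine normalised learner-model signals (each in [0, 1]) into a
    single predicted-engagement score via a convex weighted sum.

    signals -- dict of per-video signals, e.g. interest match, knowledge
               fit, content novelty, and content popularity
    weights -- dict of non-negative weights over the same keys
    """
    total_w = sum(weights.values())
    return sum(weights[k] * signals[k] for k in signals) / total_w

# Hypothetical learner-video pair: strong interest match, moderate
# knowledge fit and novelty, low popularity.
score = engagement_score(
    signals={"interest": 0.9, "knowledge_fit": 0.6, "novelty": 0.7, "popularity": 0.4},
    weights={"interest": 0.4, "knowledge_fit": 0.3, "novelty": 0.2, "popularity": 0.1},
)
```

Normalising by the total weight keeps the combined score in [0, 1], so predictions remain comparable across videos even if the relative weighting of the factors is changed.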