4 research outputs found

    Rhythmic differences between Welsh English and the British Standard (Rytmické rozdíly mezi velšskou angličtinou a britským standardem)

    The present thesis deals with rhythmic differences between Welsh English and the British Standard, focusing in particular on the varieties spoken in Cardiff and Aberystwyth. The first part of the theoretical chapter summarises approaches to rhythm from the physiological, acoustic, perceptual, and phonological perspectives. The second part gives a brief description of the British Standard, Welsh, and Welsh English, and reviews existing findings on the subject, especially as regards Welsh varieties of English. The third and final part surveys the most common approaches to finding acoustic correlates of rhythm (%V, ∆C, ∆V, PVI, varco, RR, YARD). The remaining chapters present a material-based study for which recordings were obtained. Segmentation followed the principles proposed by Machač and Skarnitzl (2009). Rhythm was measured for four respondents from each selected location in Wales; the age span was 35-39 years for the Cardiff group and 29-39 for the Aberystwyth group. The measured values were compared with the research of Volín and Pollák (2009), which, among other things, provided %V and ∆C values for the British Standard on the basis of the same text that was used for data collection in this thesis. The Welsh English of the eight respondents was found to have similar values for ∆C and %V...
    Department of the English Language and ELT Methodology, Faculty of Arts

    Modeling statistics ITAs’ speaking performances in a certification test

    In light of the ever-increasing capability of computer technology and advances in speech and natural language processing techniques, automated speech scoring of constructed responses is gaining popularity in many high-stakes assessment and low-stakes educational settings. Automated scoring is a highly interdisciplinary and complex subject, and much remains unknown about the strengths and weaknesses of automated speech scoring systems (Evanini & Zechner, 2020). Research in automated speech scoring has centered on a few proprietary systems owned by large testing companies. Consequently, existing systems only serve large-scale standardized assessment purposes. Application of automated scoring technologies in local assessment contexts is much desired but rarely realized because the systems’ inner workings have remained unfamiliar to many language assessment professionals. Moreover, assumptions about the reliability of human scores, on which automated scoring systems are trained, are untenable in many local assessment situations, where a myriad of factors jointly determine the human scores. These factors may include the rating design, the test takers’ abilities, and the raters’ specific rating behaviors (e.g., severity/leniency, internal consistency, and application of the rating scale). In an attempt to apply automated scoring procedures to a local context, the primary purpose of this study is to develop and evaluate an appropriate automated speech scoring model for a local certification test of international teaching assistants (ITAs). To meet this goal, this study first implemented feature extraction and selection based on existing automated speech scoring technologies and the scoring rubric of the local speaking test.
Then, the reliability of the human ratings was investigated based on both Classical Test Theory (CTT) and Item Response Theory (IRT) frameworks, focusing on detecting potential rater effects that could negatively impact the quality of the human scores. Finally, by experimenting and comparing a series of statistical modeling options, this study investigated the extent to which the association between the automatically extracted features and the human scores could be statistically modeled to offer a mechanism that reflects the multifaceted nature of the performance assessment in a unified statistical framework. The extensive search for the speech or linguistic features, covering the sub-domains of fluency, pronunciation, rhythm, vocabulary, grammar, content, and discourse cohesion, revealed that a small set of useful variables could be identified. A large number of features could be effectively summarized as single latent factors that showed reasonably high associations with the human scores. Reliability analysis of human scoring indicated that both inter-rater reliability and intra-rater reliability were acceptable, and through a fine-grained IRT analysis, several raters who were prone to the central tendency or randomness effects were identified. Model fit indices, model performance in prediction, and model diagnostics results in the statistical modeling indicated that the most appropriate approach to model the relationship between the features and the final human scores was a cumulative link model (CLM). In contrast, the most appropriate approach to model the relationship between the features and the ratings from the multiple raters was a cumulative link mixed model (CLMM). These models suggested that higher ability levels were significantly related to the lapse of time, faster speech with fewer disfluencies, more varied and sophisticated vocabulary, more complex syntactic structures, and fewer rater effects. 
Based on the model’s prediction on unseen data, the rating-level CLMM achieved an accuracy of 0.64, a Pearson correlation of 0.58, and a quadratically weighted kappa of 0.57 against the human ratings on the 3-point scale. Results from this study could inform the development, design, and implementation of a prototypical automated scoring system for prospective ITAs, and provide empirical evidence for future scale development and rater training, as well as support for assessment-related instruction in the testing program and diagnostic feedback for ITA test takers.