
    Normative Data of Dutch Idiomatic Expressions: Subjective Judgments You Can Bank on

    The processing of idiomatic expressions is a topical issue in empirical research. Various factors have been found to influence idiom processing, such as idiom familiarity and idiom transparency. Information on these variables is usually obtained through norming studies. Studies investigating the effect of various properties on idiom processing have led to ambiguous results. This may be due to the variability of operationalizations of the idiom properties across norming studies, which in turn may affect the reliability of the subjective judgements. However, not all studies that collected normative data on idiomatic expressions investigated their reliability, and studies that did address the reliability of subjective ratings used various measures and produced mixed results. In this study, we investigated the reliability of subjective judgements, the relation between subjective and objective idiom frequency, and the impact of these dimensions on the participants’ idiom knowledge by collecting normative data of five subjective idiom properties (Frequency of Exposure, Meaning Familiarity, Frequency of Usage, Transparency, and Imageability) from 390 native speakers and objective corpus frequency for 374 Dutch idiomatic expressions. For reliability, we compared measures calculated in previous studies with the D-coefficient, a metric taken from Generalizability Theory. High reliability was found for all subjective dimensions. One reliability metric, Krippendorff’s alpha, generally produced lower values, while similar values were obtained for three other measures (Cronbach’s alpha, Intraclass Correlation Coefficient, and the D-coefficient). Advantages of the D-coefficient are that it can be applied to unbalanced research designs and can be used to estimate the minimum number of raters required to obtain reliable ratings. Slightly higher coefficients were observed for so-called experience-based dimensions (Frequency of Exposure, Meaning Familiarity, and Frequency of Usage) than for content-based dimensions (Transparency and Imageability). In addition, fewer raters were required to obtain reliable ratings for the experience-based dimensions. Subjective and objective frequency appeared to be poorly correlated, while all subjective idiom properties and objective frequency turned out to affect idiom knowledge. Meaning Familiarity, Subjective and Objective Frequency of Exposure, Frequency of Usage, and Transparency positively contributed to idiom knowledge, while a negative effect was found for Imageability. We discuss these relationships in more detail, and give methodological recommendations with respect to the procedures and the measure used to calculate reliability.

    Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics

    Automatic assessment of reading fluency using automatic speech recognition (ASR) holds great potential for early detection of reading difficulties and subsequent timely intervention. Precise assessment tools are required, especially for languages other than English. In this study, we evaluate six state-of-the-art ASR-based systems for automatically assessing Dutch oral reading accuracy using Kaldi and Whisper. Results show that our most successful system reached substantial agreement with human evaluations (MCC = .63). The same system reached the highest correlation between forced decoding confidence scores and word correctness (r = .45). This system's language model (LM) consisted of manual orthographic transcriptions and reading prompts of the test data, which shows that including reading errors in the LM improves assessment performance. We discuss the implications for developing automatic assessment systems and identify possible avenues of future research.

    Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications

    Voicebots have provided a new avenue for supporting the development of language skills, particularly within the context of second language learning. Voicebots, though, have largely been geared towards native adult speakers. We sought to assess the performance of two state-of-the-art ASR systems, Wav2Vec2.0 and Whisper AI, with a view to developing a voicebot that can support children acquiring a foreign language. We evaluated their performance on read and extemporaneous speech of native and non-native Dutch children. We also investigated the utility of using ASR technology to provide insight into the children's pronunciation and fluency. The results show that recent, pre-trained ASR transformer-based models achieve acceptable performance from which detailed feedback on phoneme pronunciation quality can be extracted, despite the challenging nature of child and non-native speech. Comment: Published at SLATE 2023, Esmad, Politecnico Do Porto, Portugal, 26-28 June, 2023, pp: 11:1-11:

    Two Automatic Approaches for Analyzing Connected Speech Processes in Dutch

    This paper describes two automatic approaches used to study connected speech processes (CSPs) in Dutch. The first approach, the top-down method, starts from a linguistic point of view. This method can be used for verification of hypotheses about CSPs. The second approach, the bottom-up method, uses a constrained phone recognizer to generate phone transcriptions. An alignment was carried out between the two transcriptions and a reference transcription. A comparison between the two methods showed that 68% agreement was achieved on the CSPs. Although phone accuracy is only 63%, the bottom-up approach is useful for studying CSPs. From the data generated using the bottom-up method, indications of which CSPs are present in the material can be found. These indications can be used to generate hypotheses which can then be tested using the top-down method.

    Comparison between expert listeners and continuous speech recognizers in selecting pronunciation variants.

    In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a continuous speech recognizer (CSR) which can be used to select pronunciation variants (i.e. detect insertions and deletions of phones). The performance of the CSR was compared to a reference transcription based on the judgments of expert listeners. We investigated to what extent the degree of agreement between the listeners and the CSR was affected by employing various sets of phone models (PMs). Overall, the PMs perform more similarly to the listeners when pronunciation variation is modeled. However, the various sets of PMs lead to different results for insertion and deletion processes. Furthermore, we found that, to a certain degree, word error rates can be used to predict which set of PMs to use in the transcription tool.

    Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses

    Alzheimer's Disease (AD) is the world's leading neurodegenerative disease, which often results in communication difficulties. Analysing speech can serve as a diagnostic tool for identifying the condition. The recent ADReSS challenge provided a dataset for AD classification and highlighted the utility of manual transcriptions. In this study, we used the new state-of-the-art Automatic Speech Recognition (ASR) model Whisper to obtain the transcriptions, which also include automatic punctuation. The classification models, which combine pretrained FastText word embeddings with recurrent neural networks, achieved test accuracy scores of 0.854 and 0.833 on manual and ASR transcripts, respectively. Additionally, we explored the influence of including pause information and punctuation in the transcriptions. We found that punctuation only yielded minor improvements in some cases, whereas pause encoding aided AD classification for both manual and ASR transcriptions across all approaches investigated.

    Directions for the future of technology in pronunciation research and teaching

    This paper reports on the role of technology in state-of-the-art pronunciation research and instruction, and makes concrete suggestions for future developments. The point of departure for this contribution is that the goal of second language (L2) pronunciation research and teaching should be enhanced comprehensibility and intelligibility as opposed to native-likeness. Three main areas are covered here. We begin with a presentation of advanced uses of pronunciation technology in research with a special focus on the expertise required to carry out even small-scale investigations. Next, we discuss the nature of data in pronunciation research, pointing to ways in which future work can build on advances in corpus research and crowdsourcing. Finally, we consider how these insights pave the way for researchers and developers working to create research-informed, computer-assisted pronunciation teaching resources. We conclude with predictions for future developments.