22 research outputs found

    Do face masks introduce bias in speech technologies? The case of automated scoring of speaking proficiency

    Full text link
    The COVID-19 pandemic has led to a dramatic increase in the use of face masks worldwide. Face coverings can affect both the acoustic properties of the signal and speech patterns, and can have unintended effects if the person wearing the mask attempts to use speech processing technologies. In this paper we explore the impact of wearing face masks on the automated assessment of English language proficiency. We use a dataset from a large-scale speaking test for which test-takers were required to wear face masks during the test administration, and we compare it to a matched control sample of test-takers who took the same test before the mask requirements were put in place. We find that the two samples differ across a range of acoustic measures and also show a small but significant difference in speech patterns. However, these differences do not lead to differences in human or automated scores of English language proficiency. Several measures of bias showed no differences in scores between the two groups.

    Patterns of durational variation in British dialects

    No full text
    We present the results of a quantitative study of durational variability in several British dialects. The study is based on the IViE corpus (Grabe 2001), which contains recordings of speakers from seven central urban dialects of English from the UK and Ireland. The corpus used in this study included speakers from Belfast, Cambridge, Newcastle, Leeds and Dublin, monolingual London residents of Jamaican descent and bilingual Bradford residents of Punjabi descent. We used read and retold versions of the "Cinderella" story by 6 female and 6 male speakers of each dialect (7 female and 5 male for London). The read version of Cinderella was split into 5 parts. The total number of sound files used for analysis was 465. The corpus was automatically segmented into vocalic and consonantal intervals. We then computed 15 published measures of durational variability (for example, %V or VnPVI). To compare the variation between the dialects with the variation within the dialects, we used classifier algorithms based on all possible combinations of 1, 2 or 3 measures (more than 2000 in total). We found that there is substantial overlap between the dialects, leading to a relatively low classification rate: based on various combinations of the rhythm measures, the classifiers could correctly identify the dialect of not more than 30% of the data (chance=15%). We see an increase in the performance of classifiers based on a greater number of rhythm measures. This shows that our previous finding that rhythm is a multidimensional phenomenon (Loukina et al. 2009) is true not only for different languages, but also for different dialects of the same language. Our classifiers showed higher identification rates for Belfast English. These results agree with the results previously reported by Ferragne (2004, 2008), who conducted a similar study on a different corpus of British dialects: they found that rhythm measures computed on Ulster English were different from other accents in their corpus.
Contrary to their result, we find little overlap between Belfast and Dublin English. We also found better classification rates for the two 'ethnic minority' accents: Punjabi and Jamaican. The patterns we found based on measures of durational variability are partially similar to what has been reported for intonation contours. Based on the same corpus, Coleman, Grabe and Kochanski (submitted) reported significant differences between Belfast and other dialects. At the same time, duration-based measures do not show the distinction between 'Northern' and 'Southern' dialects observed by Coleman et al., but separate the two dialects spoken by 'ethnic minorities'. This suggests complex patterns of interaction between different aspects of prosodic structure.
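
The two measures named in this abstract, %V and VnPVI, can be illustrated with a minimal Python sketch. It assumes the standard published definitions: %V as the percentage of total duration taken up by vocalic intervals, and the normalised pairwise variability index as the mean normalised difference between successive interval durations, scaled by 100. The function names and the interval durations are hypothetical and are not taken from the IViE corpus or from the authors' code.

```python
from statistics import mean

def percent_v(vocalic, consonantal):
    """%V: percentage of total duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total

def npvi(durations):
    """Normalised pairwise variability index over successive intervals."""
    pairs = zip(durations, durations[1:])
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)

# Hypothetical interval durations in seconds (illustration only).
vocalic = [0.10, 0.20, 0.10, 0.20]
consonantal = [0.10, 0.10, 0.10, 0.10]

print(percent_v(vocalic, consonantal))  # %V for the toy utterance
print(npvi(vocalic))                    # VnPVI: nPVI applied to vocalic intervals
```

Applying `npvi` to the consonantal intervals instead gives the consonantal counterpart of the same index; a sequence of identical durations yields 0.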

    Do rhythm measures separate languages or speakers?

    No full text
    Ever since Pike and Abercrombie suggested that all languages can be divided into stress-timed and syllable-timed, the so-called 'rhythmic differences' between languages have attracted substantial attention from phoneticians. Although experimental studies have so far found no evidence for isochrony as such, various quantitative statistical indices have been proposed to capture the rhythmic properties of languages. In this paper we compare 15 measures of durational variability based on an automatic segmentation of speech into vowel-like and consonant-like regions. Our corpus consisted of a total of 3059 short texts recorded from 62 speakers of Southern British English, Standard Greek, Standard Russian, Standard French and Taiwanese Mandarin. We used an automated algorithm to segment the data into vowel-like and consonant-like segments. This allowed us to apply identical segmentation criteria to all languages and to compute rhythm measures over a large corpus. To compare intra-group variation in rhythm measures (RMs) to inter-group variation, we applied classifier techniques. We measured how often we could correctly predict the language based on one or more RMs. The performance of the classifiers depended on the number of dimensions. While there was a significant difference between the performance of classifiers based on single measures and that of classifiers based on three measures, there was little improvement in the performance of classifiers based on more than three rhythm measures. This suggests that rhythm is at least a three-dimensional phenomenon and is best captured by a combination of more than two measures. The most efficient classifier based on all 45 rhythm measures correctly identified the language of 61% of the data (chance=30%). This shows that although there are rhythmic differences between languages, substantial variation within languages makes it impossible to reliably separate languages based on the rhythm of a single paragraph.
At the same time, we found that classifiers performed surprisingly well in identifying speakers of the same language. For example, for English, classifiers based on three measures correctly identified the speaker of 48% of the data (chance=8%). Thus the differences between speakers of the same language appear to be more consistent than the differences between different languages. This finding raises interesting questions about the nature of individual variability in duration. It also shows that any future study requires a representative sample of speakers to avoid the danger of measuring differences between people rather than languages.
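
The idea of comparing classifiers built on one, two or three measures can be sketched as follows. The abstract does not say which classifier was used, so this sketch substitutes a simple nearest-centroid classifier evaluated by leave-one-out; the function names, the toy feature values and the labels "A" and "B" are all hypothetical.

```python
from itertools import combinations
from math import dist

def centroid(rows):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def loo_accuracy(samples, labels, columns):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen columns."""
    hits = 0
    for i, (row, label) in enumerate(zip(samples, labels)):
        # Build per-class centroids from every sample except the held-out one.
        groups = {}
        for j, (other, other_label) in enumerate(zip(samples, labels)):
            if j != i:
                groups.setdefault(other_label, []).append([other[c] for c in columns])
        point = [row[c] for c in columns]
        guess = min(groups, key=lambda g: dist(point, centroid(groups[g])))
        hits += guess == label
    return hits / len(samples)

# Hypothetical toy data: three made-up measures per recording, two made-up labels.
samples = [[40.0, 55.0, 1.0], [42.0, 57.0, 9.0], [60.0, 70.0, 2.0], [62.0, 72.0, 8.0]]
labels = ["A", "A", "B", "B"]

# Score every combination of 1, 2 or 3 measures (identified by column index).
results = {
    cols: loo_accuracy(samples, labels, cols)
    for k in (1, 2, 3)
    for cols in combinations(range(3), k)
}
best = max(results, key=results.get)
print(len(results), best, results[best])
```

The sketch does not standardise the measures, so a measure with a larger numeric range dominates the distance; a real comparison would probably rescale the features first.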

    AixOx

    No full text
    AixOx is a corpus of read speech: 40 passages of about one minute each (EUROM 1 corpus) in French and in English. The French passages are read by native French speakers and English-speaking learners, and the English passages are read by native English speakers and French-speaking learners. The corpus is hosted in ORTOLANG's Speech Language Data Repository (SLDR/ORTOLANG).

    Predicting Greek Cypriot children's reading and spelling from morphological and dialect awareness

    No full text
    The aim of this longitudinal study was to examine the contribution of phonological, morphological and dialect awareness to the prediction of reading and spelling in a Greek bi-dialectal setting. The target group (N=404) consisted of children, aged 6 to 9 years at the start of the project, who learn literacy in Cyprus, where a dialect is spoken in certain contexts but where Standard Modern Greek is also widely used. At present there are few studies with Greek Cypriot children on how phonological, morphological and dialect awareness relate to reading and spelling. Because there are no standardised measures of phonological, morphological and dialect awareness for Greek Cypriot children, measures of these factors were developed during the pilot study and their internal consistency was assessed. With the larger sample the measures were validated by examining their construct validity. The first wave of data collection showed that morphological and dialect awareness make a unique contribution to the prediction of reading and spelling in Greek. The second wave of data collection showed that the measures of morphological and dialect awareness predicted performance in reading and spelling eight months later, even after partialling out grade level, estimated verbal intelligence and initial scores in reading and spelling. A model with dialect awareness as a mediator between phonological and morphological awareness and reading and spelling fitted the data better than a model with phonological or morphological awareness as mediators. Hence, phonological awareness and morphological awareness help children to become aware of the differences between their dialect and the standard variety, and dialect awareness in turn facilitates reading and spelling. Cross-lagged correlations showed that the more experience children have with reading and spelling, the more likely they are to develop morphological and dialect awareness.
This study makes theoretical, empirical and practical educational contributions. The established mediational model contributes to theoretical knowledge of the connection between dialect awareness, phonological and morphological awareness, and reading and spelling, while the longitudinal study contributes to theory on the long-term relation of morphological and dialect awareness with reading and spelling in Greek. Empirically, the study established the plausibility of a causal link between morphological and dialect awareness and reading and spelling, which must be tested in further research using intervention methods. In practice, this study contributes valid measures for assessing morphological and dialect awareness in the Greek Cypriot setting.
    EThOS - Electronic Theses Online Service, United Kingdom

    AixOx, a multi-layered learners corpus: automatic annotation

    No full text
    No abstract