8 research outputs found

    Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics

    Automatic assessment of reading fluency using automatic speech recognition (ASR) holds great potential for early detection of reading difficulties and subsequent timely intervention. Precise assessment tools are required, especially for languages other than English. In this study, we evaluate six state-of-the-art ASR-based systems for automatically assessing Dutch oral reading accuracy using Kaldi and Whisper. Results show that our most successful system reached substantial agreement with human evaluations (MCC = .63). The same system reached the highest correlation between forced decoding confidence scores and word correctness (r = .45). This system's language model (LM) consisted of manual orthographic transcriptions and reading prompts of the test data, which shows that including reading errors in the LM improves assessment performance. We discuss the implications for developing automatic assessment systems and identify possible avenues of future research.

    A TEI-based Approach to Standardising Spoken Language Transcription

    This paper formulates a proposal for standardising spoken language transcription, as practised in conversation analysis, sociolinguistics, dialectology and related fields, with the help of the TEI guidelines. Two areas relevant to standardisation are identified and discussed: first, the macro structure of transcriptions, as embodied in the data models and file formats of transcription tools such as ELAN, Praat or EXMARaLDA; second, the micro structure of transcriptions as embodied in transcription conventions such as CA, HIAT or GAT. A two-step process is described in which first the macro structure is represented in a generic TEI format based on elements defined in the P5 version of the Guidelines. In the second step, character data in this representation is parsed according to the regularities of a transcription convention, resulting in a more fine-grained TEI markup which is also based on P5. It is argued that this two-step process can, on the one hand, map idiosyncratic differences in tool formats and transcription conventions onto a unified representation. On the other hand, differences motivated by different theoretical decisions can be retained in a manner which still allows a common processing of data from different sources. In order to make the standard usable in practice, a conversion tool, TEI Drop, is presented which uses XSL transformations to carry out the conversion between different tool formats (CHAT, ELAN, EXMARaLDA, FOLKER and Transcriber) and the TEI representation of transcription macro structure (and vice versa) and which also provides methods for parsing the micro structure of transcriptions according to two different transcription conventions (HIAT and cGAT). Using this tool, transcribers can continue to work with software they are familiar with while still producing TEI-conformant transcription files. The paper concludes with a discussion of the work needed in order to establish the proposed standard. It is argued that both tool formats and the TEI guidelines are in a sufficiently mature state to serve as a basis for standardisation. Most work consequently remains in analysing and standardising differences between different transcription conventions.

    Listening with great expectations: A study of predictive natural speech processing


    Corpora e interpretazione simultanea

    This volume presents a proposal for applying the corpus-based approach to interpreting studies, with particular reference to the simultaneous mode. To this end, it discusses the main theoretical, practical and methodological issues involved in creating electronic corpora for the study and teaching of interpreting. It also describes two research projects within Corpus-based Interpreting Studies (CIS) that led to the creation of two electronic interpreting corpora: EPIC (European Parliament Interpreting Corpus) and DIRSI-C (Directionality in Simultaneous Interpreting Corpus). These language resources, the first trilingual (Italian, English and Spanish) and the second bilingual (Italian and English), together with the multimedia archives on which they are based, offer many possibilities for research and teaching activities, confirming the great potential of this innovative research paradigm, which arises from the interdisciplinary encounter between corpus linguistics and interpreting studies.

    A phonetic variationist study on Chilean speakers of English as a foreign language

    Variationist research in the Labovian paradigm has traditionally looked at the structured heterogeneity found in first language (L1) speech. More recently this quantitative methodology has been applied to speakers acquiring a second language (L2), usually in immigrant settings. This research has shown that alongside well-documented L2 acquisition processes, sociolinguistic patterns are also found, just as in native speech. This dissertation examines the speech of native speakers of Spanish acquiring English in Chile, extending traditional quantitative methodology to L2 contexts, specifically to English as a foreign language (EFL) situations. I examine the variation of four phonetic variables: voiceless alveolar fricative (ʃ), voiceless alveolar affricate (ʧ), and postvocalic (r), which range from stigmatised to prestigious in both Spanish and English; and voiced dental fricative (ð), which has been extensively documented in English, mainly constrained by linguistic factors. Through the analysis of the speech of eighteen university students, I seek to test, firstly, whether the patterns of variation characteristic of Chilean Spanish are transferred to English and secondly, whether the variation exhibited by native speakers of English is replicated in EFL contexts. The results suggest that: (1) the expected transfer of patterns from Chilean Spanish to English does not occur for the variables (ʃ) and (ʧ), and (2) the patterns found in non-native speech in EFL contexts replicate the patterns found in native speakers of English for the variables voiced dental fricative (ð) and postvocalic (r). Amongst the social factors considered, social class is shown to contribute to the variation of postvocalic (r) and (ʃ), as do years of instruction in English to the variation of (ʃ); in relation to the contribution of internal factors, it is found that phonetic environment and position have an effect on the varying use of (ʃ) and (ð). As predicted for (ð), the effect of purely linguistic factors is confirmed. Thus this study demonstrates that the notion of structured heterogeneity can be extended to contexts of EFL, especially in relation to the effect of internal constraints.

    Orthographic transcription of the spoken Dutch corpus
