2 research outputs found

    Tonal identification in whispered speech

    This project aims to examine whether, and how, non-F0 cues facilitate the identification of lexical tones. A perception experiment is designed to explicitly test the impact of duration cues for Mandarin lexical tones when F0 is absent. We take a novel approach in which the secondary cue of interest is held constant, effectively controlling the type of information listeners receive. Future studies can potentially extend this methodology to examine other relevant cues, such as temporal envelope and intensity. The contribution of this paper is twofold: first, to propose an explanation for the inconsistent conclusions drawn in the literature on tonal identification in whispered speech; second, to devise a better-controlled study that sheds light on the nature of tonal perception.

    Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

    Automatic Speech Recognition (ASR) in medical contexts has the potential to save time, cut costs, increase report accuracy, and reduce physician burnout. However, the healthcare industry has been slower to adopt this technology, in part due to the importance of avoiding medically relevant transcription mistakes. In this work, we present the Clinical BERTScore (CBERTScore), an ASR metric that penalizes clinically relevant mistakes more than others. We demonstrate that this metric more closely aligns with clinician preferences on medical sentences as compared to other metrics (WER, BLEU, METEOR, etc.), sometimes by wide margins. We collect a benchmark of 13 clinician preferences on 149 realistic medical sentences, called the Clinician Transcript Preference benchmark (CTP), demonstrate that CBERTScore more closely matches what clinicians prefer, and release the benchmark for the community to further develop clinically aware ASR metrics.