
    Deep linguistic prehistory with particular reference to Andamanese

    In 1992, American linguist Johanna Nichols introduced a new method of detecting typological patterns at great time depths, based on morphological analysis and cross-linguistic comparison of several structural types and grammatical categories (Nichols 1992). She claimed that her method reveals patterns that may go back as far as the initial modern human colonization of the globe, and she set up a preliminary model of early linguistic spread. Has Nichols taken a ground-breaking step towards a greater understanding of our distant linguistic past? And how can we test this? Towards the end of her book, Nichols (1992:263-65) calls for an analysis of ‘critical’ languages which are in a unique position to fill the gaps in her study and are thus essential to our understanding of global linguistic prehistory. Using Nichols’ method as a testing model, this article highlights one such critical language group – the Andamanese language family, spoken by the indigenous Negrito population of the Andaman Islands, in the Bay of Bengal – in an effort to shed further light on the distant linguistic past of our species.

    BERT Meets CTC: New Formulation of End-to-End Speech Recognition with Pre-trained Masked Language Model

    This paper presents BERT-CTC, a novel formulation of end-to-end speech recognition that adapts BERT for connectionist temporal classification (CTC). Our formulation relaxes the conditional independence assumptions used in conventional CTC and incorporates linguistic knowledge through the explicit output dependencies obtained from BERT contextual embeddings. BERT-CTC attends to the full contexts of the input and hypothesized output sequences via the self-attention mechanism. This mechanism encourages the model to learn inner/inter-dependencies between the audio and token representations while maintaining CTC's training efficiency. During inference, BERT-CTC combines a mask-predict algorithm with CTC decoding, iteratively refining the output sequence. The experimental results reveal that BERT-CTC improves over conventional approaches across variations in speaking styles and languages. Finally, we show that the semantic representations in BERT-CTC are beneficial towards downstream spoken language understanding tasks.

    Comment: v1: Accepted to Findings of EMNLP 2022; v2: minor corrections and clearer derivation of Eq. (21).
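    As a rough illustration of the inference procedure described above, the sketch below implements a generic mask-predict refinement loop (in the style of Ghazvininejad et al. 2019) followed by a standard CTC collapse. The `model` callable, the token ids, and the way CTC decoding is interleaved with mask-predict are simplifying assumptions for illustration; the paper defines the actual BERT-CTC decoding rule.

```python
import torch

MASK_ID = 0   # hypothetical mask token id
BLANK_ID = 1  # hypothetical CTC blank id

def ctc_collapse(ids, blank_id=BLANK_ID):
    """Standard CTC rule: merge consecutive repeats, then drop blanks."""
    out, prev = [], None
    for i in ids:
        if i != prev and i != blank_id:
            out.append(i)
        prev = i
    return out

def mask_predict_decode(model, audio, init_ids, num_iters=4):
    """Re-predict every position each pass, then re-mask the least-confident
    positions for the next pass, shrinking the masked set linearly."""
    ids = init_ids.clone()
    T = ids.numel()
    for it in range(num_iters):
        log_probs = model(audio, ids)        # (T, vocab) token log-probs
        conf, pred = log_probs.max(dim=-1)   # per-position confidence, argmax
        ids = pred
        n_mask = int(T * (num_iters - 1 - it) / num_iters)
        if n_mask == 0:
            break
        _, low = conf.topk(n_mask, largest=False)
        ids[low] = MASK_ID                   # re-mask low-confidence tokens
    return ctc_collapse(ids.tolist())

# Toy stand-in scorer, just to make the sketch runnable (not BERT-CTC itself):
# model = lambda audio, ids: torch.randn(ids.numel(), 10).log_softmax(-1)
# print(mask_predict_decode(model, None, torch.zeros(8, dtype=torch.long)))
```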

    Exploring Subgroup Performance In End-to-End Speech Models

    End-to-End Spoken Language Understanding models are generally evaluated according to their overall accuracy, or separately on (a priori defined) data subgroups of interest. We propose a technique for analyzing model performance at the subgroup level, which considers all subgroups that can be defined via a given set of metadata and that are above a specified minimum size. The metadata can represent user characteristics, recording conditions, and speech targets. Our technique is based on advances in model bias analysis, enabling efficient exploration of the resulting subgroups. A fine-grained analysis reveals how model performance varies across subgroups, identifying modeling issues or bias towards specific subgroups. We compare the subgroup-level performance of models based on wav2vec 2.0 and HuBERT on the Fluent Speech Commands dataset. The experimental results illustrate how subgroup-level analysis reveals a finer and more complete picture of performance changes when models are replaced, automatically identifying the subgroups that most benefit or fail to benefit from the change.
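    As a concrete, if brute-force, sketch of the idea (the paper's actual technique explores subgroups more efficiently), the snippet below enumerates every subgroup definable from combinations of metadata columns, keeps those above a minimum size, and compares per-subgroup accuracy for two models. The column names (`pred_w2v2`, `pred_hubert`, `label`) and the `min_size` threshold are illustrative assumptions, not the paper's setup.

```python
from itertools import combinations
import pandas as pd

def subgroup_accuracy(df, meta_cols, min_size=30):
    """Accuracy of each metadata-defined subgroup with at least `min_size` rows."""
    rows = []
    for r in range(1, len(meta_cols) + 1):
        for cols in combinations(meta_cols, r):
            for key, grp in df.groupby(list(cols)):
                if len(grp) < min_size:
                    continue
                key = key if isinstance(key, tuple) else (key,)
                rows.append({
                    "subgroup": dict(zip(cols, key)),
                    "size": len(grp),
                    "acc_w2v2": (grp["pred_w2v2"] == grp["label"]).mean(),
                    "acc_hubert": (grp["pred_hubert"] == grp["label"]).mean(),
                })
    out = pd.DataFrame(rows)
    out["delta"] = out["acc_hubert"] - out["acc_w2v2"]
    # Sorting by delta surfaces the subgroups that gain or lose most when
    # one model's predictions are swapped for the other's.
    return out.sort_values("delta")

# Usage with hypothetical metadata columns:
# report = subgroup_accuracy(df, ["gender", "speaking_rate", "noise_level"])
```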