15 research outputs found

    Timing in talking: What is it used for, and how is it controlled?

    Get PDF
    In the first part of the paper, we summarize the linguistic factors that shape speech timing patterns, including the prosodic structures which govern them, and suggest that speech timing patterns are used to aid utterance recognition. In the spirit of optimal control theory, we propose that recognition requirements are balanced against requirements such as rate of speech and style, as well as movement costs, to yield (near-)optimal planned surface timing patterns; additional factors may influence the implementation of that plan. In the second part of the paper, we discuss theories of timing control in models of speech production and motor control. We present three types of evidence that support models of speech production that involve extrinsic timing. These include (i) increasing variability with increases in interval duration, (ii) evidence that speakers refer to and plan surface durations, and (iii) independent timing of movement onsets and offsets

    Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition

    No full text
    Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-depeiident features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court all indication of the strength of the evidence given the estimated with in-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task
    corecore