
    Constrained speaker linking

    In this paper we study speaker linking (a.k.a. partitioning) given constraints on the distribution of speaker identities over speech recordings. Specifically, we show that the intractable partitioning problem becomes tractable when the constraints pre-partition the data into smaller cliques with non-overlapping speakers. The surprisingly common case where the speakers in telephone conversations are known, but the assignment of channels to identities is unspecified, is treated in a Bayesian way. We show that for the Dutch CGN database, where this channel-assignment task arises, a lightweight speaker recognition system can solve the channel assignment problem quite effectively, with 93% of the cliques solved. We further show that the posterior distribution over channel-assignment configurations is well calibrated.

    Comment: Submitted to Interspeech 2014; some typos fixed.
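    As an illustration of the channel-assignment idea, the following is a minimal sketch, not the paper's exact model: within a clique, enumerate the bijections between recordings and the known identities, score each configuration with speaker-recognition log-likelihood ratios, and normalise to obtain a posterior. The per-recording LLR matrix and all names below are hypothetical.

```python
# A minimal sketch of Bayesian channel assignment within one clique,
# NOT the paper's exact model. Assumes an LLR matrix is available from
# some speaker recognition system; the data here are synthetic.
from itertools import permutations

import numpy as np

def assignment_posterior(llr):
    """Posterior over channel-to-identity assignments in one clique.

    llr[i, j]: log-likelihood ratio that recording i was spoken by
    identity j. With a uniform prior over configurations, the posterior
    of a bijection sigma is proportional to exp(sum_i llr[i, sigma[i]]).
    """
    n = llr.shape[0]
    configs = list(permutations(range(n)))
    logp = np.array([sum(llr[i, s[i]] for i in range(n)) for s in configs])
    post = np.exp(logp - logp.max())  # subtract max for numerical stability
    return configs, post / post.sum()

# Toy example: three recordings, three known speakers.
llr = np.random.default_rng(0).normal(size=(3, 3))
configs, post = assignment_posterior(llr)
print(configs[int(np.argmax(post))], post.max())
```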

    Performance of likelihood ratios considering bounds on the probability of observing misleading evidence

    This is a pre-copyedited, author-produced version of an article accepted for publication in Law, Probability & Risk following peer review. The version of record, Jose Juan Lucena-Molina, Daniel Ramos-Castro, Joaquin Gonzalez-Rodriguez; Performance of likelihood ratios considering bounds on the probability of observing misleading evidence. Law, Probability and Risk 2015; 14 (3): 175-192, is available online at: http://dx.doi.org/10.1093/lpr/mgu022

    In this article, we introduce a new tool, 'Limit Tippett Plots', to assess the performance of likelihood ratios in evidence evaluation, including theoretical bounds on the probability of observing misleading evidence. To do so, we first review previous work on such bounds. We then derive 'Limit Tippett Plots', which complement Tippett plots with information about the limits on the probability of observing misleading evidence, taken as a reference. This gives a much richer way to measure the performance of likelihood ratios. Finally, we present an experimental example in forensic automatic speaker recognition following the protocols of the Acoustics Laboratory of Guardia Civil, where it can be seen that 'Limit Tippett Plots' help to detect problems in the calculation of likelihood ratios.
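    For background on the kind of reference bound such plots use: since E[LR | Hd] = 1 for a well-calibrated likelihood ratio, Markov's inequality gives P(LR >= k | Hd) <= 1/k, with a symmetric bound under Hp. A hedged sketch (synthetic data, illustrative names, not the article's code) compares empirical rates of misleading evidence against that reference:

```python
# Hedged sketch: empirical rates of misleading evidence from validation
# LRs, compared against the Markov-type reference P(LR >= k | Hd) <= 1/k.
# The data below are synthetic, self-consistent Gaussian log-LRs.
import numpy as np

def misleading_rates(lr_ss, lr_ds, k):
    """Empirical probabilities of misleading evidence at strength k.

    lr_ss: LRs from same-source (Hp-true) comparisons.
    lr_ds: LRs from different-source (Hd-true) comparisons.
    """
    p_mis_hp = np.mean(np.asarray(lr_ss) <= 1 / k)  # wrongly favours Hd
    p_mis_hd = np.mean(np.asarray(lr_ds) >= k)      # wrongly favours Hp
    return p_mis_hp, p_mis_hd

rng = np.random.default_rng(1)
lr_ss = np.exp(rng.normal(0.5, 1.0, 1000))   # synthetic same-source LRs
lr_ds = np.exp(rng.normal(-0.5, 1.0, 1000))  # synthetic different-source LRs
for k in (1.0, 10.0, 100.0):
    p_hp, p_hd = misleading_rates(lr_ss, lr_ds, k)
    print(f"k={k:g}: P(LR>=k|Hd)={p_hd:.3f} (bound {1/k:.3f}); "
          f"P(LR<=1/k|Hp)={p_hp:.3f}")
```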

    Measuring coherence of computer-assisted likelihood ratio methods

    This is the author's version of a work that was accepted for publication in Forensic Science International. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Forensic Science International, 249 (2015): 123-132, DOI: 10.1016/j.forsciint.2015.01.033

    Measuring the performance of forensic evaluation methods that compute likelihood ratios (LRs) is relevant for both the development and the validation of such methods. A framework of performance characteristics, categorized as primary and secondary, is introduced in this study to help achieve such development and validation. Ground-truth-labelled fingerprint data are used to assess the performance of an example likelihood-ratio method in terms of those performance characteristics. Discrimination, calibration, and especially the coherence of this LR method are assessed as a function of the quantity and quality of the trace fingerprint data. Assessment of the coherence revealed a weakness of the comparison algorithm in the computer-assisted likelihood-ratio method used.

    This research was conducted in the scope of the BBfor2 European Commission Marie Curie Initial Training Network (FP7-PEOPLE-ITN-2008 under Grant Agreement 238803) at the Netherlands Forensic Institute, in collaboration with the ATVS Biometric Recognition Group at the Universidad Autonoma de Madrid and the National Police Services Agency of the Netherlands.
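    For readers unfamiliar with metrics of discrimination and calibration for LR methods, the standard scalar measure that penalises both is the log-likelihood-ratio cost, Cllr. The sketch below is general background, not code from this study.

```python
# General background, not code from this study: the log-likelihood-ratio
# cost (Cllr), a standard scalar measure of LR performance that penalises
# both poor discrimination and poor calibration.
import numpy as np

def cllr(lr_ss, lr_ds):
    """Cllr = 0.5 * ( mean log2(1 + 1/LR) over same-source LRs
                    + mean log2(1 + LR)   over different-source LRs )."""
    lr_ss, lr_ds = np.asarray(lr_ss), np.asarray(lr_ds)
    return 0.5 * (np.mean(np.log2(1 + 1 / lr_ss)) +
                  np.mean(np.log2(1 + lr_ds)))

# An uninformative method (LR = 1 everywhere) scores exactly 1;
# a well-calibrated, discriminating method scores below 1.
print(cllr([10.0, 20.0, 5.0], [0.1, 0.05, 0.5]))
```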

    Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals

    Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs. The reliability of spoofing CMs is typically gauged using the equal error rate (EER) metric. The primitive EER fails to reflect application requirements and the impact of spoofing and CMs upon ASV; its use as a primary metric in traditional ASV research has long been abandoned in favour of risk-based approaches to assessment. This paper presents several new extensions to the tandem detection cost function (t-DCF), a recent risk-based approach to assessing the reliability of spoofing CMs deployed in tandem with an ASV system. Extensions include a simplified version of the t-DCF with fewer parameters, an analysis of a special case for a fixed ASV system, simulations which give original insights into its interpretation, and new analyses using the ASVspoof 2019 database. It is hoped that adoption of the t-DCF for CM assessment will help to foster closer collaboration between the anti-spoofing and ASV research communities.

    Comment: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (DOI updated).
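    For a fixed ASV system, the simplified t-DCF takes an affine form in the CM error rates: t-DCF(s) = C0 + C1 * Pmiss_cm(s) + C2 * Pfa_cm(s), where the constants fold in ASV error rates, priors, and costs. The sketch below takes those constants as given rather than re-deriving them from the paper, and the normalisation convention shown is one common choice, not necessarily the paper's.

```python
# Sketch of the simplified t-DCF's affine form for a fixed ASV system:
# t-DCF(s) = C0 + C1 * Pmiss_cm(s) + C2 * Pfa_cm(s). The constants C0,
# C1, C2 are taken as given; normalising by the better of the two dummy
# CMs (accept-all / reject-all) is one common convention.
import numpy as np

def t_dcf_curve(cm_scores, cm_labels, C0, C1, C2):
    """Normalised t-DCF swept over CM decision thresholds.

    cm_labels: 1 for bona fide trials, 0 for spoofed trials.
    """
    cm_scores = np.asarray(cm_scores)
    cm_labels = np.asarray(cm_labels)
    thresholds = np.sort(np.unique(cm_scores))
    bona = cm_scores[cm_labels == 1]
    spoof = cm_scores[cm_labels == 0]
    p_miss = np.array([(bona < t).mean() for t in thresholds])  # bona fide rejected
    p_fa = np.array([(spoof >= t).mean() for t in thresholds])  # spoof accepted
    t_dcf = C0 + C1 * p_miss + C2 * p_fa
    return thresholds, t_dcf / min(C0 + C1, C0 + C2)
```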

    A Speaker Verification Backend with Robust Performance across Conditions

    In this paper, we address the problem of speaker verification in conditions unseen or unknown during development. A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network and processing them through a backend composed of probabilistic linear discriminant analysis (PLDA) and global logistic-regression score calibration. This method is known to result in systems that work poorly on conditions different from those used to train the calibration model. We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs. The backend is trained discriminatively to optimize binary cross-entropy. When trained on a number of diverse datasets that are labeled only with respect to speaker, the proposed backend consistently and, in some cases, dramatically improves calibration, compared to the standard PLDA approach, on a number of held-out datasets, some of which are markedly different from the training data. Discrimination performance is also consistently improved. We show that joint training of the PLDA and the adaptive calibrator is essential -- the same benefits cannot be achieved when freezing PLDA and fine-tuning the calibrator. To our knowledge, the results in this paper are the first evidence in the literature that it is possible to develop a speaker verification system with robust out-of-the-box performance on a large variety of conditions.
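    A minimal sketch in the spirit of the adaptive calibrator described above: let the calibrated LLR be an affine function of the raw score whose scale and offset depend on a side-information feature such as log duration. Since that model is linear in its parameters, plain logistic regression minimises the binary cross-entropy. Note the paper's actual backend is trained jointly with PLDA and uses richer side-information; this sketch trains the calibrator alone, which the paper shows is weaker.

```python
# Sketch of duration-conditioned calibration (not the paper's backend):
# llr(s, d) = (a + b*d)*s + (c + e*d), where s is the raw score and d a
# side-information feature such as log duration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_adaptive_calibrator(scores, side, labels):
    """Expanding (a + b*d)*s + (c + e*d) gives a model linear in
    [s, s*d, d], so ordinary logistic regression (which minimises
    binary cross-entropy) recovers the coefficients."""
    X = np.column_stack([scores, scores * side, side])
    return LogisticRegression(C=1e6).fit(X, labels)  # ~unregularised

def calibrated_llr(clf, scores, side):
    X = np.column_stack([scores, scores * side, side])
    # decision_function is the model's log-odds; with balanced training
    # classes this equals the calibrated LLR.
    return clf.decision_function(X)
```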

    In the context of forensic casework, are there meaningful metrics of the degree of calibration?

    Forensic-evaluation systems should output likelihood-ratio values that are well calibrated. If they do not, their output will be misleading. Unless a forensic-evaluation system is intrinsically well calibrated, it should be calibrated using a parsimonious parametric model that is trained on calibration data. The system should then be tested using validation data. Metrics of degree of calibration that are based on the pool-adjacent-violators (PAV) algorithm recalibrate the likelihood-ratio values calculated from the validation data. The PAV algorithm overfits on the validation data because it is both trained and tested on the validation data, and because it is a non-parametric model with weak constraints. For already-calibrated systems, PAV-based ostensive metrics of degree of calibration do not actually measure degree of calibration; they measure sampling variability between the calibration data and the validation data, and overfitting on the validation data. Monte Carlo simulations are used to demonstrate that this is the case. We therefore argue that, in the context of casework, PAV-based metrics are not meaningful metrics of degree of calibration; however, we also argue that, in the context of casework, a metric of degree of calibration is not required.
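    For concreteness, the PAV-based metric at issue is typically the calibration loss Cllr_cal = Cllr - Cllr_min, where Cllr_min is obtained by recalibrating the validation LLRs with the PAV algorithm on those same data. A sketch using scikit-learn's isotonic regression as the PAV step (illustrative, not the paper's code):

```python
# Sketch of the PAV-based metric under discussion: calibration loss
# Cllr_cal = Cllr - Cllr_min, where Cllr_min is Cllr after PAV
# recalibration on the validation data itself -- the source of the
# overfitting the paper analyses. Illustrative, not the paper's code.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def cllr_from_llrs(llrs, labels):
    """Cllr computed from log-LRs; labels: 1 same-source, 0 different."""
    ss, ds = llrs[labels == 1], llrs[labels == 0]
    return 0.5 * (np.mean(np.log2(1 + np.exp(-ss))) +
                  np.mean(np.log2(1 + np.exp(ds))))

def pav_llrs(llrs, labels, eps=1e-6):
    """PAV-recalibrated LLRs (isotonic regression implements PAV)."""
    post = IsotonicRegression(y_min=eps, y_max=1 - eps,
                              out_of_bounds="clip").fit_transform(llrs, labels)
    prior_logodds = np.log(labels.mean() / (1 - labels.mean()))
    return np.log(post / (1 - post)) - prior_logodds

def calibration_loss(llrs, labels):
    llrs, labels = np.asarray(llrs), np.asarray(labels)
    return (cllr_from_llrs(llrs, labels)
            - cllr_from_llrs(pav_llrs(llrs, labels), labels))
```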

    The distribution of calibrated likelihood-ratios in speaker recognition

    Contains full text: 116267.pdf (author's version) (Open Access). Interspeech 2013, 25 August 2013.