    Cross-entropy analysis of the information in forensic speaker recognition

    Proceedings of Odyssey 2008: The Speaker and Language Recognition Workshop, Stellenbosch, South Africa. In this work we analyze, from an information-theoretic perspective, the average information supplied by a forensic speaker recognition system. The objective is to report the performance of the system transparently in terms of information, in line with the needs for transparency and testability in forensic science. This analysis allows the derivation of a proper measure of goodness for forensic speaker recognition, the empirical cross-entropy (ECE), following previous work in the literature. We also propose an intuitive representation, the ECE plot, which allows forensic scientists to explain the average information given by the evidence analysis process clearly. This representation allows the forensic scientist to assess the evidence evaluation process independently of the prior information, which is the province of the court. Fact finders may then check the average information given by the evidence analysis once prior information is incorporated. An experimental example following the NIST SRE 2006 protocol is presented to highlight the adequacy of the proposed framework in the forensic inferential process. An example of how the average information supplied by the forensic analysis of the speech evidence could be presented in court is also provided, simulating a real case. This work has been supported by the Spanish Ministry of Education under project TEC2006-13170-C02-01.
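As a rough illustration of the empirical cross-entropy named in the abstract above, the following minimal sketch computes ECE in bits from two sets of likelihood ratios at a given prior. The function name, argument names, and the toy values are assumptions made for illustration; the formula is the commonly cited prior-weighted logarithmic cost, not code taken from the paper.

```python
import math

def empirical_cross_entropy(lrs_same, lrs_diff, prior_same):
    """Return ECE in bits.

    lrs_same:   LR values from trials where the same-speaker hypothesis is true
    lrs_diff:   LR values from trials where the different-speaker hypothesis is true
    prior_same: prior probability of the same-speaker hypothesis (0 < p < 1)
    All names here are hypothetical, chosen for this sketch.
    """
    odds = prior_same / (1.0 - prior_same)  # prior odds
    # Average logarithmic cost over same-speaker trials.
    cost_same = sum(math.log2(1.0 + 1.0 / (lr * odds)) for lr in lrs_same) / len(lrs_same)
    # Average logarithmic cost over different-speaker trials.
    cost_diff = sum(math.log2(1.0 + lr * odds) for lr in lrs_diff) / len(lrs_diff)
    return prior_same * cost_same + (1.0 - prior_same) * cost_diff

# A neutral system (LR = 1 everywhere) gives the entropy of the prior.
neutral = empirical_cross_entropy([1.0, 1.0], [1.0], 0.5)
# A system with informative, well-directed LRs gives a lower value.
informative = empirical_cross_entropy([100.0], [0.01], 0.5)
```

Evaluating the function over a range of `prior_same` values and plotting the result against the log prior odds would give a curve of the kind the ECE plot describes.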

    Constrained speaker linking

    In this paper we study speaker linking (a.k.a. partitioning) given constraints on the distribution of speaker identities over speech recordings. Specifically, we show that the intractable partitioning problem becomes tractable when the constraints pre-partition the data into smaller cliques with non-overlapping speakers. The surprisingly common case where the speakers in telephone conversations are known, but the assignment of channels to identities is unspecified, is treated in a Bayesian way. We show that for the Dutch CGN database, where this channel assignment task is at hand, a lightweight speaker recognition system can solve the channel assignment problem quite effectively, with 93% of the cliques solved. We further show that the posterior distribution over channel assignment configurations is well calibrated. Comment: Submitted to Interspeech 2014, some typos fixed
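To make the idea of a posterior over channel assignment configurations concrete, here is a generic sketch, not the method of the paper. It assumes a hypothetical table `loglik[c][s]` holding the log-likelihood of the speech on channel `c` under known speaker `s`, enumerates every one-to-one assignment, and normalizes with Bayes' rule under a uniform prior unless another prior is supplied.

```python
import math
from itertools import permutations

def assignment_posterior(loglik, prior=None):
    """Posterior over one-to-one channel-to-speaker assignments.

    loglik[c][s] is an assumed input: the log-likelihood of channel c's
    speech under speaker s. Returns {assignment tuple: probability},
    where assignment[c] is the speaker index given to channel c.
    """
    n = len(loglik)
    configs = list(permutations(range(n)))
    if prior is None:
        prior = [1.0 / len(configs)] * len(configs)  # uniform prior (assumption)
    # Log joint = log prior + sum of per-channel log-likelihoods.
    log_joint = [
        math.log(p) + sum(loglik[c][s] for c, s in enumerate(cfg))
        for cfg, p in zip(configs, prior)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return {cfg: w / total for cfg, w in zip(configs, weights)}

# Two channels, two known speakers; the scores favor channel 0 -> speaker 0.
post = assignment_posterior([[2.0, 0.0], [0.0, 2.0]])
```

Enumerating permutations is only practical for small cliques such as a two-channel conversation, which is the case the abstract describes.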

    Prosodic-Enhanced Siamese Convolutional Neural Networks for Cross-Device Text-Independent Speaker Verification

    In this paper a novel cross-device text-independent speaker verification architecture is proposed. The majority of state-of-the-art deep architectures used for speaker verification tasks rely on Mel-frequency cepstral coefficients. In contrast, our proposed Siamese convolutional neural network architecture uses Mel-frequency spectrogram coefficients to benefit from the dependency between adjacent spectro-temporal features. Although spectro-temporal features have proved to be highly reliable in speaker verification models, they represent only some aspects of the short-term acoustic-level traits of the speaker's voice. The human voice, however, comprises several linguistic levels, such as acoustic, lexicon, prosody, and phonetics, that can be utilized in speaker verification models. To compensate for these inherent shortcomings of spectro-temporal features, we propose to enhance the Siamese convolutional neural network architecture by deploying a multilayer perceptron network to incorporate prosodic, jitter, and shimmer features. The proposed end-to-end verification architecture performs feature extraction and verification simultaneously. This architecture displays significant improvement over classical signal processing approaches and deep algorithms for forensic cross-device speaker verification. Comment: Accepted in 9th IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2018)

    Information-theoretical comparison of evidence evaluation methods for score-based biometric systems

    Paper presented at the Seventh International Conference on Forensic Inference and Statistics, The University of Lausanne, Switzerland, August 2008. Biometric systems are a powerful tool in many forensic disciplines, helping scientists evaluate the weight of the evidence. However, rising admissibility requirements in forensic science demand scientific methods for testing the accuracy of the forensic evidence evaluation process. In this work we analyze and compare several evidence analysis methods for score-based biometric systems. For all of them, the score given by the system is transformed into a likelihood ratio (LR), which expresses the weight of the evidence. The accuracy of each LR computation method is assessed by classical Tippett plots. We also propose measuring accuracy in terms of the average information given by the evidence evaluation process, by means of Empirical Cross-Entropy (ECE) plots. Preliminary results are presented using a voice biometric system and the NIST SRE 2006 experimental protocol.

    Information-theoretical assessment of the performance of likelihood ratio computation methods

    This is the accepted version of the following article: Ramos, D., Gonzalez-Rodriguez, J., Zadora, G. and Aitken, C. (2013), Information-Theoretical Assessment of the Performance of Likelihood Ratio Computation Methods. Journal of Forensic Sciences, 58: 1503–1518. doi: 10.1111/1556-4029.12233, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1111/1556-4029.12233/ Performance of likelihood ratio (LR) methods for evidence evaluation has been represented in the past using, for example, Tippett plots. We propose empirical cross-entropy (ECE) plots as a metric of accuracy based on the statistical theory of proper scoring rules. The metric is interpretable, according to information theory, as the information given by the evidence, and it quantifies the calibration of LR values. We present results with a case example using a glass database from real casework, comparing performance with both Tippett and ECE plots. We conclude that ECE plots allow clearer comparisons of LR methods than previous metrics and provide a theoretical criterion for deciding whether a given method should be used for evidence evaluation, which is an improvement over Tippett plots. A set of recommendations for the use of the proposed methodology by practitioners is also given. Supported by the Spanish Ministry of Science and Innovation under project TEC2009-14719-C02-01 and co-funded by the Universidad Autonoma de Madrid and the Comunidad Autonoma de Madrid under project CCG10-UAM/TIC-5792.

    Speaker Identification for Swiss German with Spectral and Rhythm Features

    We present results of speech rhythm analysis for automatic speaker identification. We extend previous experiments that used similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in a speaker identification task to determine to what extent they are descriptive of speaker variability. We also test the performance of state-of-the-art but simple-to-extract frame-based features. The paper focuses on the evaluation on one corpus (Swiss German, TEVOID) using support vector machines. Results suggest that the general spectral features can provide very good performance on this dataset, whereas the rhythm features are less successful, indicating either a lack of suitability for this task or a dependence on the specific dataset.