14 research outputs found

    Effectiveness in the Realisation of Speaker Authentication

    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
    An important consideration for the deployment of speaker recognition in authentication applications is the approach to the formation of training and testing utterances. Whilst defining this for a specific scenario is influenced by the associated requirements and conditions, the process can be further guided through the establishment of the relative usefulness of alternative frameworks for composing the training and testing material. In this regard, the present paper provides an analysis of the effects, on the speaker recognition accuracy, of various bases for the formation of the training and testing data. The experimental investigations are conducted based on the use of digit utterances taken from the XM2VTS database. The paper presents a detailed description of the individual approaches considered and discusses the experimental results obtained in different cases.

    Singing voice separation based on non-vocal independent component subtraction and amplitude discrimination

    Copyright Institute of Electronic Music and Acoustics
    Many applications of Music Information Retrieval can benefit from effective isolation of the music sources. Earlier work by the authors led to the development of a system that is based on Azimuth Discrimination and Resynthesis (ADRess) and can extract the singing voice from reverberant stereophonic mixtures. We propose an extension to our previous method that is not based on ADRess and exploits both channels of the stereo mix more effectively. For the evaluation of the system we use a dataset that contains songs convolved during mastering as well as the mixing process (i.e. “real-world” conditions). The metrics for objective evaluation are based on bss_eval.

    Privacy Protection Performance of De-identified Face Images with and without Background

    Li Meng, 'Privacy Protection Performance of De-identified Face Images with and without Background', paper presented at the 39th International Information and Communication Technology (ICT) Convention. Grand Hotel Adriatic Congress Centre and Admiral Hotel, Opatija, Croatia, May 30 - June 3, 2016.
    This paper presents an approach to blending a de-identified face region with its original background, for the purpose of completing the process of face de-identification. The re-identification risk of the de-identified FERET face images has been evaluated for the k-Diff-furthest face de-identification method, using several face recognition benchmark methods including PCA, LBP, HOG and LPQ. The experimental results show that the k-Diff-furthest face de-identification delivers high privacy protection within the face region, while blending the de-identified face region with its original background may significantly increase the re-identification risk, indicating that de-identification must also be applied to image areas beyond the face region.

    De-identification for privacy protection in multimedia content : A survey

    This document is the Accepted Manuscript version of the following article: Slobodan Ribaric, Aladdin Ariyaeeinia, and Nikola Pavesic, ‘De-identification for privacy protection in multimedia content: A survey’, Signal Processing: Image Communication, Vol. 47, pp. 131-151, September 2016, doi: https://doi.org/10.1016/j.image.2016.05.020. This manuscript version is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License CC BY NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.
    Privacy is one of the most important social and political issues in our information society, characterized by a growing range of enabling and supporting technologies and services. Amongst these are communications, multimedia, biometrics, big data, cloud computing, data mining, internet, social networks, and audio-video surveillance. Each of these can potentially provide the means for privacy intrusion. De-identification is one of the main approaches to privacy protection in multimedia contents (text, still images, audio and video sequences and their combinations). It is a process for concealing or removing personal identifiers, or replacing them by surrogate personal identifiers in personal information in order to prevent the disclosure and use of data for purposes unrelated to the purpose for which the information was originally obtained. Based on the proposed taxonomy inspired by the Safe Harbour approach, the personal identifiers, i.e., the personal identifiable information, are classified as non-biometric, physiological and behavioural biometric, and soft biometric identifiers. In order to protect the privacy of an individual, all of the above identifiers will have to be de-identified in multimedia content.
    This paper presents a review of the concepts of privacy and the linkage among privacy, privacy protection, and the methods and technologies designed specifically for privacy protection in multimedia contents. The study provides an overview of de-identification approaches for non-biometric identifiers (text, hairstyle, dressing style, license plates), as well as for the physiological (face, fingerprint, iris, ear), behavioural (voice, gait, gesture) and soft-biometric (body silhouette, gender, age, race, tattoo) identifiers in multimedia documents.
    Peer reviewed

    A test of the effectiveness of speaker verification for differentiating between identical twins

    This paper presents investigations into the ability of speaker verification technology to discriminate between identical twins. It is shown that whilst, in general, the genetic and non-genetic characteristics of voice are both of value to speaker verification capabilities, it is the latter which is highly beneficial in the separation of the speech of identical twins. It is further demonstrated that through the use of unconstrained cohort normalisation as a complementary means for the exploitation of such voice characteristics, the verification reliability can be considerably enhanced for both identical twins and unrelated speakers. Experiments were conducted using a bespoke clean-speech database consisting of utterances from forty-nine identical twin pairs. The paper details the problem in speaker verification posed by identical twins, discusses the experimental investigations and provides an analysis of the results.

    Qualitative fusion of normalised scores in multimodal biometrics

    Original articles can be found at: http://www.sciencedirect.com/ Copyright Elsevier
    A new approach to enhancing the accuracy of multimodal biometrics is investigated. The proposed approach, which involves combining score normalisation and qualitative-based fusion, is shown to considerably improve the accuracy of multimodal biometrics under different data conditions.
    Peer reviewed

    Analysis and Comparison of Score Normalisation Methods for Text-Dependent Speaker Verification

    This paper presents an investigation into the relative effectiveness of various score normalisation methods for speaker verification. The study provides a thorough analysis of different approaches for normalising verification scores, and comparatively examines these under identical experimental conditions. The experiments are based on the use of subsets of the Brent (telephone quality) speech database, consisting of repetitions of isolated digit utterances zero to nine spoken by native English speakers. Based on the experimental results it is demonstrated that amongst the considered methods, a particular form of the cohort normalisation method provides the best performance in terms of the verification accuracy. The paper discusses details of the experimental study and presents an analysis of the results.
    Final Published version
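
The abstract above does not say which form of cohort normalisation performed best, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that scores are log-likelihoods and that the cohort is chosen at test time as the highest-scoring competing speaker models. The function names, the cohort size and the decision threshold are hypothetical choices made for illustration.

```python
from statistics import mean


def cohort_normalise(target_score, competing_scores, cohort_size=3):
    """Return the target log-likelihood minus the mean cohort log-likelihood.

    The cohort is assumed to be the `cohort_size` best-scoring competing
    models for the test utterance (an illustrative choice).
    """
    if not competing_scores:
        raise ValueError("at least one competing score is required")
    # Keep only the most competitive background models for this utterance.
    cohort = sorted(competing_scores, reverse=True)[:cohort_size]
    return target_score - mean(cohort)


def verify(target_score, competing_scores, threshold=0.0, cohort_size=3):
    """Accept the identity claim if the normalised score reaches the threshold."""
    return cohort_normalise(target_score, competing_scores, cohort_size) >= threshold


if __name__ == "__main__":
    background = [-12.0, -11.0, -20.0, -15.0]  # hypothetical competing scores
    print(cohort_normalise(-10.0, background, cohort_size=2))
    print(verify(-14.0, background, cohort_size=2))
```

Subtracting a cohort average in the log domain makes the decision depend on how well the claimed speaker's model fits relative to close competitors, which is why a single threshold can then be shared across speakers and recording conditions.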

    Sub-Band Based Text-Dependent Speaker Verification

    Original article can be found at: http://www.sciencedirect.com/science/journal/01676393 Copyright Elsevier B.V.
    This paper addresses various issues involved in sub-band based text-dependent speaker verification. The first part of the discussions is concerned with the classification methods. An important issue addressed in this part is the determination of a set of weights which emphasises the sub-bands that are specific to the target speaker while de-emphasising or removing the contaminated ones. In particular, techniques for determining these weights dynamically according to the level of contamination in the sub-bands are described. Furthermore, the effectiveness of these methods is experimentally analysed through a set of comparative studies. The second part of the discussions focuses on the feature extraction process. Analytically, it is shown that for a sub-band system of S bands, the cepstral coefficients with the quefrency of p have a strong linear relationship to the (S×p)th full-band cepstral parameter. With the aid of a set of experimental results, it is demonstrated that this means the conventional classification methods adapted to work with sub-band cepstral parameters may not be able to capture all the useful spectral information contained in the full-band cepstral parameters. In order to tackle this problem, two methods are described and their relative effectiveness is experimentally examined. The experimental investigations also include an examination of speaker discrimination abilities of different sub-bands and an analysis of different possible recombination levels.
    Peer reviewed

    Open-set speaker identification with diverse-duration speech data

    No full text
    Rawande Karadaghi, Heinz Hertlein, and Aladdin Ariyaeeinia, "Open-set speaker identification with diverse-duration speech data", in Proceedings of SPIE 9457, Biometric and Surveillance Technology for Human Activity Identification XII, Baltimore, USA, 20 April 2015. DOI:10.1117/12.217633