74 research outputs found

    Language-independent speaker anonymization using orthogonal Householder neural network

    Speaker anonymization aims to conceal a speaker's identity while preserving content information in speech. Current mainstream neural-network speaker anonymization systems disentangle speech into prosody-related, content, and speaker representations. The speaker representation is then anonymized by a selection-based speaker anonymizer that uses a mean vector over a set of randomly selected speaker vectors from an external pool of English speakers. However, the resulting anonymized vectors are subject to severe privacy leakage against powerful attackers, reduction in speaker diversity, and language mismatch problems for unseen-language speaker anonymization. To generate diverse, language-neutral speaker vectors, this paper proposes an anonymizer based on an orthogonal Householder neural network (OHNN). Specifically, the OHNN acts like a rotation to transform the original speaker vectors into anonymized speaker vectors, which are constrained to follow the distribution over the original speaker vector space. A basic classification loss is introduced to ensure that anonymized speaker vectors from different speakers have unique speaker identities. To further protect speaker identities, an improved classification loss and similarity loss are used to push original-anonymized sample pairs away from each other. Experiments on VoicePrivacy Challenge datasets in English and the AISHELL-3 dataset in Mandarin demonstrate the proposed anonymizer's effectiveness.
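The abstract says the OHNN "acts like a rotation" on speaker vectors. A minimal sketch of that general idea, assuming the orthogonal map is built as a product of Householder reflections, might look as follows. The function names, the vector dimension, and the random reflection vectors are hypothetical; in the paper's setting the reflection vectors would be learned under the losses described above, which this sketch does not implement.

```python
import numpy as np

def householder_matrix(v):
    """Return the reflection H = I - 2 v v^T / (v^T v) for a nonzero vector v."""
    v = v / np.linalg.norm(v)
    return np.eye(v.shape[0]) - 2.0 * np.outer(v, v)

def orthogonal_from_householders(vs):
    """Multiply one Householder reflection per row of vs into a single matrix."""
    dim = vs.shape[1]
    w = np.eye(dim)
    for v in vs:
        w = w @ householder_matrix(v)
    return w

rng = np.random.default_rng(0)
dim = 8  # hypothetical speaker-vector dimension, chosen only for illustration

# Hypothetical reflection vectors; random here, learned in a trained model.
reflection_vectors = rng.standard_normal((dim, dim))
W = orthogonal_from_householders(reflection_vectors)

speaker_vec = rng.standard_normal(dim)  # stand-in for an original speaker vector
anonymized = W @ speaker_vec            # transformed (anonymized) vector

# An orthogonal map preserves vector norms, which is the rotation-like property.
print(np.linalg.norm(speaker_vec), np.linalg.norm(anonymized))
```

Because every factor is orthogonal, the product is orthogonal however the reflection vectors are updated, so no explicit orthogonality penalty is needed in such a parameterization.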

    A New Time–Frequency Attention Tensor Network for Language Identification

    In this paper, we aim to improve traditional DNN x-vector language identification (LID) performance by employing Wide Residual Networks (WRN) as a powerful feature extractor, which we combine with a novel frequency attention network (F-ATN). Compared with conventional time attention, our method learns discriminative weights for different frequency bands to generate weighted means and standard deviations for utterance-level classification. This mechanism enables the architecture to direct attention to important frequency bands rather than important time frames, as in traditional time attention (T-ATN) methods. We then introduce a cross-layer frequency attention tensor network (CLF-ATN), which exploits information from different layers to recapture frame-level language characteristics that have been dropped by aggressive frequency pooling in lower layers. This effectively restores fine-grained discriminative language details. Finally, we explore the joint fusion of frame-level and frequency-band attention in a time-frequency attention network (TF-ATN). Experimental results show, first, that WRN can significantly outperform a traditional DNN x-vector implementation; second, that the proposed frequency attention method is more effective than time attention; and third, that frequency-time score fusion can yield further improvement. Finally, extensive experiments on CLF-ATN demonstrate that it is able to improve discrimination by regaining dropped fine-grained frequency information, particularly for low-dimensional frequency features.
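The weighted means and standard deviations over frequency bands described in this abstract can be illustrated with a small sketch. It assumes a standard attentive-statistics-pooling form (a tanh scoring layer followed by a softmax) applied along the frequency axis; the abstract does not give the exact F-ATN formulation, so the scoring function, the shapes, and all names below are assumptions.

```python
import numpy as np

def frequency_attentive_stats(h, W, b, v):
    """Pool a (num_bands, feat_dim) feature map into attention-weighted statistics.

    Returns the per-band attention weights and the concatenated
    weighted mean and weighted standard deviation.
    """
    scores = np.tanh(h @ W + b) @ v          # one scalar score per frequency band
    scores = scores - scores.max()           # stabilise the softmax numerically
    alpha = np.exp(scores) / np.exp(scores).sum()

    mean = alpha @ h                         # weighted mean over bands
    var = alpha @ (h ** 2) - mean ** 2       # weighted variance over bands
    std = np.sqrt(np.clip(var, 1e-8, None))  # guard against tiny negative values
    return alpha, np.concatenate([mean, std])

rng = np.random.default_rng(0)
num_bands, feat_dim, att_dim = 10, 16, 8     # hypothetical sizes

h = rng.standard_normal((num_bands, feat_dim))  # stand-in for extractor output
W = rng.standard_normal((feat_dim, att_dim))    # hypothetical attention parameters
b = np.zeros(att_dim)
v = rng.standard_normal(att_dim)

alpha, pooled = frequency_attentive_stats(h, W, b, v)
print(alpha.shape, pooled.shape)
```

Running the same function along the time axis instead would give the time-attention counterpart, which is one way to picture the contrast the abstract draws.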

    Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline

    The use of modern vocoders in an analysis/synthesis pipeline allows us to investigate high-quality voice conversion that can be used for privacy purposes. Here, we propose to transform the speaker embedding and the pitch in order to hide the sex of the speaker. ECAPA-TDNN-based speaker representation fed into a HiFiGAN vocoder is protected using a neural-discriminant analysis approach, which is consistent with the zero-evidence concept of privacy. This approach significantly reduces the information in speech related to the speaker's sex while preserving speech content and some consistency in the resulting protected voices. Comment: Accepted to ICASSP 202

    Improved Conditional Generative Adversarial Net Classification For Spoken Language Recognition

    Recent research on generative adversarial nets (GAN) for language identification (LID) has shown promising results. In this paper, we further exploit the latent abilities of GAN networks, first to combine them with deep neural network (DNN)-based i-vector approaches and then to improve the LID model using conditional generative adversarial net (cGAN) classification. First, phoneme-dependent deep bottleneck features (DBF) combined with output posteriors of a pre-trained DNN for automatic speech recognition (ASR) are used to extract i-vectors in the normal way. These i-vectors are then classified using cGAN, and we show an effective method within the cGAN to optimize parameters by combining both language identification and verification signals as supervision. Results show, first, that cGAN methods can significantly outperform DBF DNN i-vector methods where 49-dimensional i-vectors are used, but not where 600-dimensional vectors are used. Second, training a cGAN discriminator network for direct classification has further benefit for low-dimensional i-vectors as well as short utterances with high-dimensional i-vectors. However, incorporating a dedicated discriminator network output layer for classification and optimizing both classification and verification loss brings benefits in all test cases.

    Clinical Significance of PTEN Deletion, Mutation, and Loss of PTEN Expression in De Novo Diffuse Large B-Cell Lymphoma

    PTEN loss has been associated with poorer prognosis in many solid tumors. However, such investigation in lymphomas is limited. In this study, PTEN cytoplasmic and nuclear expression, PTEN gene deletion, and PTEN mutations were evaluated in two independent cohorts of diffuse large B-cell lymphoma (DLBCL). Cytoplasmic PTEN expression was found in approximately 67% of the 747 DLBCL cases, more frequently in the activated B-cell-like subtype. Nuclear PTEN expression was less frequent and at lower levels, which significantly correlated with higher PTEN mRNA expression. Remarkably, loss of PTEN protein expression was associated with poorer survival only in DLBCL with AKT hyperactivation. In contrast, high PTEN expression was associated with Myc expression and poorer survival in cases without abnormal AKT activation. Genetic and epigenetic mechanisms for loss of PTEN expression were investigated. PTEN deletions (mostly heterozygous) were detected in 11.3% of DLBCL, and showed opposite prognostic effects in patients with AKT hyperactivation and in MYC-rearranged DLBCL patients. PTEN mutations, detected in 10.6% of patients, were associated with upregulation of genes involved in central nervous system function, metabolism, and AKT/mTOR signaling regulation. Loss of PTEN cytoplasmic expression was also associated with TP53 mutations, higher PTEN-targeting microRNA expression, and lower PD-L1 expression. Remarkably, low PTEN mRNA expression was associated with down-regulation of a group of genes involved in immune responses and B-cell development/differentiation, and poorer survival in DLBCL independent of AKT activation. Collectively, multiple levels of PTEN abnormalities and dysregulation may play important roles in PTEN expression and loss, and loss of PTEN tumor-suppressor function contributes to the poor survival of DLBCL patients with AKT hyperactivation.

    SYNVOX2: Towards a privacy-friendly VOXCELEB2 dataset


    Tourism and Yuan-based Strangership

    Tourism is a dynamic way to encounter strangers. The deeply rooted Chinese concept of Yuan (缘) was drawn upon in this research to better understand individuals' encounters with strangers during travel. Specifically, this qualitative study systematically conceptualized Yuan-based strangership in a tourism context. Interviews with Chinese emerging adults uncovered a cycle of stranger-dominated socioecological relationships involving the initiation, sociability, intensity, and evolvement of Yuan-based strangership. Results showed that Yuan-connected significant strangers served as partial spectators who helped tourists develop a sense of place in a destination. This study contributes to the literature on strangership, sense of place, self-identity, and emerging adulthood in relation to tourism. Findings also help the tourism industry, families, and individuals in facilitating and embracing Yuan during trips.