186 research outputs found

    Classe de Ciências


    Towards Multilingual Automatic Dialogue Evaluation

    The main limiting factor in the development of robust multilingual dialogue evaluation metrics is the lack of multilingual data and the limited availability of open sourced multilingual dialogue systems. In this work, we propose a workaround for this lack of data by leveraging a strong multilingual pretrained LLM and augmenting existing English dialogue data using Machine Translation. We empirically show that the naive approach of finetuning a pretrained multilingual encoder model with translated data is insufficient to outperform the strong baseline of finetuning a multilingual model with only source data. Instead, the best approach consists in the careful curation of translated data using MT Quality Estimation metrics, excluding low quality translations that hinder its performance.
    Comment: SIGDIAL2
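The curation step described in this abstract can be illustrated with a minimal sketch. It assumes each translated example already carries a quality-estimation score (higher meaning better) and that curation means keeping the top-scoring fraction; the `TranslatedDialogue` type, the `qe_score` field, and the `keep_fraction` parameter are hypothetical names chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class TranslatedDialogue:
    source: str        # original English text
    translation: str   # machine-translated text
    qe_score: float    # hypothetical MT quality-estimation score, higher is better


def filter_by_quality(examples, keep_fraction):
    """Keep the best-scoring `keep_fraction` of examples (an assumed curation rule)."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    # Rank by QE score, best first, then truncate to the requested share.
    ranked = sorted(examples, key=lambda ex: ex.qe_score, reverse=True)
    n_keep = int(len(ranked) * keep_fraction)
    return ranked[:n_keep]


examples = [
    TranslatedDialogue("How are you?", "Como estás?", 0.9),
    TranslatedDialogue("See you soon.", "Ver tu logo.", 0.2),
    TranslatedDialogue("Thank you.", "Obrigado.", 0.7),
    TranslatedDialogue("Good night.", "Boa noite bom.", 0.4),
]
curated = filter_by_quality(examples, keep_fraction=0.5)
```

A fixed score threshold would be an equally plausible rule; the abstract does not say which one was used.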

    Disfluency Detection Across Domains

    This paper focuses on disfluency detection across distinct domains using a large set of openSMILE features, derived from the Interspeech 2013 Paralinguistic Challenge. Among the different machine learning methods applied, SVMs achieved the best performance. Feature selection experiments revealed that the dimensionality of the larger feature set can be further reduced at the cost of a small degradation in performance. Models trained on one corpus were tested on the other corpus, revealing that models can be quite robust across corpora for this task, despite their distinct nature. We have conducted additional experiments aiming at disfluency prediction in the context of IVR systems, and the results reveal no substantial degradation in performance, encouraging the use of the models in IVR domains.

    Towards End-to-End Private Automatic Speaker Recognition

    The development of privacy-preserving automatic speaker verification systems has been the focus of a number of studies with the intent of allowing users to authenticate themselves without risking the privacy of their voice. However, current privacy-preserving methods assume that the template voice representations (or speaker embeddings) used for authentication are extracted locally by the user. This poses two important issues: first, knowledge of the speaker embedding extraction model may create security and robustness liabilities for the authentication system, as this knowledge might help attackers in crafting adversarial examples able to mislead the system; second, from the point of view of a service provider the speaker embedding extraction model is arguably one of the most valuable components in the system and, as such, disclosing it would be highly undesirable. In this work, we show how speaker embeddings can be extracted while keeping both the speaker's voice and the service provider's model private, using Secure Multiparty Computation. Further, we show that it is possible to obtain reasonable trade-offs between security and computational cost. This work is complementary to those showing how authentication may be performed privately, and thus can be considered as another step towards fully private automatic speaker recognition.
    Comment: Accepted for publication at Interspeech 202

    Assessing User Expertise in Spoken Dialog System Interactions

    Identifying the level of expertise of its users is important for a system, since it can lead to a better interaction through adaptation techniques. Furthermore, this information can be used in offline processes of root cause analysis. However, not much effort has been put into automatically identifying the level of expertise of a user, especially in dialog-based interactions. In this paper we present an approach based on a specific set of task-related features. Based on the distribution of the features among the two classes, Novice and Expert, we used Random Forests as a classification approach. We also used a Support Vector Machine classifier for comparison. By applying these approaches to data from a real system, Let's Go, we obtained preliminary results that we consider positive, given the difficulty of the task and the lack of competing approaches for comparison.
    Comment: 10 page

    Privacy-preserving Automatic Speaker Diarization

    Automatic Speaker Diarization (ASD) is an enabling technology with numerous applications, which deals with recordings of multiple speakers, raising special concerns in terms of privacy. In fact, in remote settings, where recordings are shared with a server, clients relinquish not only the privacy of their conversation, but also of all the information that can be inferred from their voices. However, to the best of our knowledge, the development of privacy-preserving ASD systems has been overlooked thus far. In this work, we tackle this problem using a combination of two cryptographic techniques, Secure Multiparty Computation (SMC) and Secure Modular Hashing, and apply them to the two main steps of a cascaded ASD system: speaker embedding extraction and agglomerative hierarchical clustering. Our system is able to achieve a reasonable trade-off between performance and efficiency, presenting real-time factors of 1.1 and 1.6, for two different SMC security settings.

    Privacy-oriented manipulation of speaker representations

    Speaker embeddings are ubiquitous, with applications ranging from speaker recognition and diarization to speech synthesis and voice anonymisation. The amount of information held by these embeddings lends them versatility, but also raises privacy concerns. Speaker embeddings have been shown to contain information on age, sex, health and more, which speakers may want to keep private, especially when this information is not required for the target task. In this work, we propose a method for removing and manipulating private attributes from speaker embeddings that leverages a Vector-Quantized Variational Autoencoder architecture, combined with an adversarial classifier and a novel mutual information loss. We validate our model on two attributes, sex and age, and perform experiments with ignorant and fully-informed attackers, and with in-domain and out-of-domain data.

    Improving ASR error detection with non-decoder based features

    This study reports error detection experiments in large vocabulary automatic speech recognition (ASR) systems using statistical classifiers. We explored new features gathered from knowledge sources other than the decoder itself: a binary feature that compares the outputs of two different ASR systems word by word; a feature based on the number of hits of the hypothesized bigrams, obtained by queries entered into a very popular Web search engine; and a feature related to automatically inferred topics at sentence and word levels. Experiments were conducted on a European Portuguese broadcast news corpus. The combination of baseline decoder-based features and two of these additional features led to significant improvements, from 13.87% to 12.16% classification error rate (CER) with a maximum entropy model, and from 14.01% to 12.39% CER with linear-chain conditional random fields, compared to a baseline using only decoder-based features.
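The binary agreement feature and the reported gains can be illustrated with a small sketch. It assumes the two ASR hypotheses are aligned with Python's standard `difflib.SequenceMatcher`, which is a stand-in and not necessarily the alignment used in the study; `agreement_feature` and `relative_reduction` are hypothetical helper names.

```python
from difflib import SequenceMatcher


def agreement_feature(hyp_a, hyp_b):
    """Return one flag per word of hyp_a: 1 if the second system agrees, else 0."""
    words_a = hyp_a.split()
    words_b = hyp_b.split()
    flags = [0] * len(words_a)
    # Align the two word sequences; matching blocks mark words both systems share.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            flags[i] = 1
    return flags


def relative_reduction(baseline, improved):
    """Relative error reduction in percent, e.g. for classification error rates."""
    return 100.0 * (baseline - improved) / baseline


flags = agreement_feature("the cat sat on the mat", "the cat sat in the mat")
maxent_gain = relative_reduction(13.87, 12.16)  # maximum entropy model
crf_gain = relative_reduction(14.01, 12.39)     # linear-chain CRF
```

With the figures quoted in the abstract, the absolute drops of 1.71 and 1.62 points correspond to relative CER reductions of roughly 12.3% and 11.6%.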

    Epigenética nas Perturbações de Ansiedade (Epigenetics in Anxiety Disorders)

    Introduction: Anxiety disorders are among the most common psychiatric disorders, characterized by excessive fear, anxiety, and avoidance of threats. They are one of the leading causes of disability worldwide, with a rising prevalence, and are usually associated with comorbidities such as depression or substance abuse. Their pathogenesis is complex, involving interactions between biological factors, environmental influences, and psychological mechanisms. Objectives: This dissertation aims to review the literature on the role of epigenetic factors in the development of anxiety disorders. Methods: The literature search was carried out on PubMed between September 2021 and April 2022. Results: The literature on the influence of epigenetics on the development of anxiety disorders implicates genes that regulate the hypothalamic-pituitary-adrenal axis, neurotransmitter systems, and neuronal plasticity, and several studies suggest that disruption of the expression of these genes through epigenetic mechanisms contributes to the pathogenesis of anxiety disorders. Conclusion: Research on the contribution of epigenetic mechanisms to susceptibility or resilience to anxiety disorders is still in its infancy, and more work is needed to clarify the details in this area.