7 research outputs found

    An application of sparse representation in Gaussian mixture models used in speech recognition task

    This thesis proposes a model that approximates full covariance matrices in Gaussian mixture models (GMMs) with a reduced number of parameters and fewer computations for likelihood evaluation. In the proposed model, inverse covariance (precision) matrices are approximated using sparsely represented eigenvectors. A maximum likelihood algorithm for parameter estimation and its practical implementation are presented. Experimental results on a speech recognition task show that, while keeping the word error rate close to that obtained by GMMs with full covariance matrices, the proposed model can reduce the number of parameters by 45%.
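
    As a rough illustration of why an eigenvector-based form of the precision matrix matters for likelihood evaluation, the sketch below computes a Gaussian log-density directly from the eigenvectors and eigenvalues of the precision matrix. It is not the thesis's model or estimation algorithm: the function name, the dense NumPy arrays, and the toy data are all assumptions made for illustration. The point is only that the quadratic term reduces to projections onto the eigenvectors, so eigenvectors with few nonzero entries would make each projection cheaper.

```python
import numpy as np


def gaussian_loglik_from_precision_eig(x, mean, eigvecs, eigvals):
    """Log-density of N(mean, P^{-1}) with P = V diag(eigvals) V^T.

    Hypothetical helper for illustration only.
    eigvecs: (d, d) array whose columns are orthonormal eigenvectors of P.
    eigvals: (d,) array of positive eigenvalues of P.
    """
    d = mean.shape[0]
    # Project the centered vector onto every eigenvector of the precision matrix.
    proj = eigvecs.T @ (x - mean)
    # Quadratic form (x - mean)^T P (x - mean) as a weighted sum of squares.
    quad = np.sum(eigvals * proj ** 2)
    # log|P| is the sum of the log eigenvalues.
    log_det_p = np.sum(np.log(eigvals))
    return 0.5 * (log_det_p - d * np.log(2.0 * np.pi) - quad)


# Toy check against a direct evaluation with the full precision matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
precision = a @ a.T + 4.0 * np.eye(4)  # symmetric positive definite
eigvals, eigvecs = np.linalg.eigh(precision)
x = rng.standard_normal(4)
mean = np.zeros(4)

loglik = gaussian_loglik_from_precision_eig(x, mean, eigvecs, eigvals)
direct = 0.5 * (
    np.linalg.slogdet(precision)[1]
    - 4 * np.log(2.0 * np.pi)
    - (x - mean) @ precision @ (x - mean)
)
```

    Here `loglik` and `direct` agree up to floating-point error. In a sparse variant, `eigvecs` would be stored in a sparse format so that the projection step touches only the nonzero entries.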

    User-Awareness and Adaptation in Conversational Agents

    This paper considers the research question of developing user-aware and adaptive conversational agents. A conversational agent is user-aware to the extent that it recognizes the user's identity and the emotional states that are relevant in a given interaction domain. It is user-adaptive to the extent that it dynamically adapts its dialogue behavior to the user and his/her emotional state. The paper summarizes some aspects of our previous work and presents work in progress in the field of speech-based human-machine interaction. It focuses particularly on the development of speech recognition modules in cooperation with the modules for emotion recognition and speaker recognition, as well as the dialogue management module. Finally, it proposes an architecture of a conversational agent that integrates these modules and improves each of them by exploiting synergies among them.

    The holistic perspective of the INCISIVE project: artificial intelligence in screening mammography

    Finding new ways to cost-effectively facilitate population screening and improve early-stage cancer diagnosis, supported by data-driven AI models, provides unprecedented opportunities to reduce cancer-related mortality. This work presents the INCISIVE project initiative towards enhancing AI solutions for health imaging by unifying, harmonizing, and securely sharing scattered cancer-related data to provide the large datasets that are critically needed to develop and evaluate trustworthy AI models. The solutions adopted by the INCISIVE project are outlined in terms of data collection, harmonization, data sharing, and federated data storage in compliance with legal, ethical, and FAIR principles. Experiences and examples feature breast cancer data integration and mammography collection, indicating the current progress, challenges, and future directions.

    Hybrid methodological approach to context-dependent speech recognition

    Although the importance of contextual information in speech recognition has long been acknowledged, it remains clearly underutilized even in state-of-the-art speech recognition systems. This article introduces a novel, methodologically hybrid approach to the research question of context-dependent speech recognition in human–machine interaction. The approach is hybrid in that it integrates aspects of both the statistical and the representational paradigms. We extend the standard statistical pattern-matching approach with a cognitively inspired and analytically tractable model with explanatory power. This methodological extension makes it possible to account for contextual information that is otherwise unavailable in speech recognition systems, and to use it to improve the post-processing of recognition hypotheses. The article introduces an algorithm for the evaluation of recognition hypotheses, illustrates it for concrete interaction domains, and discusses its implementation within two prototype conversational agents.