4 research outputs found

    Deriving and Exploiting Situational Information in Speech: Investigations in a Simulated Search and Rescue Scenario

    The need for automatic recognition and understanding of speech is emerging in tasks that involve processing large volumes of natural conversations. In application domains such as Search and Rescue, automated systems that extract mission-critical information from speech communications have the potential to make a real difference. Spoken language understanding has commonly been approached by identifying units of meaning (such as sentences, named entities, and dialogue acts) as a basis for further discourse analysis. However, this fine-grained identification of fundamental units of meaning is sensitive to high error rates in the automatic transcription of noisy speech. This thesis demonstrates that topic segmentation and identification techniques can be employed for information extraction from spoken conversations because they are robust to such errors. Two novel topic-based approaches are presented for extracting situational information within the search and rescue context. The first approach shows that identifying changes in the context and content of first responders' reports over time can provide an estimate of their location. The second approach presents a speech-based topological map estimation technique that is inspired, in part, by automatic mapping algorithms commonly used in robotics. The proposed approaches are evaluated on a goal-oriented conversational speech corpus, which was designed and collected based on an abstract communication model between a first responder and a task leader during a search process. Results confirm that a highly imperfect transcription of noisy speech has limited impact on information extraction performance compared with that obtained on the transcription of clean speech data. This thesis also shows that speech recognition accuracy can benefit from rescoring the initial transcription hypotheses based on the derived high-level location information. A new two-pass speech decoding architecture is presented. In this architecture, the location estimate from a first decoding pass is used to dynamically adapt a general language model, which is then used to rescore the initial recognition hypotheses. This decoding strategy results in a statistically significant gain in the recognition accuracy of spoken conversations in high background noise. It is concluded that the techniques developed in this thesis can be extended to more application domains that deal with large volumes of natural spoken conversations.
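The two-pass idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the language model is adapted, so everything below is an assumption made for illustration: unigram models with add-one smoothing, linear interpolation between a general and a location-specific model, and the hypothetical names `unigram_lm` and `rescore`. The toy sentences and first-pass scores are invented.

```python
import math
from collections import Counter

def unigram_lm(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (illustrative only)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def rescore(nbest, general_lm, location_lm, lam=0.5, lm_weight=1.0):
    """Pick the best (text, first_pass_log_score) pair after adding an adapted LM score.

    The adapted model is a linear interpolation of the two models:
    P(w) = lam * P_location(w) + (1 - lam) * P_general(w).
    """
    def lm_logprob(text):
        return sum(
            math.log(lam * location_lm[w] + (1.0 - lam) * general_lm[w])
            for w in text.split()
        )
    return max(nbest, key=lambda hyp: hyp[1] + lm_weight * lm_logprob(hyp[0]))

# Toy data (invented): general text and text typical of a "kitchen" location.
general_text = ["i can see the store", "i can see the stove", "the store is closed"]
kitchen_text = ["i can see the stove", "the stove is on"]
vocab = {w for s in general_text + kitchen_text for w in s.split()}

general_lm = unigram_lm(general_text, vocab)
kitchen_lm = unigram_lm(kitchen_text, vocab)

# First-pass hypotheses with log scores; the first pass slightly prefers "store".
nbest = [("i can see the store", -10.0), ("i can see the stove", -10.2)]

# Second pass: assume the first pass estimated the location as "kitchen".
best_adapted = rescore(nbest, general_lm, kitchen_lm, lam=0.5)
best_general = rescore(nbest, general_lm, kitchen_lm, lam=0.0)
print(best_adapted[0])
print(best_general[0])
```

With `lam=0.0` the location model is ignored and the ranking follows the general model; raising `lam` shifts probability toward words typical of the estimated location, which is what lets the second pass overturn a first-pass error.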

    Learning discrete word embeddings to achieve better interpretability and processing efficiency

    The ubiquitous use of word embeddings in Natural Language Processing is proof of their usefulness and adaptivity to a multitude of tasks. However, their continuous nature is prohibitive in terms of computation, storage and interpretation. In this work, we propose a method of learning discrete word embeddings directly. The model is an adaptation of a novel database searching method using state-of-the-art natural language processing techniques like Transformers and LSTMs. On top of obtaining embeddings requiring a fraction of the resources to store and process, our experiments strongly suggest that our representations learn basic units of meaning in latent space akin to lexical morphemes. We call these units sememes, i.e., semantic morphemes. We demonstrate that our model has great generalization potential and outputs representations showing strong semantic and conceptual relations between related words.

    Automatic chat transcription on a firefighter TETRA Broadcast channel

    Reliable keyword extraction from firefighter radio communication requires a strong automatic speech recognition system. However, real-life data poses several challenges, such as a distorted voice signal, background noise and several different speakers. In this paper, we review our experiences with the PRONTO corpus, which was recorded during a firefighting exercise. We then present benchmarks of our chat transcription system in terms of word error rate. Since a large number of sentences in public safety communication share similar patterns, we also analyse the impact of sentence complexity on system performance, and further investigate how many training utterances are needed for reliable speaker identification in this setting.
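For readers unfamiliar with the metric named in this abstract, word error rate is conventionally the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recogniser output, divided by the number of reference words. The sketch below shows that standard definition; the function name `word_error_rate` and the example sentences are illustrative and are not taken from the paper or the PRONTO corpus.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)

# Invented example: one substitution ("the" -> "a") and one deletion ("the").
wer = word_error_rate(
    "send the ladder to the second floor",
    "send a ladder to second floor",
)
print(f"{wer:.3f}")  # 2 errors over 7 reference words
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.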

    The end of stigma? Understanding the dynamics of legitimisation in the context of TV series consumption

    This research contributes to prior work on stigmatisation by examining stigmatisation and legitimisation as social processes in the context of TV series consumption. Using in-depth interviews, we show that the dynamics of legitimisation are complex and are accompanied by the reproduction of existing stigmas and the creation of new ones.