
    Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN

    Call centers hold large amounts of audio data that can yield valuable business insights, but transcribing phone calls manually is tedious. An effective Automatic Speech Recognition (ASR) system can transcribe these calls accurately, making call history searchable for specific context and content and enabling automatic call monitoring and improved QoS through keyword search and sentiment analysis. ASR for call centers requires extra robustness because telephonic environments are generally noisy. Moreover, many low-resource languages are on the verge of extinction and could be preserved with the help of ASR technology. Urdu is the 10th most widely spoken language in the world, with 231,295,440 speakers worldwide, yet it remains a resource-constrained language in ASR. Regional call-center conversations take place in the local language mixed with English numbers and technical terms, which causes a "code-switching" problem. Hence, this paper describes an implementation framework for a resource-efficient Automatic Speech Recognition/Speech-to-Text system in a noisy call-center environment using Chain Hybrid HMM and CNN-TDNN for code-switched Urdu. The hybrid HMM-DNN approach allowed us to exploit the advantages of neural networks with less labelled data. Adding a CNN to the TDNN has been shown to work better in noisy environments because the CNN's additional frequency dimension captures extra information from noisy speech, thus improving accuracy. We collected data from various open sources and labelled some of the unlabelled data after analysing its general context and content, drawing on Urdu as well as commonly used words from other languages, primarily English. We achieved a WER of 5.2% in both noisy and clean environments, on isolated words or numbers as well as on continuous spontaneous speech. Comment: 32 pages, 19 figures, 2 tables, preprin
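
The abstract above reports its result as a word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the abstract does not say which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are whitespace-separated words, with no further
    normalization (casing, punctuation, numerals) applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a four-word reference transcribed with one wrong word scores 1/4 = 0.25. WER can exceed 1.0 when the hypothesis contains many inserted words.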

    Emotion Recognition from Acted and Spontaneous Speech

    This doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts. The first part describes the proposed approaches for emotion recognition using two different multilingual databases of acted emotional speech. The main contributions of this part are a detailed analysis of a large set of acoustic features, new classification schemes for vocal emotion recognition such as “emotion coupling”, and a new method for mapping discrete emotions into a two-dimensional space. The second part is devoted to emotion recognition using multilingual databases of spontaneous emotional speech, based on telephone recordings obtained from real call centers. The knowledge gained from experiments with emotion recognition from acted speech was exploited to design a new approach for classifying seven emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of the speaker’s emotional state on gender recognition performance and proposes a system for automatic identification of successful phone calls in call centers by means of dialogue features.

    “Transfer Talk” in Talk about Writing in Progress: Two Propositions about Transfer of Learning

    This article tracks the emergence of the concept of “transfer talk” (a concept distinct from transfer of learning) and teases out the implications of transfer talk for theories of transfer of learning. The concept of transfer talk was developed through a systematic examination of 30 writing center transcripts and is defined as “the talk through which individuals make visible their prior learning (in this case, about writing) or try to access the prior learning of someone else.” In addition to including a taxonomy of transfer talk and an analysis of which types occur most often in this set of conferences, this article advances two propositions about the nature of transfer of learning: (1) transfer of learning may have an important social, even collaborative, component and (2) although meta-awareness about writing has long been recognized as valuable for transfer of learning, more automatized knowledge may play an important role as well.

    Graph-based Features for Automatic Online Abuse Detection

    While online communities have become increasingly important over the years, the moderation of user-generated content is still performed mostly manually. Automating this task is an important step in reducing the financial cost associated with moderation, but the majority of automated approaches strictly based on message content are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks based on raw multi-participant chat logs, and we study the contribution of graph features to a classification system that aims to determine if a given message is abusive. The conversational graph-based system yields unexpectedly high performance, with results comparable to those previously obtained with a content-based approach.
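
To make the idea of a conversational network concrete, the following is a minimal sketch of one way to turn a chat log into a weighted directed graph and read off simple structural features for a user. It is not the authors' extraction method, which the abstract does not describe. The sliding-window linking rule, the `window` parameter, and the names `build_conversation_graph` and `node_features` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_conversation_graph(messages, window=3):
    """Build a weighted directed graph from a chronological chat log.

    messages: list of (author, text) tuples, oldest first.
    Assumed linking rule: each message adds one unit of weight from its
    author to every other distinct author seen in the previous `window`
    messages. Message text is deliberately ignored.
    """
    weights = defaultdict(int)
    for i, (author, _text) in enumerate(messages):
        recent = {a for a, _ in messages[max(0, i - window):i] if a != author}
        for other in recent:
            weights[(author, other)] += 1
    return dict(weights)


def node_features(weights, user):
    """Return simple degree and strength features for one user."""
    return {
        "out_degree": len({dst for (src, dst) in weights if src == user}),
        "in_degree": len({src for (src, dst) in weights if dst == user}),
        "out_strength": sum(w for (src, _), w in weights.items() if src == user),
        "in_strength": sum(w for (_, dst), w in weights.items() if dst == user),
    }
```

Because these features depend only on who speaks near whom, they are unaffected by misspellings or other obfuscation of the message text, which is the weakness of content-based approaches that the abstract points out.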

    Satellite-aided mobile communications limited operational test in the trucking industry

    An experiment with NASA's ATS-6 satellite that demonstrates the practicality of satellite-aided land mobile communications is described. Satellite communications equipment for the experiment was designed so that it would be no more expensive, when mass produced, than conventional two-way mobile radio equipment. It embodied the operational features and convenience of present-day mobile radios. Vehicle antennas 75 cm tall and 2 cm in diameter provided good commercial-quality signals to and from trucks and jeeps. Operational applicability and usage data were gathered by installing the radio equipment in five long-haul tractor-trailer trucks and two Air Force search and rescue jeeps. Channel occupancy rates are reported. Air Force personnel found the satellite radio system extremely valuable in their search and rescue mission during maneuvers and actual rescue operations. Propagation data are subjectively analyzed, and over 4 hours of random data are categorized and graded for signal quality on a second-by-second basis. Trends in different topographic regions are reported. An overall communications reliability of 93% was observed despite low satellite elevation angles ranging from 9 to 24 degrees.