
    Empathic Agent Technology (EAT)

    A new view on empathic agents is introduced, named Empathic Agent Technology (EAT). It incorporates a speech analysis that provides an indication of the amount of tension present in people. It is founded on an indirect physiological measure of experienced stress, defined as the variability of the fundamental frequency of the human voice. A thorough review of the literature on which the EAT is founded is provided. In addition, the complete processing line of this measure is introduced. Hence, the first generally applicable, completely automated technique is introduced that enables the development of truly empathic agents.
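The stress proxy described above, the variability of the voice's fundamental frequency (F0), can be sketched in a few lines. This is a minimal illustration, not the paper's actual processing line: it assumes a simple autocorrelation pitch estimator, a plausible F0 search range of 75-400 Hz, and the standard deviation of per-frame F0 as the variability measure.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
    # Autocorrelation-based pitch estimate for a single frame.
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(corr[lo:hi])
    return sr / lag

def f0_variability(signal, sr, frame_len=1024, hop=512):
    # Standard deviation of per-frame F0 estimates -- the stress proxy.
    f0s = [estimate_f0(signal[i:i + frame_len], sr)
           for i in range(0, len(signal) - frame_len, hop)]
    return float(np.std(f0s))

sr = 16000
t = np.arange(sr) / sr                     # one second of synthetic "voice"
steady = np.sin(2 * np.pi * 150 * t)       # flat 150 Hz tone: low variability
wobbly = np.sin(2 * np.pi * 150 * t + 4 * np.sin(2 * np.pi * 4 * t))
print(f0_variability(steady, sr), f0_variability(wobbly, sr))
```

On the synthetic tones, the frequency-modulated signal yields a clearly higher variability than the steady one, which is the direction of effect the EAT measure relies on.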

    Execution of a Voice-Based Attendance System

    Speech recognition is the methodology of automatically recognizing a word spoken by a particular speaker, based on individual information contained in the speech waves. This makes it possible to use the speaker's voice to verify his or her identity and to give controlled access to services such as voice-based biometrics, database access, voice-based dialling, voicemail, and remote access to computers. The speech-processing front end that extracts the feature set is a critical stage in any voice recognition system. The ideal feature set has still not been settled, despite the extensive efforts of researchers. There are many kinds of features, derived in different ways, which strongly affect the recognition rate. This project presents one strategy for extracting features from a voice signal that can be used in a speech recognition system. The key is to convert the speech wave into some kind of parametric representation (at a considerably lower data rate) for further analysis and processing. This is frequently known as the voice-processing front end. A wide range of possibilities exists for parametrically representing the speech signal for the speaker recognition task, for example Mel-Frequency Cepstrum Coefficients (MFCC), Linear Prediction Coding (LPC), and others. MFCC is perhaps the best known and most widely used, and it is the representation adopted in this project. MFCCs are based on the known variation of the human ear's critical bandwidths with frequency: filters spaced linearly at low frequencies and logarithmically at high frequencies are used to capture the phonetically important characteristics of speech. Another key characteristic of speech is that it is quasi-stationary, i.e. short-time stationary, which is studied and analyzed using short-time, frequency-domain analysis.
    In this project work, I have built a straightforward yet complete and representative automatic speaker recognition (ASR) framework, applied to a voice-based attendance system, i.e., a speech-based access control system. To achieve this, I first carried out a comparative study of the MFCC approach against the time-domain approach, simulating both strategies in MATLAB 7.0 and investigating the consistency of recognition with each. The voice-based attendance system is based on isolated, single-word recognition. A particular speaker utters the password once in the training session, so as to train on and store the features of the access word. In the testing session, the speaker utters the password again to achieve recognition if there is a match. The feature vectors unique to that speaker are acquired in the training stage, and these are later used to grant authentication to the same speaker when he or she utters the same word in the testing stage. At this stage an intruder can also probe the framework, testing its inherent security feature by uttering the same word.
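The MFCC front end described above (mel-spaced triangular filters over the short-time power spectrum, followed by a log and a DCT) can be sketched from scratch. This is a minimal NumPy illustration, not the project's MATLAB code; the frame sizes, filter count, and mean-vector Euclidean distance used for the template match are assumptions chosen for brevity.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters with centers spaced linearly on the mel scale,
    # i.e. linear at low frequencies, logarithmic at high frequencies.
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def mfcc(signal, sr, n_fft=512, hop=256, n_filters=26, n_coeffs=13):
    # Frame, window, power spectrum, mel filterbank, log, DCT-II.
    frames = [signal[i:i + n_fft] * np.hamming(n_fft)
              for i in range(0, len(signal) - n_fft, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_e = np.log(np.maximum(power @ mel_filterbank(n_filters, n_fft, sr).T,
                              1e-10))
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_e @ dct.T                  # one MFCC row per frame

def match_distance(template, test):
    # Crude stand-in for the matching step: Euclidean distance
    # between the mean MFCC vectors of the stored and test utterances.
    return float(np.linalg.norm(template.mean(0) - test.mean(0)))
```

In a password-style setup, the training utterance's MFCCs would be stored as the template and a test utterance accepted when `match_distance` falls below a threshold (the threshold itself is application-specific and not shown here).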

    Online backchannel synthesis evaluation with the switching Wizard of Oz

    In this paper, we evaluate a backchannel synthesis algorithm in an online conversation between a human speaker and a virtual listener. We adopt the Switching Wizard of Oz (SWOZ) approach to assess behavior synthesis algorithms online. A human speaker watches a virtual listener that is either controlled by a human listener or by an algorithm. The source switches at random intervals. Speakers indicate when they feel they are no longer talking to a human listener. Analysis of these responses reveals patterns of inappropriate behavior in terms of quantity and timing of backchannels.
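The core of the SWOZ protocol is the random-interval switching between human and algorithmic control of the listener. A minimal sketch of that scheduling logic, with segment-length bounds that are assumptions (the paper does not fix them here), might look like:

```python
import random

def swoz_schedule(duration_s, min_seg=5.0, max_seg=15.0, seed=42):
    # Alternate the listener source at random intervals, as in the
    # Switching Wizard of Oz setup; segment bounds are illustrative.
    rng = random.Random(seed)
    source = rng.choice(["human", "algorithm"])
    t, segments = 0.0, []
    while t < duration_s:
        end = min(t + rng.uniform(min_seg, max_seg), duration_s)
        segments.append((t, end, source))
        t = end
        source = "algorithm" if source == "human" else "human"
    return segments

def source_at(segments, t):
    # Which source controlled the virtual listener at time t?
    # Used when aligning speakers' "no longer human" flags with segments.
    for start, end, src in segments:
        if start <= t < end:
            return src
    return segments[-1][2]
```

Aligning the timestamps of speakers' button presses against `source_at` is what lets the analysis attribute "not human" judgments to specific stretches of synthesized behavior.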

    The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings

    We motivate and describe a new freely available human-human dialogue dataset for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-character variant of the DiET chat tool (Healey et al., 2003; Mills and Healey, submitted) with a novel task, where a Learner needs to learn invented visual attribute words (such as "burchak" for square) from a tutor. As such, the text-based interactions closely resemble face-to-face conversation and thus contain many of the linguistic phenomena encountered in natural, spontaneous dialogue. These include self- and other-correction, mid-sentence continuations, interruptions, overlaps, fillers, and hedges. We also present a generic n-gram framework for building user (i.e. tutor) simulations from this type of incremental data, which is freely available to researchers. We show that the simulations produce outputs that are similar to the original data (e.g. 78% turn match similarity). Finally, we train and evaluate a Reinforcement Learning dialogue control agent for learning visually grounded word meanings, trained from the BURCHAK corpus. The learned policy shows comparable performance to a rule-based system built previously. Comment: 10 pages, The 6th Workshop on Vision and Language (VL'17).
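The idea of an n-gram user simulation over character-by-character incremental data can be sketched with a toy character-level model. This is an illustrative simplification, not the paper's actual framework: the model order, the padding symbols, and the tiny training turns are all assumptions made for the example.

```python
import random
from collections import defaultdict

def train_char_ngram(turns, n=3):
    # Character-level n-gram model over a list of (tutor) turns.
    # "^" pads the start of a turn, "$" marks its end.
    model = defaultdict(list)
    for turn in turns:
        padded = "^" * (n - 1) + turn + "$"
        for i in range(len(padded) - n + 1):
            model[padded[i:i + n - 1]].append(padded[i + n - 1])
    return model

def simulate_turn(model, n=3, seed=0, max_len=60):
    # Sample a new turn character by character, mirroring the
    # incremental, character-by-character nature of the chat-tool data.
    rng = random.Random(seed)
    ctx, out = "^" * (n - 1), []
    while len(out) < max_len:
        nxt = rng.choice(model[ctx])
        if nxt == "$":
            break
        out.append(nxt)
        ctx = ctx[1:] + nxt
    return "".join(out)

turns = ["this is a burchak", "is this a burchak?", "yes, a burchak"]
model = train_char_ngram(turns)
print(simulate_turn(model))
```

A simulation like this can then be compared against held-out tutor turns (the paper reports, for its own framework, around 78% turn match similarity) and used to drive a Reinforcement Learning dialogue agent in place of a live tutor.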