ANGELICA : choice of output modality in an embodied agent

Author: Theune Mariët
Publication date: 01/01/2001

The ANGELICA project addresses the problem of modality choice in information presentation by embodied, humanlike agents. The output modalities available to such agents include both language and various nonverbal signals such as pointing and gesturing. For each piece of information to be presented by the agent it must be decided whether it should be expressed using language, a nonverbal signal, or both. In the ANGELICA project a model of the different factors influencing this choice will be developed and integrated in a natural language generation system. The application domain is the presentation of route descriptions by an embodied agent in a 3D environment. Evaluation and testing form an integral part of the project. In particular, we will investigate the effect of different modality choices on the effectiveness and naturalness of the generated presentations and on the user's perception of the agent's personality.

University of Twente Research Information

Roadmap for Music Information ReSearch

Author: Benetos E.
Chudy M.
Dixon S.
Flexer A.
Gomez E.
Gouyon F.
Herrera P.
Jorda S.
Magas M.
Paytuvi O.
Peeters G.
Schlüter J.
Serra X.
Vinet H.
Widmer G.
Publication venue: MIRES Consortium
Publication date: 01/01/2013

City Research Online

UPF Digital Repository

Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals

Author: Nagasaka Shogo
Nakashima Ryo
Taniguchi Tadahiro
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 09/03/2016

Human infants can discover words directly from unsegmented speech signals without any explicitly labeled data. In this paper, we develop a novel machine learning method called nonparametric Bayesian double articulation analyzer (NPB-DAA) that can directly acquire language and acoustic models from observed continuous speech signals. For this purpose, we propose an integrative generative model that combines a language model and an acoustic model into a single generative model called the "hierarchical Dirichlet process hidden language model" (HDP-HLM). The HDP-HLM is obtained by extending the hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM) proposed by Johnson et al. An inference procedure for the HDP-HLM is derived using the blocked Gibbs sampler originally proposed for the HDP-HSMM. This procedure enables the simultaneous and direct inference of language and acoustic models from continuous speech signals. Based on the HDP-HLM and its inference procedure, we developed a novel double articulation analyzer. By assuming HDP-HLM as a generative model of observed time series data, and by inferring latent variables of the model, the method can analyze latent double articulation structure, i.e., hierarchically organized latent words and phonemes, of the data in an unsupervised manner. The novel unsupervised double articulation analyzer is called NPB-DAA. The NPB-DAA can automatically estimate double articulation structure embedded in speech signals. We also carried out two evaluation experiments using synthetic data and actual human continuous speech signals representing Japanese vowel sequences. In the word acquisition and phoneme categorization tasks, the NPB-DAA outperformed a conventional double articulation analyzer (DAA) and a baseline automatic speech recognition system whose acoustic model was trained in a supervised manner. Comment: 15 pages, 7 figures, draft submitted to IEEE Transactions on Autonomous Mental Development (TAMD)

arXiv.org e-Print Archive

Affective Medicine: a review of Affective Computing efforts in Medical Informatics

Author: Bamidis Panagiotis
Konstantinidis Evdokimos
Luneski Andrej
Publication venue: Georg Thieme Verlag KG
Publication date: 01/01/2010

Background: Affective computing (AC) is concerned with emotional interactions performed with and through computers. It is defined as “computing that relates to, arises from, or deliberately influences emotions”. AC enables investigation and understanding of the relation between human emotions and health as well as application of assistive and useful technologies in the medical domain. Objectives: 1) To review the general state of the art in AC and its applications in medicine, and 2) to establish synergies between the research communities of AC and medical informatics. Methods: Aspects related to the human affective state as a determinant of human health are discussed, coupled with an illustration of significant AC research and related literature output. Moreover, affective communication channels are described and their range of application fields is explored through illustrative examples. Results: The presented conferences, European research projects and research publications illustrate the recent increase of interest in the AC area by the medical community. Tele-home healthcare, AmI, ubiquitous monitoring, e-learning and virtual communities with emotionally expressive characters for elderly or impaired people are a few areas where the potential of AC has been realized and applications have emerged. Conclusions: A number of gaps can potentially be overcome through the synergy of AC and medical informatics. The application of AC technologies parallels the advancement of the existing state of the art and the introduction of new methods. The amount of work and the number of projects reviewed in this paper bear witness to an ambitious and optimistic synergetic future for the affective medicine field.

Crossref

White Rose Research Online

I Probe, Therefore I Am: Designing a Virtual Journalist with Human Emotions

Author: Bowden Kevin K.
Cengiz Kubra
Ghitulescu Alexandru
Nilsson Tommy
Spencer Christine P.
van Waterschoot Jelte B.
Publication date: 18/05/2017

By utilizing different communication channels, such as verbal language, gestures or facial expressions, virtually embodied interactive humans hold a unique potential to bridge the gap between human-computer interaction and actual interhuman communication. The use of virtual humans is consequently becoming increasingly popular in a wide range of areas where such natural communication might be beneficial, including entertainment, education, mental health research and beyond. Behind this development lies a series of technological advances in a multitude of disciplines, most notably natural language processing, computer vision, and speech synthesis. In this paper we discuss a Virtual Human Journalist, a project employing a number of novel solutions from these disciplines with the goal of demonstrating their viability by producing a humanoid conversational agent capable of naturally eliciting and reacting to information from a human user. A set of qualitative and quantitative evaluation sessions demonstrated the technical feasibility of the system whilst uncovering a number of deficits in its capacity to engage users in a way that would be perceived as natural and emotionally engaging. We argue that naturalness should not always be seen as a desirable goal and suggest that deliberately suppressing the naturalness of virtual human interactions, such as by altering the virtual human's personality cues, might in some cases yield more desirable results. Comment: eNTERFACE16 proceedings

arXiv.org e-Print Archive

University of Twente Research Information

Mobile Audiovisual Terminal: System Design and Subjective Testing in DECT and UMTS networks

Author: Cosmas J
Gill D
Pearmain A
Publication venue: IEEE
Publication date: 01/07/2000

It is anticipated that there will shortly be a requirement for multimedia terminals that operate via mobile communications systems. This paper presents a functional specification for such a terminal operating at 32 kb/s in a digital European cordless telecommunications (DECT) and universal mobile telecommunications system (UMTS) radio network. A terminal has been built, based on a PC with digital signal processor (DSP) boards for audio and video coding and decoding. Speech coding is by a phonetically driven code-excited linear prediction (CELP) speech coder and video coding by a block-oriented hybrid discrete cosine transform (DCT) coder. Separate channel coding is provided for the audio and video data. The paper describes the techniques used for audio and video coding, channel coding, and synchronization. Methods of subjective testing in a DECT network and in a UMTS network are also described. These consisted of subjective tests of first impressions of the mobile audio–visual terminal (MAVT) quality, interactive tests, and the completion of an exit questionnaire. The test results showed that the quality of the audio was sufficiently good for comprehension and the video was sufficiently good for following and repeating simple mechanical tasks. However, the quality of the MAVT was not good enough for general use where high-quality audio and video were needed, especially when transmission was in a noisy radio environment.

Brunel University Research Archive