ANGELICA : choice of output modality in an embodied agent
The ANGELICA project addresses the problem of modality choice in information presentation by embodied, humanlike agents. The output modalities available to such agents include both language and various nonverbal signals such as pointing and gesturing. For each piece of information to be presented by the agent, it must be decided whether it should be expressed using language, a nonverbal signal, or both. In the ANGELICA project a model of the different factors influencing this choice will be developed and integrated into a natural language generation system. The application domain is the presentation of route descriptions by an embodied agent in a 3D environment. Evaluation and testing form an integral part of the project. In particular, we will investigate the effect of different modality choices on the effectiveness and naturalness of the generated presentations and on the user's perception of the agent's personality.
Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals
Human infants can discover words directly from unsegmented speech signals
without any explicitly labeled data. In this paper, we develop a novel machine
learning method called nonparametric Bayesian double articulation analyzer
(NPB-DAA) that can directly acquire language and acoustic models from observed
continuous speech signals. For this purpose, we propose an integrative
generative model that combines a language model and an acoustic model into a
single generative model called the "hierarchical Dirichlet process hidden
language model" (HDP-HLM). The HDP-HLM is obtained by extending the
hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM) proposed by
Johnson et al. An inference procedure for the HDP-HLM is derived using the
blocked Gibbs sampler originally proposed for the HDP-HSMM. This procedure
enables the simultaneous and direct inference of language and acoustic models
from continuous speech signals. Based on the HDP-HLM and its inference
procedure, we developed a novel double articulation analyzer. By assuming
HDP-HLM as a generative model of observed time series data, and by inferring
latent variables of the model, the method can analyze latent double
articulation structure, i.e., hierarchically organized latent words and
phonemes, of the data in an unsupervised manner. The novel unsupervised double
articulation analyzer is called NPB-DAA.
The NPB-DAA can automatically estimate double articulation structure embedded
in speech signals. We also carried out two evaluation experiments using
synthetic data and actual human continuous speech signals representing Japanese
vowel sequences. In the word acquisition and phoneme categorization tasks, the
NPB-DAA outperformed a conventional double articulation analyzer (DAA) and a
baseline automatic speech recognition system whose acoustic model was trained
in a supervised manner.
Comment: 15 pages, 7 figures, Draft submitted to IEEE Transactions on
Autonomous Mental Development (TAMD)
Affective Medicine: a review of Affective Computing efforts in Medical Informatics
Background: Affective computing (AC) is concerned with emotional interactions performed with and through computers. It is defined as “computing that relates to, arises from, or deliberately influences emotions”. AC enables investigation and understanding of the relation between human emotions and health as well as application of assistive and useful technologies in the medical domain. Objectives: 1) To review the general state of the art in AC and its applications in medicine, and 2) to establish synergies between the research communities of AC and medical informatics. Methods: Aspects related to the human affective state as a determinant of human health are discussed, coupled with an illustration of significant AC research and related literature output. Moreover, affective communication channels are described and their range of application fields is explored through illustrative examples. Results: The presented conferences, European research projects and research publications illustrate the recent increase of interest in the AC area by the medical community. Tele-home healthcare, ambient intelligence (AmI), ubiquitous monitoring, e-learning and virtual communities with emotionally expressive characters for elderly or impaired people are a few of the areas where the potential of AC has been realized and applications have emerged. Conclusions: A number of gaps can potentially be overcome through the synergy of AC and medical informatics. The application of AC technologies parallels the advancement of the existing state of the art and the introduction of new methods. The volume of work and projects reviewed in this paper points to an ambitious and optimistic synergetic future for the field of affective medicine.
I Probe, Therefore I Am: Designing a Virtual Journalist with Human Emotions
By utilizing different communication channels, such as verbal language,
gestures or facial expressions, virtually embodied interactive humans hold a
unique potential to bridge the gap between human-computer interaction and
actual interhuman communication. The use of virtual humans is consequently
becoming increasingly popular in a wide range of areas where such a natural
communication might be beneficial, including entertainment, education, mental
health research and beyond. Behind this development lies a series of
technological advances in a multitude of disciplines, most notably natural
language processing, computer vision, and speech synthesis. In this paper we
discuss a Virtual Human Journalist, a project employing a number of novel
solutions from these disciplines with the goal to demonstrate their viability
by producing a humanoid conversational agent capable of naturally eliciting and
reacting to information from a human user. A set of qualitative and
quantitative evaluation sessions demonstrated the technical feasibility of the
system whilst uncovering a number of deficits in its capacity to engage users
in a way that would be perceived as natural and emotionally engaging. We argue
that naturalness should not always be seen as a desirable goal and suggest that
deliberately suppressing the naturalness of virtual human interactions, such as
by altering the virtual human's personality cues, might in some cases yield
more desirable results.
Comment: eNTERFACE16 proceeding
Mobile Audiovisual Terminal: System Design and Subjective Testing in DECT and UMTS networks
It is anticipated that there will shortly be a requirement
for multimedia terminals that operate via mobile
communications systems. This paper presents a functional specification
for such a terminal operating at 32 kb/s in a digital
European cordless telecommunications (DECT) and universal
mobile telecommunications system (UMTS) radio network. A terminal
has been built, based on a PC with digital signal processor
(DSP) boards for audio and video coding and decoding. Speech
coding is by a phonetically driven code-excited linear prediction
(CELP) speech coder and video coding by a block-oriented hybrid
discrete cosine transform (DCT) coder. Separate channel coding
is provided for the audio and video data. The paper describes the
techniques used for audio and video coding, channel coding, and
synchronization. Methods of subjective testing in a DECT network
and in a UMTS network are also described. These consisted of
subjective tests of first impressions of the mobile audio–visual
terminal (MAVT) quality, interactive tests, and the completion
of an exit questionnaire. The test results showed that the quality
of the audio was sufficiently good for comprehension and the
video was sufficiently good for following and repeating simple
mechanical tasks. However, the quality of the MAVT was not
good enough for general use where high-quality audio and video
were needed, especially when transmission was in a noisy radio
environment.