Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
Automatically evaluating the quality of dialogue responses for unstructured
domains is a challenging problem. Unfortunately, existing automatic evaluation
metrics are biased and correlate very poorly with human judgements of response
quality. Yet having an accurate automatic evaluation procedure is crucial for
dialogue research, as it allows rapid prototyping and testing of new models
with fewer expensive human evaluations. In response to this challenge, we
formulate automatic dialogue evaluation as a learning problem. We present an
evaluation model (ADEM) that learns to predict human-like scores for input
responses, using a new dataset of human response scores. We show that the ADEM
model's predictions correlate significantly, and at a level much higher than
word-overlap metrics such as BLEU, with human judgements at both the utterance
and system-level. We also show that ADEM can generalize to evaluating dialogue
models unseen during training, an important step for automatic dialogue
evaluation. Comment: ACL 201
Semi-Autonomous Avatars: A New Direction for Expressive User Embodiment
Computer animated characters are rapidly becoming a regular part of our lives. They are starting to take the place of actors in films and television and are now an integral part of most computer games. Perhaps most interestingly, in on-line games and chat rooms they represent the user visually in the form of avatars, becoming our on-line identities, our embodiments in a virtual world. Currently, on-line environments such as “Second Life” are being taken up by people who would not traditionally have considered playing games, largely due to a greater emphasis on social interaction. These environments require avatars that are more expressive and that can make on-line social interactions seem more like face-to-face conversations.
Computer animated characters come in many different forms. Film characters require a substantial amount of off-line animator effort to achieve high levels of quality; these techniques are not suitable for real-time applications and are not the focus of this chapter. Non-player characters (typically the bad guys) in games use limited artificial intelligence to react autonomously to events in real time. Avatars, however, are completely controlled by their users, reacting to events solely through user commands. This chapter will discuss the distinction between fully autonomous characters and completely controlled avatars, and how the current differentiation may no longer be useful, given that avatar technology may need to include more autonomy to live up to the demands of mass appeal. We will first discuss the two categories and present reasons to combine them. We will then describe previous work in this area and finally present our own framework for semi-autonomous avatars.
Narrative skills in adolescents with a history of SLI in relation to non-verbal IQ scores
There is a debate about whether the language of children with primary language disorders and normal cognitive levels is qualitatively different from that of children with language impairments who have low or borderline non-verbal IQ (NVIQ). As children reach adolescence, this distinction may be even harder to ascertain, especially in naturalistic settings. Narrative may provide a useful, ecologically valid way in which to assess the language ability of adolescents with specific language impairment (SLI) who have intact or lowered NVIQ and to determine whether there is any discernible difference in everyday language. Nineteen adolescents with a history of SLI completed two narrative tasks: a storytelling condition and a conversational condition. Just under half the group (n = 8) had non-verbal IQs below 85. The remaining 11 had NVIQs in the normal range or above. Four areas of narrative (productivity, syntax, cohesion and performance) were assessed. There were no differences between the groups on standardized tests of language. However, the group with low NVIQ was poorer on most aspects of narrative, suggesting that cognitive level is important, even when language is the primary disorder. The groups showed similar patterns of differences between storytelling and conversational narrative. It was concluded that adolescents with a history of SLI and poor cognitive levels have poorer narrative skills than those with normal range NVIQ, even though these may not be detected by standardized assessment. Their difficulties present as qualitatively similar to those with normal range NVIQ, and narratives appear impoverished rather than inaccurate.
Character expression for spoken dialogue systems with semi-supervised learning using Variational Auto-Encoder
The character of a spoken dialogue system is important not only for giving a positive impression of the system but also for building rapport with users. We have proposed a character expression model for spoken dialogue systems. The model expresses three character traits (extroversion, emotional instability, and politeness) of spoken dialogue systems by controlling spoken dialogue behaviors: utterance amount, backchannel, filler, and switching pause length. One major problem in training this model is that it is costly and time-consuming to collect large amounts of paired data of character traits and behaviors. To address this problem, semi-supervised learning is proposed based on a variational auto-encoder that exploits both the limited amount of labeled paired data and unlabeled corpus data. It was confirmed that the proposed model can express given characters more accurately than a baseline model with only supervised learning. We also implemented the character expression model in a spoken dialogue system for an autonomous android robot, and then conducted a subjective experiment with 75 university students to confirm the effectiveness of the character expression for specific dialogue scenarios. The results showed that expressing a character in accordance with the dialogue task by the proposed model improves the user’s impression of the appropriateness in formal dialogue such as job interviews.