4,579 research outputs found
Speech-driven Animation with Meaningful Behaviors
Conversational agents (CAs) play an important role in human-computer
interaction. Creating believable movements for CAs is challenging, since the
movements have to be meaningful and natural, reflecting the coupling between
gestures and speech. Studies in the past have mainly relied on rule-based or
data-driven approaches. Rule-based methods focus on creating meaningful
behaviors conveying the underlying message, but the gestures cannot be easily
synchronized with speech. Data-driven approaches, especially speech-driven
models, can capture the relationship between speech and gestures. However, they
create behaviors disregarding the meaning of the message. This study proposes
to bridge the gap between these two approaches, overcoming their limitations.
The approach builds a dynamic Bayesian network (DBN), where a discrete variable
is added to constrain the behaviors on the underlying constraint. The study
implements and evaluates the approach with two constraints: discourse functions
and prototypical behaviors. By constraining on the discourse functions (e.g.,
questions), the model learns the characteristic behaviors associated with a
given discourse class, learning the rules from the data. By constraining on
prototypical behaviors (e.g., head nods), the approach can be embedded in a
rule-based system as a behavior realizer creating trajectories that are timely
synchronized with speech. The study proposes a DBN structure and a training
approach that (1) models the cause-effect relationship between the constraint
and the gestures, (2) initializes the state configuration models increasing the
range of the generated behaviors, and (3) captures the differences in the
behaviors across constraints by enforcing sparse transitions between shared and
exclusive states per constraint. Objective and subjective evaluations
demonstrate the benefits of the proposed approach over an unconstrained model.
Comment: 13 pages, 12 figures, 5 tables
Towards virtual communities on the Web: Actors and audience
We report on ongoing research in a virtual
reality environment where visitors can interact with
agents that help them to obtain information, to perform
certain transactions and to collaborate with them in order
to get some tasks done. Our environment models a
theatre in our hometown. We discuss attempts to let this
environment evolve into a theatre community where we
do not only have goal-directed visitors, but also visitors
that are not sure whether they want to buy or just
want information or visitors who just want to look
around. It is shown that we need a multi-user and multiagent
environment to realize our goals. Since our environment
models a theatre it is also interesting to investigate
the roles of performers and audience in this environment.
For that reason we discuss capabilities and personalities of agents. Some notes on the historical development of networked communities are included
Virtual Meeting Rooms: From Observation to Simulation
Much working time is spent in meetings and, as a consequence, meetings have become the subject of multidisciplinary research. Virtual Meeting Rooms (VMRs) are 3D virtual replicas of meeting rooms, where various modalities such as speech, gaze, distance, gestures and facial expressions can be controlled. This allows VMRs to be used to improve remote meeting participation, to visualize multimedia data and as an instrument for research into social interaction in meetings. This paper describes how these three uses can be realized in a VMR. We describe the process from observation through annotation to simulation and a model that describes the relations between the annotated features of verbal and non-verbal conversational behavior.
As an example of social perception research in the VMR, we describe an experiment to assess human observers' accuracy for head orientation
Refining personal and social presence in virtual meetings
Virtual worlds show promise for conducting meetings and conferences without the need for physical travel. Current experience suggests the major limitation to the more widespread adoption and acceptance of virtual conferences is the failure of existing environments to provide a sense of immersion and engagement, or of 'being there'. These limitations are largely related to the appearance and control of avatars, and to the absence of means to convey non-verbal cues of facial expression and body language. This paper reports on a study involving the use of a mass-market motion sensor (Kinect™) and the mapping of participant action in the real world to avatar behaviour in the virtual world. This is coupled with full-motion video representation of participants' faces on their avatars to resolve both identity and facial expression issues. The outcomes of a small-group trial meeting based on this technology show a very positive reaction from participants, and the potential for further exploration of these concepts
Experimenting with the Gaze of a Conversational Agent
We have carried out a pilot experiment to investigate the effects of different eye gaze behaviors of a cartoon-like talking face on the quality of human-agent dialogues. We compared a version of the talking face that roughly implements some patterns of humanlike behavior with two other versions. We called this the optimal version. In one of the other versions the shifts in gaze were kept minimal and in the other version the shifts would occur randomly. The talking face has a number of restrictions. There is no speech recognition, so questions and replies have to be typed in by the users of the systems. Despite this restriction we found that participants that conversed with the optimal agent appreciated the agent more than participants that conversed with the other agents. Conversations with the optimal version proceeded more efficiently. Participants needed less time to complete their task