10 research outputs found

    Visual Attention and Eye Gaze During Multiparty Conversations with Distractions

    Our objective is to develop a computational model that predicts visual attention behavior for an embodied conversational agent. During interpersonal interaction, gaze provides feedback signals and directs conversation flow; in a dynamic environment, gaze simultaneously directs attention to peripheral movements. An embodied conversational agent should therefore not only employ social gaze for interpersonal interaction but also exhibit human attention characteristics, so that its eyes and facial expression convey appropriate distraction and engagement behaviors.
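
    The abstract above describes arbitrating between conversational gaze and reactions to peripheral events. The sketch below shows one minimal form such an arbitration could take, assuming a single salience score per distraction and a fixed threshold; the function name, scores, and threshold are illustrative assumptions, not the authors' model.

```python
# Hypothetical sketch of gaze-target arbitration for an embodied agent:
# the agent looks at a peripheral event only if its motion salience exceeds
# a threshold, otherwise it keeps conversational gaze on the current speaker.
# All names and numbers are illustrative assumptions.
def choose_gaze_target(speaker_id, distractions, distraction_threshold=0.7):
    """Return ('interlocutor', id) or ('distraction', id) for the next gaze shift.

    `distractions` maps a peripheral event id to a motion-salience score in [0, 1].
    """
    if distractions:
        event, salience = max(distractions.items(), key=lambda kv: kv[1])
        if salience > distraction_threshold:
            return ("distraction", event)
    return ("interlocutor", speaker_id)

if __name__ == "__main__":
    print(choose_gaze_target("speaker_A", {"door_opens": 0.9, "phone_buzz": 0.4}))
    print(choose_gaze_target("speaker_A", {"phone_buzz": 0.4}))
```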

    Enhancing virtual environment-based surgical teamwork training with non-verbal communication

    Virtual reality simulations for training surgical skills are increasingly used in medical education and have been shown to improve patient outcomes. While advances in hardware and simulation techniques have resulted in many commercial applications for training technical skills, most of these simulators are extremely expensive and do not address non-technical skills such as teamwork and communication. This is a major drawback, since recent research suggests that a large percentage of mistakes in clinical settings are due to communication problems. In addition, training teamwork can improve the efficiency of a surgical team and thus reduce costs and workload. We present an inexpensive camera-based system for capturing aspects of the non-verbal communication of users participating in virtual environment-based teamwork simulations. These data can be used to enhance virtual environment-based simulations and increase the realism and effectiveness of team communication.

    Measuring and modeling the perception of natural and unconstrained gaze in humans and machines

    Humans are remarkably adept at interpreting the gaze direction of other individuals in their surroundings. This skill is at the core of the ability to engage in joint visual attention, which is essential for establishing social interactions. How accurate are humans in determining the gaze direction of others in lifelike scenes, when they can move their heads and eyes freely, and what are the sources of information for the underlying perceptual processes? These questions pose a challenge from both empirical and computational perspectives, due to the complexity of the visual input in real-life situations. Here we measure empirically human accuracy in perceiving the gaze direction of others in lifelike scenes, and study computationally the sources of information and representations underlying this cognitive capacity. We show that humans perform better in face-to-face conditions than in recorded conditions, and that this advantage is not due to the availability of input dynamics. We further show that humans still perform well when only the eye region is visible, rather than the whole face. We develop a computational model that replicates the pattern of human performance, including the finding that the eye region on its own contains the information required for estimating both head orientation and direction of gaze. Consistent with neurophysiological findings on task-specific face regions in the brain, the learned computational representations reproduce perceptual effects such as the Wollaston illusion when trained to estimate direction of gaze, but not when trained to recognize objects or faces. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
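
    As a rough illustration of the computational side of this work, the sketch below shows a small convolutional regressor that maps an eye-region crop to gaze and head angles, reflecting the finding that the eye region alone can support both estimates. The architecture, input size, and output layout are assumptions made for illustration; the paper's actual model is not reproduced here.

```python
# Minimal sketch (assumed architecture, not the authors' model): a small CNN
# that regresses gaze yaw/pitch and head yaw/pitch from a grayscale
# eye-region crop, illustrating that one input region can feed both estimates.
import torch
import torch.nn as nn

class EyeRegionGazeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        # Output layout (an assumption): [gaze_yaw, gaze_pitch, head_yaw, head_pitch] in radians.
        self.regressor = nn.Linear(64, 4)

    def forward(self, x):
        return self.regressor(self.features(x).flatten(1))

if __name__ == "__main__":
    net = EyeRegionGazeNet()
    crop = torch.randn(1, 1, 36, 96)   # stand-in grayscale eye-region crop (H=36, W=96)
    print(net(crop))                   # four angle estimates (untrained, random)
```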

    Nonverbal behaviours improving a simulation of small group discussion

    This paper reports on the development of a multi-agent simulation of small group discussion that focuses on the interaction and coordination of turn-taking. We describe the addition of nonverbal behaviours, such as gaze, gestures, posture shifts, and head and facial expressions, to the model, and how the agents in the simulation take these behaviours into account in their decisions to speak, to stop, or to give feedback. The simulation is to be evaluated by comparing its statistical profile against the statistics generated by a simpler base model without the nonverbal behaviours, to show that it better approximates the statistics of a real group discussion. The properties to be assessed include mean transition intervals, turn lengths, the relation of gaze to speaking order, and the frequency of simultaneous starts and of feedback.
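
    The following sketch illustrates, under strong simplifying assumptions, the kind of gaze-aware turn-taking simulation and statistics (mean turn length, simultaneous starts) the abstract describes. The yield and start probabilities are invented for illustration and are not the paper's model.

```python
# Illustrative sketch of a gaze-aware turn-taking simulation: at each step the
# current speaker may yield while gazing at an intended next speaker, who is
# then more likely to start talking than the others. Probabilities and the
# collected statistics are assumptions, not the paper's parameters.
import random

def simulate(n_agents=4, steps=1000, seed=0):
    rng = random.Random(seed)
    speaker, gaze_target = 0, 1
    turn_lengths, current_turn, simultaneous_starts = [], 0, 0
    for _ in range(steps):
        current_turn += 1
        if rng.random() < 0.1:                      # speaker decides to yield
            turn_lengths.append(current_turn)
            current_turn = 0
            # Gazed-at agent is more likely to take the floor than the others.
            starters = [a for a in range(n_agents) if a != speaker
                        and rng.random() < (0.8 if a == gaze_target else 0.2)]
            if len(starters) > 1:
                simultaneous_starts += 1            # overlapping start
            speaker = rng.choice(starters) if starters else speaker
            gaze_target = rng.choice([a for a in range(n_agents) if a != speaker])
    mean_turn = sum(turn_lengths) / max(len(turn_lengths), 1)
    return mean_turn, simultaneous_starts

if __name__ == "__main__":
    print(simulate())   # (mean turn length in steps, number of simultaneous starts)
```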

    Lip syncing method for realistic expressive three-dimensional face model

    Lip synchronization of 3D face models is now used in a multitude of important fields. It brings a more human and dramatic reality to computer games, films and interactive multimedia, and is growing in use and importance. A high level of realism is demanded in applications such as computer games and cinema. Authoring lip syncing with complex and subtle expressions remains difficult and fraught with problems in terms of realism. Thus, this study proposes a lip syncing method for a realistic, expressive 3D face model. Animated lips require a 3D face model capable of representing the movement of face muscles during speech, and a method to produce the correct lip shape at the correct time. The 3D face model is designed based on the MPEG-4 facial animation standard to support lip syncing aligned with an input audio file. It deforms using a Raised Cosine Deformation function grafted onto the input facial geometry. This study also proposes a method to animate the 3D face model over time to create animated lip syncing, using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. Finally, this study integrates emotions by considering both the Ekman model and Plutchik's wheel, together with emotive eye movements implemented via the Emotional Eye Movements Markup Language, to produce a realistic 3D face model. The experimental results show that the proposed model can generate visually satisfactory animations, with Mean Square Errors of 0.0020 for the neutral expression, 0.0024 for happy, 0.0020 for angry, 0.0030 for fear, 0.0026 for surprise, 0.0010 for disgust, and 0.0030 for sad.
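
    The sketch below illustrates a raised-cosine deformation of mesh vertices around a lip control point, in the spirit of the deformation function described above. The control point, displacement, and falloff radius are hypothetical, and the MPEG-4 FAP mapping and viseme timing of the study are not reproduced.

```python
# Sketch of a raised-cosine falloff deformation applied to face-mesh vertices
# around a single lip control point: vertices closer to the control point move
# nearly the full displacement, and the influence falls smoothly to zero at
# the chosen radius. Control point, displacement and radius are hypothetical.
import numpy as np

def raised_cosine_deform(vertices, center, displacement, radius):
    """Move vertices near `center` by `displacement`, weighted by a raised
    cosine that falls from 1 at the center to 0 at `radius`."""
    dist = np.linalg.norm(vertices - center, axis=1)
    weight = np.where(dist < radius,
                      0.5 * (1.0 + np.cos(np.pi * dist / radius)),
                      0.0)
    return vertices + weight[:, None] * displacement

if __name__ == "__main__":
    verts = np.random.rand(100, 3)                 # stand-in face mesh vertices
    lip_corner = np.array([0.5, 0.3, 0.6])         # hypothetical control point
    pull = np.array([0.0, -0.02, 0.0])             # small downward pull (mouth opening)
    deformed = raised_cosine_deform(verts, lip_corner, pull, radius=0.15)
    print(deformed.shape)
```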

    Rendering and display for multi-viewer tele-immersion

    Video teleconferencing systems are widely deployed for business, education and personal use to enable face-to-face communication between people at distant sites. Unfortunately, the two-dimensional video of conventional systems does not correctly convey several important non-verbal communication cues, such as eye contact and gaze awareness. Tele-immersion refers to technologies aimed at providing distant users with a more compelling sense of remote presence than conventional video teleconferencing. This dissertation is concerned with the particular challenges of interaction between groups of users at remote sites. The problems of video teleconferencing are exacerbated when groups of people communicate. Ideally, a group tele-immersion system would display views of the remote site at the right size and location, from the correct viewpoint for each local user. However, it is not practical to put a camera in every possible eye location, and it is not clear how to provide each viewer with correct and unique imagery. I introduce rendering techniques and multi-view display designs to support eye contact and gaze awareness between groups of viewers at two distant sites. With a shared 2D display, virtual camera views can improve local spatial cues while preserving scene continuity, by rendering the scene from novel viewpoints that may not correspond to a physical camera. I describe several techniques, including a compact light field, a plane sweeping algorithm, a depth-dependent camera model, and video-quality proxies, suitable for producing useful views of a remote scene for a group of local viewers. The first novel display provides simultaneous, unique monoscopic views to several users, with fewer user position restrictions than existing autostereoscopic displays. The second is a random hole barrier autostereoscopic display that eliminates the viewing zones and user position requirements of conventional autostereoscopic displays, and provides unique 3D views for multiple users in arbitrary locations.
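
    The sketch below illustrates, in simplified 1D geometry, the random hole barrier idea mentioned above: holes are placed at random in a barrier a small gap in front of the display, and each display pixel a viewer can see through some hole is assigned that viewer's image. All dimensions and the conflict-handling rule are illustrative assumptions, not the dissertation's implementation.

```python
# Simplified 1D sketch of random hole barrier pixel assignment: for each
# viewer, rays from the eye through each random barrier hole hit the display
# plane, and the pixel hit there is assigned to that viewer. Uses a far-viewer
# approximation for the ray intersection; conflicts keep the first assignment.
import random

def assign_pixels(n_pixels=200, hole_prob=0.1, gap=5.0, viewer_dist=600.0,
                  viewer_xs=(-150.0, 150.0), pixel_pitch=1.0, seed=0):
    rng = random.Random(seed)
    holes = [i * pixel_pitch for i in range(n_pixels) if rng.random() < hole_prob]
    assignment = {}                      # display pixel index -> viewer index
    for v, vx in enumerate(viewer_xs):
        for hx in holes:
            # Ray from viewer eye through the hole hits the display roughly here
            # (gap << viewer_dist approximation).
            px = hx + (hx - vx) * gap / viewer_dist
            idx = round(px / pixel_pitch)
            if 0 <= idx < n_pixels:
                assignment.setdefault(idx, v)   # conflicting pixels: first viewer wins
    return assignment

if __name__ == "__main__":
    a = assign_pixels()
    print(len(a), "pixels assigned;", sum(1 for v in a.values() if v == 0), "to viewer 0")
```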

    Effects of Gaze on Multiparty Mediated Communication

    We evaluated the effects of gaze direction and other nonverbal visual cues on multiparty mediated communication. Groups of three participants (two actors, one subject) solved language puzzles in three audiovisual communication conditions. Each condition presented a different selection of images of the actors to subjects: (1) frontal motion video; (2) motion video with gaze directional cues; (3) still images with gaze directional cues. Results show that subjects used twice as many deictic references to persons when head orientation cues were present. We also found a linear relationship between the amount of actor gaze perceived by subjects and the number of speaking turns taken by subjects. Lack of gaze can decrease the turn-taking efficiency of multiparty mediated systems by 25%. This is because gaze conveys whether one is being addressed or expected to speak, and is used to regulate social intimacy. Support for gaze directional cues in multiparty mediated systems is therefore recommended.