
    Real Time Animation of Virtual Humans: A Trade-off Between Naturalness and Control

    Virtual humans are employed in many interactive applications using 3D virtual environments, including (serious) games. The motion of such virtual humans should look realistic (or ‘natural’) and allow interaction with the surroundings and other (virtual) humans. Current animation techniques differ in the trade-off they offer between motion naturalness and the control that can be exerted over the motion. We show mechanisms to parametrize, combine (on different body parts) and concatenate motions generated by different animation techniques. We discuss several aspects of motion naturalness and show how it can be evaluated. We conclude by showing the promise of combinations of different animation paradigms to enhance both naturalness and control.

    Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing

    Free-viewpoint video conferencing allows a participant to observe the remote 3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint image is commonly synthesized using two pairs of transmitted texture and depth maps from two neighboring captured viewpoints via depth-image-based rendering (DIBR). To maintain high quality of synthesized images, it is imperative to contain the adverse effects of network packet losses that may arise during texture and depth video transmission. Towards this end, we develop an integrated approach that exploits the representation redundancy inherent in the multiple streamed videos: a voxel in the 3D scene visible to two captured views is sampled and coded twice in the two views. In particular, at the receiver we first develop an error concealment strategy that adaptively blends corresponding pixels in the two captured views during DIBR, so that pixels from the more reliable transmitted view are weighted more heavily. We then couple it with a sender-side optimization of reference picture selection (RPS) during real-time video coding, so that blocks containing samples of voxels that are visible in both views are more error-resiliently coded in one view only, given that adaptive blending will erase errors in the other view. Further, synthesized view distortion sensitivities to texture versus depth errors are analyzed, so that the relative importance of texture and depth code blocks can be computed for system-wide RPS optimization. Experimental results show that the proposed scheme can outperform the use of a traditional feedback channel by up to 0.82 dB on average at 8% packet loss rate, and by as much as 3 dB for particular frames.
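
    The receiver-side idea of weighting the more reliable view more heavily can be illustrated with a minimal sketch. This is not the paper's concealment algorithm: the function name `blend_pixel`, the scalar reliability scores, and the fallback to a plain average are all assumptions made for illustration.

```python
def blend_pixel(left, right, left_reliability, right_reliability):
    """Blend two candidate values for the same synthesized pixel.

    `left` and `right` are the pixel values warped from the two captured
    views; the reliability scores are hypothetical non-negative numbers
    (e.g. lower when a view was hit by packet loss).
    """
    total = left_reliability + right_reliability
    if total == 0:
        # Assumption: with no reliability information, fall back to the mean.
        return (left + right) / 2.0
    w_left = left_reliability / total
    # The more reliable view contributes more to the blended value.
    return w_left * left + (1.0 - w_left) * right
```

    With equal scores the result is the average of the two views; when one view's score drops to zero, the pixel is taken entirely from the other view.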

    A Mimetic Strategy to Engage Voluntary Physical Activity In Interactive Entertainment

    We describe the design and implementation of a vision-based interactive entertainment system that makes use of both involuntary and voluntary control paradigms. Unintentional input to the system from a potential viewer is used to drive attention-getting output and encourage the transition to voluntary interactive behaviour. The iMime system consists of a character animation engine based on the interaction metaphor of a mime performer that simulates non-verbal communication strategies, without spoken dialogue, to capture and hold the attention of a viewer. The system was developed in the context of a project studying care of dementia sufferers. Care for a dementia sufferer can place unreasonable demands on the time and attentional resources of their caregivers or family members. Our study contributes to the eventual development of a system aimed at providing relief to dementia caregivers, while at the same time serving as a source of pleasant interactive entertainment for viewers. The work reported here is also aimed at a more general study of the design of interactive entertainment systems involving a mixture of voluntary and involuntary control. Comment: 6 pages, 7 figures, ECAG08 workshop

    Text-based Editing of Talking-head Video

    Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e. no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression and scene illumination per frame. To edit a video, the user only has to edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation to a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full sentence synthesis.

    Data-driven synthesis of realistic human motion using motion graphs

    Ankara: The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2014. Thesis (Master's) -- Bilkent University, 2014. Includes bibliographical references (leaves 53-56). Realistic human motion is an essential part of a diverse range of media, such as feature films, video games and virtual environments. Motion capture provides realistic human motion data using sensor technology. However, motion capture data is not flexible. This drawback limits the utility of motion capture in practice. In this thesis, we propose a two-stage approach that makes motion capture data reusable to synthesize new motions in real time via motion graphs. In the first stage, starting from a dataset of various motions, we construct a motion graph of similar motion segments and calculate the parameters, such as blending parameters, needed in the second stage. In the second stage, we synthesize a new human motion in real time, depending on the blending technique selected. Three different blending techniques, namely linear blending, cubic blending and anticipation-based blending, are provided to the user. In addition, a motion clip preference approach, applied to the motion search algorithm, enables users to control the motion clip types in the resulting motion. Dirican, Hüseyin. M.S.
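
    The difference between linear and cubic blending at a transition can be sketched as follows. The abstract does not give the thesis's formulas, so the cubic weight used here (the common ease curve 3t^2 - 2t^3) is an assumption, as are the names `linear_weight`, `cubic_weight` and `blend_pose`. Poses are simplified to lists of scalar joint angles; a real system would typically interpolate joint rotations as quaternions.

```python
def linear_weight(t):
    """Blend weight that grows at a constant rate over the transition."""
    return t


def cubic_weight(t):
    """Assumed cubic ease curve: zero slope at both ends of the transition."""
    return 3.0 * t * t - 2.0 * t ** 3


def blend_pose(pose_a, pose_b, t, weight_fn=linear_weight):
    """Blend two poses (lists of scalar joint angles) at progress t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)  # clamp progress to the transition window
    w = weight_fn(t)
    return [(1.0 - w) * a + w * b for a, b in zip(pose_a, pose_b)]
```

    Both weights agree at the start, midpoint and end of the transition; the cubic curve changes more slowly near the ends, which avoids an abrupt change in velocity where the clips meet.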

    Supplementing Frequency Domain Interpolation Methods for Character Animation

    The animation of human characters entails difficulties exceeding those met when simulating objects, machines or plants. A person's gait is a product of nature affected by mood and physical condition. Small deviations from natural movement are perceived with ease by an unforgiving audience. Motion capture technology is frequently employed to record human movement. Subsequent playback on a skeleton underlying the character being animated conveys many of the subtleties of the original motion. Played-back recordings are of limited value, however, when integration in a virtual environment requires movements beyond those in the motion library, creating a need for the synthesis of new motion from pre-recorded sequences. An existing approach involves interpolation between motions in the frequency domain, with a blending space defined by a triangle network whose vertices represent input motions. It is this branch of character animation which is supplemented by the methods presented in this thesis, with work undertaken in three distinct areas. The first is a streamlined approach to previous work. It provides benefits including an efficiency gain in certain contexts, and a very different perspective on triangle network construction in which they become adjustable and intuitive user-interface devices with an increased flexibility allowing a greater range of motions to be blended than was possible with previous networks. Interpolation-based synthesis can never exhibit the same motion variety as can animation methods based on the playback of rearranged frame sequences. Limitations such as this were addressed by the second phase of work, with the creation of hybrid networks. These novel structures use properties of frequency domain triangle blending networks to seamlessly integrate playback-based animation within them. The third area focussed on was distortion found in both frequency- and time-domain blending. A new technique, single-source harmonic switching, was devised which greatly reduces it, and adds to the benefits of blending in the frequency domain.
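
    A blending space defined by a triangle whose vertices are input motions can be illustrated with barycentric weights. The sketch below is an assumption-laden illustration, not the thesis's method: the names `barycentric_weights` and `blend_coefficients` are hypothetical, and each motion is reduced to a short list of per-harmonic coefficients for a single joint.

```python
def barycentric_weights(p, a, b, c):
    """Weights of point p with respect to triangle (a, b, c) in 2D.

    The three weights sum to 1; at a vertex, that vertex's weight is 1.
    """
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def blend_coefficients(weights, coeff_sets):
    """Weighted sum of per-harmonic coefficients, one list per input motion."""
    return [
        sum(w * c for w, c in zip(weights, harmonic))
        for harmonic in zip(*coeff_sets)
    ]
```

    Moving the point `p` inside the triangle changes the weights continuously, so the blended coefficients move smoothly between those of the three vertex motions.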