
    The Many Moods of Emotion

    This paper presents a novel approach to the facial expression generation problem. Building upon the assumption of the psychological community that emotion is intrinsically continuous, we first design our own continuous emotion representation with a 3-dimensional latent space derived from a neural network trained on discrete emotion classification. The resulting representation can be used to annotate large in-the-wild datasets, which are later used to train a Generative Adversarial Network. We first show that our model is able to map back to discrete emotion classes with objectively and subjectively better image quality than usual discrete approaches, and also that we are able to span the larger space of possible facial expressions, generating the many moods of emotion. Moreover, two axes in this space can be found that generate expression changes similar to those of traditional continuous representations such as arousal-valence. Finally, we show from visual interpretation that the third remaining dimension is highly related to the well-known dominance dimension from psychology.
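
    A minimal sketch of the representation idea described above (the class, layer sizes, and emotion count below are hypothetical illustrations, not the paper's actual architecture): a classifier trained on discrete emotion labels is given a 3-dimensional bottleneck, and that bottleneck activation serves as the continuous emotion code later used to annotate images and condition a GAN.

```python
import torch
import torch.nn as nn

class EmotionEncoder(nn.Module):
    """Hypothetical classifier with a 3-D bottleneck: trained on discrete
    emotion classification, its bottleneck doubles as a continuous emotion
    code (roughly spanning valence/arousal/dominance-like axes)."""
    def __init__(self, num_classes=7):            # e.g. 7 basic emotions
        super().__init__()
        self.features = nn.Sequential(            # stand-in backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_latent = nn.Linear(64, 3)         # 3-D continuous emotion space
        self.classifier = nn.Linear(3, num_classes)

    def forward(self, x):
        z = self.to_latent(self.features(x))      # continuous annotation
        return z, self.classifier(z)              # logits used for training

# After training with cross-entropy on discrete labels, z can annotate
# in-the-wild images; a conditional GAN is then trained on (image, z) pairs.
encoder = EmotionEncoder()
z, logits = encoder(torch.randn(1, 3, 64, 64))
print(z.shape, logits.shape)  # torch.Size([1, 3]) torch.Size([1, 7])
```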

    Optimizing Network Coding Algorithms for Multiple Applications

    Deviating from the archaic communication approach of treating information as a fluid moving through pipes, Network Coding (NC) suggests that the optimal throughput of a multicast network can be achieved by processing information at individual network nodes. However, existing challenges in harnessing the advantages of NC for practical applications have prevented NC from developing into an effective solution for increasing the performance of practical communication networks. In response, the research work presented in this thesis proposes cross-layer NC solutions to increase the network throughput of data multicast as well as the video quality of video multicast applications. First, three algorithms are presented to improve the throughput of NC-enabled networks by minimizing the NC coefficient vector overhead, optimizing the NC redundancy allocation and improving the robustness of NC against bursty packet losses. Considering the fact that the majority of network traffic is video, the rest of the proposed NC algorithms are content-aware and are optimized for both data and video multicast applications. A set of content- and network-aware optimization algorithms, which allocate redundancies for NC considering content properties as well as the network status, are proposed to efficiently multicast data and video across content delivery networks. Furthermore, content- and channel-aware joint channel and network coding algorithms are proposed to efficiently multicast data and video across wireless networks. Finally, the possibilities of performing joint source and network coding are explored to increase the robustness of high-volume video multicast applications. Extensive simulation studies indicate that the proposed algorithms significantly improve network throughput and video quality over related state-of-the-art solutions. Hence, it is envisaged that the proposed algorithms will contribute to the advancement of data and video multicast protocols in future communication networks.
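
    As a concrete illustration of the coefficient-vector overhead targeted by the first algorithm, here is a minimal random linear network coding sketch over GF(2) (the function name, packet sizes, and redundancy level are illustrative; the thesis's algorithms operate in real network stacks, not this toy):

```python
import os
import random

def rlnc_encode_gf2(packets, num_coded):
    """Random linear network coding over GF(2): each coded packet is the
    XOR of a random subset of source packets. The binary coefficient
    vector is prepended as per-packet overhead -- the overhead that a
    coefficient-compression scheme would aim to minimize."""
    n = len(packets)
    coded = []
    for _ in range(num_coded):
        coeffs = [random.randint(0, 1) for _ in range(n)]
        if not any(coeffs):
            coeffs[random.randrange(n)] = 1      # avoid the all-zero vector
        payload = bytes(len(packets[0]))         # start from zero bytes
        for c, p in zip(coeffs, packets):
            if c:
                payload = bytes(a ^ b for a, b in zip(payload, p))
        coded.append((coeffs, payload))
    return coded

# Usage: 4 source packets of 8 bytes each, 6 coded packets (redundancy = 2,
# i.e. the receiver can lose up to 2 coded packets and often still decode).
source = [os.urandom(8) for _ in range(4)]
for coeffs, pkt in rlnc_encode_gf2(source, num_coded=6):
    print(coeffs, pkt.hex())
```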

    Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation

    The recent advances in deep learning have made it possible to generate photo-realistic images using neural networks and even to extrapolate video frames from an input video clip. In this paper, for the sake of both furthering this exploration and our own interest in a realistic application, we study image-to-video translation and particularly focus on videos of facial expressions. This problem challenges deep neural networks with an additional temporal dimension compared to image-to-image translation. Moreover, its single input image defeats most existing video generation methods that rely on recurrent models. We propose a user-controllable approach to generate video clips of various lengths from a single face image, where the lengths and types of the expressions are controlled by users. To this end, we design a novel neural network architecture that can incorporate the user input into its skip connections and propose several improvements to the adversarial training method for the neural network. Experiments and user studies verify the effectiveness of our approach. In particular, we would like to highlight that even for face images in the wild (downloaded from the Web and the authors' own photos), our model can generate high-quality facial expression videos of which about 50% are labeled as real by Amazon Mechanical Turk workers.
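
    A minimal sketch of the conditioning idea (layer shapes and the way the control vector is broadcast are assumptions, not the paper's exact architecture): the user's expression type and target clip length are embedded and injected into an encoder-decoder's skip connection, rather than only at the bottleneck.

```python
import torch
import torch.nn as nn

class ControllableSkipNet(nn.Module):
    """Toy encoder-decoder that concatenates a user control vector
    (expression type + normalized clip length) into its skip connection."""
    def __init__(self, num_expressions=6, ctrl_dim=16):
        super().__init__()
        self.expr_embed = nn.Embedding(num_expressions, ctrl_dim - 1)
        self.enc = nn.Conv2d(3, 32, 4, stride=2, padding=1)
        self.bottleneck = nn.Conv2d(32, 32, 4, stride=2, padding=1)
        self.up = nn.ConvTranspose2d(32, 32, 4, stride=2, padding=1)
        # skip features (32) + upsampled features (32) + control (ctrl_dim)
        self.dec = nn.ConvTranspose2d(32 + 32 + ctrl_dim, 3, 4,
                                      stride=2, padding=1)

    def forward(self, img, expr_id, length):
        ctrl = torch.cat([self.expr_embed(expr_id),
                          length.unsqueeze(1)], dim=1)     # (B, ctrl_dim)
        skip = torch.relu(self.enc(img))                   # (B, 32, H/2, W/2)
        h = torch.relu(self.up(torch.relu(self.bottleneck(skip))))
        # broadcast the control vector over the skip's spatial grid
        ctrl_map = ctrl[:, :, None, None].expand(-1, -1, *skip.shape[2:])
        return torch.tanh(self.dec(torch.cat([skip, h, ctrl_map], dim=1)))

net = ControllableSkipNet()
frame = net(torch.randn(2, 3, 64, 64),
            torch.tensor([1, 4]),          # expression type per sample
            torch.tensor([0.5, 1.0]))      # normalized clip length
print(frame.shape)  # torch.Size([2, 3, 64, 64])
```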

    Skeleton-aided Articulated Motion Generation

    This work makes the first attempt to generate an articulated human motion sequence from a single image. On the one hand, we utilize paired inputs, including human skeleton information as a motion embedding and a single human image as an appearance reference, to generate novel motion frames based on the conditional GAN infrastructure. On the other hand, a triplet loss is employed to pursue appearance smoothness between consecutive frames. As the proposed framework is capable of jointly exploiting the image appearance space and the articulated/kinematic motion space, it generates realistic articulated motion sequences, in contrast to most previous video generation methods, which yield blurred motion effects. We test our model on two human action datasets, KTH and Human3.6M, and the proposed framework generates very promising results on both.
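
    A minimal sketch of the appearance-smoothness term (the feature extractor and the negative-sampling strategy are assumptions; only the triplet structure, pulling consecutive frames together while pushing unrelated frames away, follows the abstract):

```python
import torch
import torch.nn as nn

# Hypothetical appearance embedding; any image encoder would serve here.
embed = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64),
)
triplet = nn.TripletMarginLoss(margin=1.0)

frames = torch.randn(8, 3, 64, 64)   # one generated motion sequence
other = torch.randn(8, 3, 64, 64)    # frames from a different sequence

anchor = embed(frames[:-1])          # frame t
positive = embed(frames[1:])         # consecutive frame t+1
negative = embed(other[:-1])         # unrelated appearance

# Added to the generator objective, this penalizes appearance jumps
# between consecutive frames, discouraging the blurred-motion artifacts
# of earlier video generation methods.
loss = triplet(anchor, positive, negative)
print(loss.item())
```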