Visual Speech Synthesis Using Photorealistic Images

Abstract

With advances in multimedia technology, the synthesis of natural video scenes has become an important research area, motivated by the need to reduce the bit rate on specific channels. In this paper we propose two methods to obtain and use a codebook acquired from a training video; the codebook is later used to generate visual speech scenes with a natural appearance. Two vector quantizers are explored: the LBG (Linde, Buzo, Gray) algorithm with a modified error criterion, and the SOFM (Self-Organizing Feature Map) algorithm. Both operate in a parametric domain. The parameters are obtained by applying a multiresolution transform to each frame of the video, thereby preserving both frequency and spatial information.

Index Terms: visual speech, vector quantizer, self-organizing feature maps, minimax criterion
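
To illustrate the codebook idea, the following is a minimal sketch of the standard LBG algorithm (codeword splitting followed by nearest-neighbour and centroid updates). It is an assumption-laden illustration, not the method of the paper: it uses the plain squared-error criterion rather than the modified (minimax) criterion named above, and it treats each training vector as an already-computed parameter vector, so the multiresolution transform is not shown. The names `lbg`, `nearest`, `centroid`, `eps`, and `iters` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, v):
    """Index of the codeword closest to v (the quantizer's encoding step)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], v))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def lbg(training, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting until it has at least `size`
    codewords. The codebook doubles at each split, so `size` is assumed
    to be a power of two."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c + s for c in cw]
                    for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each training vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in training:
                cells[nearest(codebook, v)].append(v)
            # Move each codeword to the centroid of its cell;
            # a codeword with an empty cell is left where it is.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook
```

In a scheme like the one described, each frame's parameter vector would be replaced by the index returned by `nearest`, and a decoder holding the same codebook would look the codeword up to regenerate the frame; only the index would need to be transmitted.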
