7 research outputs found

    Speech synchronized 2D facial animation based on phonetic context dependent viseme images

    Get PDF
    Advisor: José Mario De Martino. Master's dissertation, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Speech-synchronized facial animation allows the implementation of talking heads that can potentially make human-computer interfaces more efficient and attractive. This work presents an image-based 2D facial animation synthesis method whose development was guided by two main goals: the realistic reproduction of visible speech articulatory movements, including coarticulation effects, and the possibility of implementing the method even on platforms with limited processing power and memory, such as mobile phones and personal digital assistants. The method is based on an image database of Brazilian Portuguese context-dependent visemes and uses morphing between visemes as the facial animation synthesis technique. The proposed approach represents an alternative and innovative synthesis strategy, capable of reproducing the visible speech articulatory movements, including coarticulation effects, from a database of just 34 images. The work includes the implementation of a pilot system integrated with a text-to-speech synthesizer. Additionally, the proposed synthesis method is evaluated through a speech intelligibility test. The results indicate that the visual information provided by the animations generated by the system contributes to speech intelligibility when the audio is degraded by noise. Although this work is restricted to Brazilian Portuguese, the presented solution is applicable to other languages.
    Keywords: Computer Graphics, Facial Animation, Visemes, Coarticulation, Morphing.
    Master's program in Computer Engineering (degree: Mestre em Engenharia Elétrica).
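    To make the synthesis strategy concrete, the sketch below shows how a timed phonetic transcription could drive the selection of context-dependent viseme images for keyframes. It is a minimal illustration under assumed names and formats (the VISEME_TABLE mapping, the Segment record, a 30 fps output rate), not the dissertation's actual implementation; the real system uses the 34-image viseme base and image morphing for the in-between frames.

```python
# Hypothetical sketch: scheduling viseme keyframes from a timed phonetic
# transcription. The table contents, Segment format, and 30 fps rate are
# illustrative assumptions, not the dissertation's implementation.
from dataclasses import dataclass

FPS = 30  # assumed output frame rate

# Context-dependent viseme lookup: (previous phone, current phone) -> image.
# The actual database holds 34 such images for Brazilian Portuguese.
VISEME_TABLE = {
    (None, "p"): "viseme_bilabial.png",
    ("a", "p"): "viseme_bilabial_after_open_vowel.png",
    ("p", "a"): "viseme_open_vowel_after_bilabial.png",
}

@dataclass
class Segment:
    phone: str    # phonetic symbol from the text-to-speech front end
    start: float  # seconds
    end: float    # seconds

def keyframes(transcription: list[Segment]) -> list[tuple[int, str]]:
    """Map each timed phone to a (frame index, viseme image) keyframe."""
    result = []
    prev = None
    for seg in transcription:
        image = VISEME_TABLE.get((prev, seg.phone),
                                 VISEME_TABLE.get((None, seg.phone)))
        if image is not None:
            # Anchor the viseme at the segment midpoint (a common heuristic).
            midpoint = (seg.start + seg.end) / 2.0
            result.append((round(midpoint * FPS), image))
        prev = seg.phone
    return result

# In-between frames would then be produced by morphing between the images of
# consecutive keyframes, which is what yields the smooth visible articulation.
```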

    2D animation of expressive speech

    No full text
    Advisor: José Mario De Martino. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Facial animation technology faces an increasing demand for applications involving virtual assistants, salespeople, tutors, and newscasters; lifelike game characters; social agents; and tools for scientific experiments in psychology and the behavioral sciences. A relevant and challenging aspect of the development of talking heads is the realistic reproduction of speech articulatory movements combined with the elements of non-verbal communication and the expression of emotions. This work presents an image-based, or 2D, facial animation synthesis methodology that allows the reproduction of a wide range of expressive speech emotional states and also supports the modulation of head movements and the control of facial elements such as eye blinks and eyebrow raises. The synthesis uses a database of prototype images that are combined to produce animation keyframes. The weights used for combining the prototype images are derived from a statistical active appearance model (AAM) built from a set of sample images extracted from an audio-visual corpus of a real face. The generation of the keyframes is driven by the timed phonetic transcription of the speech to be animated and by the desired emotional state. The keyposes consist of expressive context-dependent visemes that implicitly model speech coarticulation effects. The transition between adjacent keyposes is performed by a non-linear image morphing algorithm. The synthesized animations were assessed through a perceptual evaluation based on the recognition of emotions. The contributions of this work also include the construction of a database of expressive speech video and motion-capture data for Brazilian Portuguese.
    Doctoral program in Computer Engineering (degree: Doutora em Engenharia Elétrica).
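    The keyframe-generation step described above lends itself to a small illustration: blending registered prototype images with model-derived weights. The sketch assumes the prototypes are equally sized image arrays and that the weights have already been produced by an appearance-model fit; the function name and the normalization step are illustrative, not the thesis's actual pipeline.

```python
# Hypothetical sketch: blending prototype images with model-derived weights.
import numpy as np

def blend_prototypes(prototypes: list[np.ndarray],
                     weights: list[float]) -> np.ndarray:
    """Linear combination of registered prototype images.

    In the methodology described above, the weights would come from a
    statistical appearance/shape model fitted to the target keypose; here
    they are simply assumed to be given, and are normalized to sum to 1.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    stack = np.stack([p.astype(np.float64) for p in prototypes])
    blended = np.tensordot(w, stack, axes=1)  # weighted sum over prototypes
    return np.clip(blended, 0, 255).astype(np.uint8)

# Toy usage with 2x2 "images": 0.25 * 100 + 0.75 * 200 = 175 everywhere.
a = np.full((2, 2), 100, dtype=np.uint8)
b = np.full((2, 2), 200, dtype=np.uint8)
print(blend_prototypes([a, b], [0.25, 0.75]))
```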

    Computer facial animation synthesis system based on image manipulation

    No full text
    Computer facial animation synthesis system based on image manipulation. The invention relates to a video-realistic, speech-synchronized computer facial animation synthesis system. In this system, the facial animation is generated through the selection, manipulation, and presentation of a reduced set of images, representing context-dependent visemes, that reproduces the articulatory movements visible on the face during speech production, including coarticulation effects. The adopted approach makes it possible to generate animations with a high degree of video realism even on platforms with limited storage capacity, such as mobile phones. A video-realistic animation is understood as an animation that can be mistaken for real video.
    Patent publication: BRPI0903935 (A2); application: BR2009PI03935; IPC: G06T17/00.

    Indoor Temperatures in the 2018 Heat Wave in Quebec, Canada: Exploratory Study Using Ecobee Smart Thermostats

    No full text
    Background: Climate change, driven by human activity, is rapidly changing our environment and posing an increased risk to human health. Local governments must adapt their cities, prepare for increased periods of extreme heat, and ensure that marginalized populations do not suffer detrimental health outcomes. Heat warnings traditionally rely on outdoor temperature data, which may not reflect the indoor temperatures experienced by individuals. Smart thermostats could be a novel and highly scalable data source for heat wave monitoring. Objective: The objective of this study was to explore whether smart thermostats can be used to measure indoor temperature during a heat wave and to identify houses experiencing indoor temperatures above 26°C. Methods: We used secondary data: indoor temperature data recorded by ecobee smart thermostats in 768 Quebec households during the 2018 Quebec heat waves, which claimed 66 lives, and outdoor temperature data from Environment Canada weather stations. We performed descriptive statistical analyses to compare indoor temperature differences between air-conditioned and non-air-conditioned houses in Montreal, Gatineau, and surrounding areas from June 1 to August 31, 2018. Results: There were significant differences in indoor temperature between houses with and without air conditioning on both heat wave and non-heat wave days (P<.001). Households without air conditioning consistently recorded daily temperatures above common indoor temperature standards. High indoor temperatures persisted for an average of 4 hours per day in non-air-conditioned houses. Conclusions: Our findings were consistent with the current literature on building warming and heat retention during heat waves, which contribute to an increased risk of heat-related illnesses. Indoor temperatures can be captured continuously using smart thermostats across a large population. When integrated with local heat health action plans, these data could be used to strengthen existing heat alert and response systems and enhance emergency medical service responses.
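    The "average of 4 hours per day" figure suggests a simple computation that can be sketched: counting, per household and day, the thermostat readings above the 26°C threshold and converting them to hours. The column names and 5-minute sampling interval below are assumptions for illustration, not the study's actual data schema.

```python
# Hypothetical sketch: daily hours above the 26 C indoor threshold,
# computed from smart-thermostat readings. Column names and the assumed
# 5-minute sampling interval are illustrative, not the study's schema.
import pandas as pd

def hours_above_threshold(readings: pd.DataFrame,
                          threshold_c: float = 26.0,
                          sample_minutes: float = 5.0) -> pd.Series:
    """readings: columns ['household', 'timestamp', 'indoor_temp_c'].

    Returns hours per (household, date) spent above the threshold.
    """
    df = readings.copy()
    df["date"] = pd.to_datetime(df["timestamp"]).dt.date
    hot = df[df["indoor_temp_c"] > threshold_c]
    # Each reading stands for one sampling interval of heat exposure.
    counts = hot.groupby(["household", "date"]).size()
    return counts * (sample_minutes / 60.0)

# Usage: hours = hours_above_threshold(df)
#        hours.groupby("household").mean()  # average daily exposure
```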

    Analysis of Facial Expressions in Brazilian Sign Language (Libras)

    Get PDF
    Brazilian Sign Language (in Portuguese, Libras) is a visuospatial linguistic system adopted by the Brazilian deaf communities as their primary form of communication. Libras is a minority-group language, so its research and the production of its teaching materials do not receive the same incentive to progress or improve as those of oral languages. This complex language employs signs composed of hand shapes and movements combined with facial expressions and body postures. Facial expressions rarely appear in the sign language literature, despite being essential to this form of communication. This research therefore aims to present and discuss subcategories of the grammatical facial expressions of Libras, with two specific objectives: (1) the building of an annotated video corpus covering all the categories of facial expressions in Brazilian Sign Language identified in the literature; and (2) the application of the Facial Action Coding System (FACS), which originated as an experimental model in psychology, as a tool for annotating facial expressions in sign language. Guided by a qualitative approach, the video corpus was built with nineteen Libras users (sixteen deaf and three hearing participants) who translated forty-three sentences from Portuguese to Libras. The recordings were later transcribed with the EUDICO Linguistic Annotator (ELAN) software tool. From the analysis of the literature review, we observed the need to classify facial expressions into lexical subcategories, such as intensity, homonyms, and norm. We believe it is necessary to expand the studies on facial expressions, favoring their documentation and the description of their linguistic functions. Advances in this direction can contribute to the learning of Libras by deaf students and also by hearing people who intend to act as teachers or as translators and interpreters of this language system.
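    One way to picture the FACS-based annotation described above is as a tier of timed records attached to each signed utterance. The sketch below is a hypothetical layout: the AU codes are standard FACS action units (e.g., AU1 and AU2 are brow raisers), but the field names and tier structure are assumptions, not the study's actual ELAN schema.

```python
# Hypothetical sketch: a FACS-style annotation record for a signed utterance.
# Field names and tier layout are illustrative, not the study's ELAN schema.
from dataclasses import dataclass, field

@dataclass
class FacsAnnotation:
    start_ms: int             # interval start within the video
    end_ms: int               # interval end
    action_units: list[str]   # e.g. ["AU1", "AU2"] (inner/outer brow raiser)
    function: str             # linguistic function, e.g. "intensity"

@dataclass
class UtteranceTier:
    signer_id: str
    sentence: str             # the Portuguese source sentence
    annotations: list[FacsAnnotation] = field(default_factory=list)

tier = UtteranceTier(
    signer_id="P03",
    sentence="Exemplo de frase traduzida para Libras.",
    annotations=[FacsAnnotation(1200, 1800, ["AU1", "AU2"], "intensity")],
)
```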

    Signing avatars: making education more inclusive

    No full text
    In Brazil, there are approximately 9.7 million inhabitants who are deaf or hard of hearing. Moreover, about 30% of the Brazilian deaf community is illiterate in Brazilian Portuguese due to the difficulty of offering deaf children an inclusive environment based on bilingual education. Currently, the prevailing teaching practice depends heavily on verbal language and on written material, making the inclusion of the deaf a challenging task. This paper presents the authors' approach to tackling this problem and improving deaf students' access to written material in order to help them master Brazilian Portuguese as a second language. We describe an ongoing project aimed at developing an automatic Brazilian Portuguese-to-Libras translation system that presents the translated content via an animated virtual human, or avatar. The paper describes the methodology adopted to compile a source-language corpus with the deaf student's needs in central focus. It also describes the construction of a parallel Brazilian Portuguese/Brazilian Sign Language (Libras) corpus based on motion capture technology. The envisioned translation architecture includes the definition of an intermediate language to drive the signing avatar. The results of a preliminary assessment of sign intelligibility highlight the application's potential.
    Funding: CNPq (grant 458691/2013-5) and CAPES (grant 88887.091672/2014-).
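    The envisioned architecture, written Portuguese to an intermediate language to avatar animation, can be sketched as a two-stage pipeline. Both stages below are illustrative stubs under assumed names; the paper defines the architecture, while the actual translation and animation components are far more involved.

```python
# Hypothetical sketch of the envisioned translation pipeline:
# written Portuguese -> intermediate representation -> avatar commands.
# All names and mappings here are illustrative stubs.

def translate_to_intermediate(portuguese: str) -> list[str]:
    """Stand-in for the Portuguese-to-Libras translation stage.

    A real system would emit an intermediate-language description of the
    sign sequence; here a tiny toy lexicon produces a gloss-like list.
    """
    gloss = {"eu": "EU", "estudar": "ESTUDAR", "escola": "ESCOLA"}
    return [gloss.get(word.strip(".,").lower(), word.upper())
            for word in portuguese.split()]

def drive_avatar(intermediate: list[str]) -> None:
    """Stand-in for the animation stage: one motion-capture clip per sign."""
    for sign in intermediate:
        print(f"play clip for sign: {sign}")

drive_avatar(translate_to_intermediate("Eu estudar escola."))
```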