5 research outputs found
Multilingual video dubbing—a technology review and current challenges
The proliferation of multilingual content on today’s streaming services has created a need for automated multilingual dubbing tools. In this article, current state-of-the-art approaches are discussed with reference to recent works in automatic dubbing and the closely related field of talking head generation. A taxonomy of papers within both fields is presented, and the main challenges of both speech-driven automatic dubbing and talking head generation are outlined, together with proposals for future research to tackle these issues.
Synthetic Speaking Children -- Why We Need Them and How to Make Them
Contemporary Human-Computer Interaction (HCI) research relies primarily on
neural network models for machine vision and speech understanding of a system
user. Such models require extensively annotated training datasets for optimal
performance, and when building interfaces for users from a vulnerable population
such as young children, GDPR introduces significant complexities in data
collection, management, and processing. Motivated by the training needs of an
Edge AI smart toy platform, this research explores the latest advances in
generative neural technologies and provides a working proof of concept of a
controllable data generation pipeline for speech-driven facial training data at
scale. In this context, we demonstrate how StyleGAN2 can be fine-tuned to create
a gender-balanced dataset of children's faces. This dataset includes a variety
of controllable factors such as facial expressions, age variations, facial
poses, and even speech-driven animations with realistic lip synchronization. By
combining generative text-to-speech models for child voice synthesis and a 3D
landmark-based talking heads pipeline, we can generate highly realistic,
entirely synthetic, talking child video clips. These video clips can provide
valuable and controllable synthetic training data for neural network models,
bridging the gap when real data is scarce or restricted due to privacy
regulations.
Comment: Presented at SpeD 2
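As a rough illustration of what "gender-balanced" and "controllable factors" could mean in practice, the sketch below enumerates every combination of a few factors so that each value appears equally often. It only plans generation requests and does not call StyleGAN2 or any speech model. The factor names and values (`FACTORS`, `balanced_specs`, `count_by`) are assumptions made up for this example and are not taken from the paper.

```python
import itertools

# Hypothetical factor values, chosen only for illustration.
FACTORS = {
    "gender": ["female", "male"],
    "expression": ["neutral", "happy", "surprised"],
    "age_group": ["younger", "older"],
    "head_pose": ["frontal", "left", "right"],
}


def balanced_specs(factors):
    """Return one generation spec per combination of factor values.

    Taking the full Cartesian product guarantees that every value of
    every factor appears in the same number of specs.
    """
    names = list(factors)
    value_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def count_by(specs, name):
    """Count how many specs use each value of the factor `name`."""
    counts = {}
    for spec in specs:
        counts[spec[name]] = counts.get(spec[name], 0) + 1
    return counts


specs = balanced_specs(FACTORS)
gender_counts = count_by(specs, "gender")
```

A full grid is the simplest way to obtain balance; with many factors it grows quickly, and a stratified random sample would be a reasonable alternative.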
Corporate Governance in Emerging Economies: The Case of Romania
In Romania, corporate governance began to emerge in the early 2000s.
The delay can be explained by the difficult steps taken
toward political, legal, economic, and social reform. In recent years,
however, the corporate governance environment in Romania has changed.
Transparency and accountability have become key factors not only for
shareholders, but also for investors, buyers, suppliers, and other
stakeholders.
In this context, it is worth considering, based on statistical data, the
degree of development of corporate governance in Romania. The selected
indicators are linked to attributes of the Board of Directors, in particular
Board structure, size, independence, frequency of meetings, and other
factors. The sources are official data published by
companies listed on the Bucharest Stock Exchange (BSE). The results will
be compared with the results of other case studies of emerging economies and
with European best practice.
Corporate Governance in Emerging Economies: The Case of Romania
In Romania, corporate governance began to emerge in the early 2000s. The delay can be explained by the difficult steps taken toward political, legal, economic, and social reform. In recent years, however, the corporate governance environment in Romania has changed. Transparency and accountability have become key factors not only for shareholders, but also for investors, buyers, suppliers, and other stakeholders. In this context, it is worth considering, based on statistical data, the degree of development of corporate governance in Romania. The selected indicators are linked to attributes of the Board of Directors, in particular Board structure, size, independence, frequency of meetings, and other factors. The sources are official data published by companies listed on the Bucharest Stock Exchange (BSE). The results will be compared with the results of other case studies of emerging economies and with European best practice.
Keywords: corporate governance; emerging economies; Romania; disclosure; corporate governance indicators.
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Taking inspiration from recent developments in visual generative tasks using
diffusion models, we propose a method for end-to-end speech-driven video
editing using a denoising diffusion model. Given a video of a talking person,
and a separate auditory speech recording, the lip and jaw motions are
re-synchronized without relying on intermediate structural representations such
as facial landmarks or a 3D face model. We show this is possible by
conditioning a denoising diffusion model on audio mel spectral features to
generate synchronised facial motion. Proof of concept results are demonstrated
on both single-speaker and multi-speaker video editing, providing a baseline
model on the CREMA-D audiovisual data set. To the best of our knowledge, this
is the first work to demonstrate and validate the feasibility of applying
end-to-end denoising diffusion models to the task of audio-driven video
editing.
Comment: 8 pages; code and project page available here:
https://danbigioi.github.io/DiffusionVideoEditing
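To make the idea of an audio-conditioned denoising diffusion model more concrete, here is a minimal sketch of the standard forward noising step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, and of how a training example might pair a noisy frame with audio features. It uses plain Python lists in place of image tensors. The linear schedule, its default values, and the names `linear_beta_schedule`, `cumulative_alphas`, `add_noise`, and `training_example` are assumptions for illustration; the abstract does not specify the schedule, the network, or how the audio conditioning is injected.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return linearly spaced noise variances, one per diffusion step."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) up to step t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample a noisy version of x0 at step t; return it with the noise used."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return xt, noise


def training_example(frame, mel_features, t, alpha_bars, rng):
    """Bundle what a noise-predicting denoiser would see and its target.

    Assumption: the denoiser receives the noisy frame, the step index, and
    the audio features, and is trained to predict the added noise.
    """
    noisy_frame, noise = add_noise(frame, t, alpha_bars, rng)
    return {
        "noisy_frame": noisy_frame,
        "timestep": t,
        "audio": list(mel_features),
        "target_noise": noise,
    }


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
example = training_example([0.5, -0.5], [0.1, 0.2, 0.3], 10, alpha_bars, random.Random(0))
```

In a real system the frame and mel features would be tensors and the denoiser a neural network; the arithmetic of the noising step stays the same.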