
363 research outputs found

Constructing a Deep Neural Network based Spectral Model for Statistical Speech Synthesis

Author: G. E. Hinton
Geoffrey E. Hinton
H Kawahara
H Zen
Z-H Ling
Publication venue
Publication date: 17/06/2015
Field of study

Crossref

Edinburgh Research Explorer

A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis

Author: Takaki Shinji
Yamagishi Junichi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2016
Field of study

Edinburgh Research Explorer

A Function-wise Pre-training Technique for Constructing a Deep Neural Network based Spectral Model in Statistical Parametric Speech Synthesis

Author: Takaki Shinji
Wu Z.
Yamagishi Junichi
Publication venue
Publication date: 01/09/2015
Field of study

Edinburgh Research Explorer

Uncertainty quantification for physics-informed deep learning

Author: Brune Christoph
Guo Mengwu
Publication venue
Publication date: 01/11/2021
Field of study

University of Twente Research Information

Intelligent facial animation: Creating emphatic characters with stimuli based animation

Author: José Mário Figueiredo Serra
Publication venue
Publication date: 11/12/2017
Field of study

Repositório Aberto da Universidade do Porto

Deep Gaussian Processes: Advances in Models and Inference

Author: Salimbeni Hugh
Publication venue: Computing, Imperial College London
Publication date: 01/07/2020
Field of study

Hierarchical models are certainly in fashion these days. It seems difficult to navigate the field of machine learning without encountering 'deep' models of one sort or another. The popularity of the deep learning revolution has been driven by some striking empirical successes, prompting both intense rapture and intense criticism. The criticisms often centre around the lack of model uncertainty, leading to sometimes drastically overconfident predictions. Others point to the lack of a mechanism for incorporating prior knowledge, and the reliance on large datasets. A widely held hope is that a Bayesian approach might overcome these problems. The deep Gaussian process presents a paradigm for building deep models from a Bayesian perspective. A Gaussian process is a prior for functions. A deep Gaussian process uses several Gaussian process functions and combines them hierarchically through composition (that is, the output of one is the input to the next). The deep Gaussian process promises to capture the compositional nature of deep learning while mitigating some of the disadvantages through a Bayesian approach. The thesis develops deep Gaussian process modelling in a number of ways. The model is first interpreted differently from previous work, not as a 'hierarchical prior' but as a factorized prior with a hierarchical likelihood. Mean functions are suggested to avoid issues of degeneracy and to aid initialization. The main contribution is a new method of inference that avoids the burden of representing the function values directly through an application of sparse variational inference. This method scales to arbitrarily large data and is shown to work well in practice through experiments. The use of variational inference recasts (approximate) inference as optimization of Gaussian distributions. This optimization has an exploitable geometry via the natural gradient.
The natural gradient is shown to be advantageous for single-layer non-conjugate models, and for the (final layer of a) deep Gaussian process model. Deep Gaussian processes can be a model both for complex associations between variables and for complex marginal distributions of single variables. Incorporating noise in the hierarchy leads to complex marginal distributions through the non-linearities of the mappings at each layer. The inference required for noisy variables cannot be handled with sparse methods, as sparse methods rely on correlations between variables, which are absent for noisy variables. Instead, a more direct approach is developed, using an importance weighted variational scheme.
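The composition described in this abstract (the output of one Gaussian process is the input to the next) can be illustrated with a minimal sketch that draws one sample from a deep Gaussian process prior. This is not the thesis's inference method. It assumes zero-mean layers, a squared-exponential kernel, scalar inputs, and a small diagonal jitter for numerical stability, and the helper names (`rbf_kernel`, `cholesky`, `sample_gp_layer`, `sample_deep_gp`) are hypothetical.

```python
import math
import random


def rbf_kernel(xs, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance matrix for a list of scalar inputs."""
    return [[variance * math.exp(-0.5 * (a - b) ** 2 / lengthscale ** 2)
             for b in xs] for a in xs]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_gp_layer(xs, rng, jitter=1e-6):
    """Draw one zero-mean GP prior sample evaluated at the inputs xs."""
    n = len(xs)
    cov = rbf_kernel(xs)
    for i in range(n):
        cov[i][i] += jitter  # assumed jitter keeps the factorization stable
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # f = L z has covariance L L^T, i.e. the (jittered) kernel matrix.
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def sample_deep_gp(xs, num_layers, rng):
    """Compose GP samples: each layer's output is the next layer's input."""
    h = list(xs)
    for _ in range(num_layers):
        h = sample_gp_layer(h, rng)
    return h


rng = random.Random(0)
xs = [-3.0 + 6.0 * i / 19 for i in range(20)]
f = sample_deep_gp(xs, num_layers=2, rng=rng)
```

With zero-mean layers, stacking many layers tends to produce degenerate samples, which is the issue the abstract says mean functions are suggested to address.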

Spiral - Imperial College Digital Repository

Modeling variation of human motion

Author: Du Han
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022
Field of study

The synthesis of realistic human motion with large variations and different styles is of growing interest for simulation applications such as the game industry, psychological experiments, and ergonomic analysis. Data-driven motion synthesis approaches are powerful tools for producing high-fidelity character animations. With the development of motion capture technologies, more and more motion data are now publicly available. However, efficiently reusing a large amount of motion data to create new motions for arbitrary scenarios remains challenging, especially for unsupervised motion synthesis. Learning the distribution of motion data can provide a compact representation of motion variations and convert motion synthesis tasks into optimization problems. This thesis presents a series of works that analyze and model the variations of human motion data. The goal is to learn statistical generative models that can create any number of new human animations with rich variations and styles. In our motion synthesis framework, motion controllers use the statistical generative models to create new animations for different scenarios. The work of the thesis is presented in three main chapters. We first explore how variation is represented in motion data. Learning a compact latent space that can expressively contain motion variation is essential for modeling motion data. We propose a novel motion latent space learning approach that can intrinsically tackle the spatio-temporal properties of motion data. Secondly, we present our Morphable Graph framework for human motion modeling and synthesis in assembly workshop scenarios. A series of studies has been conducted to apply statistical motion modeling and synthesis approaches to complex assembly workshop use cases. Finally, we show how the style variations of human activities can be modeled with a limited number of examples. Natural human movements display a rich repertoire of styles and personalities. However, it is difficult to get enough examples for data-driven approaches. We propose a conditional variational autoencoder (CVAE) to combine large variations in the neutral motion database and style information from a limited number of examples. We show that our approach can generate any number of natural-looking variations of human motion with a style similar to the target.

Universaar


Expressive movement generation with machine learning

Author: Alemi Omid
Publication venue
Publication date: 25/03/2021
Field of study

Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing – developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute to the field with the datasets, tools, and libraries that we developed during our research. We start our research by reviewing the works on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive walking movement controller for agents, based on neural networks. The expressivity of virtual, animated agents plays an essential role in their believability. Therefore, WalkNet integrates control of the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation based on the valence and arousal levels of affect, the movement’s walking direction, and the mover’s movement signature in real-time.
Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes for a more challenging movement modelling problem. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation.

Simon Fraser University Institutional Repository