
2 research outputs found

An agonist-antagonist pitch production model

Authors: A Cutler, B Schuller, D Piovesan, DA Kistemaker, H Fujisaki, IR Titze, J Santen van, L Beranek, R Plamondon, RE Thomas, S Prom-on, T Vogt, V Zatsiorsky
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/10/2016

Prosody is a phenomenon that is crucial for numerous fields of speech research, accenting the importance of having a robust prosody model. A class of intonation models based on the physiology of pitch production are especially attractive for their inherent multilingual support. These models rely on an accurate model of muscle activation. Traditionally they have used the 2nd order spring-damper-mass (SDM) muscle model. However, recent research has shown that the SDM model is not sufficient for adequate modelling of the muscle dynamics. The 3rd order Hill type model offers a more accurate representation of muscle dynamics, but it has been shown to be underdamped when using physiologically plausible muscle parameters. In this paper we propose an agonist-antagonist pitch production (A2P2) model that both validates and gives insight into the improved results of using higher-order critically damped system models in intonation modelling.

Infoscience - École polytechnique fédérale de Lausanne

Crossref
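
The abstract above contrasts underdamped and critically damped muscle models. As a rough illustration of those terms only, the following minimal sketch computes the damping ratio of a generic 2nd order spring-damper-mass system, m·x'' + c·x' + k·x = F. The function names and the numeric values in the usage note are hypothetical and are not taken from the paper; they are not physiological muscle parameters.

```python
import math


def damping_ratio(mass, damping, stiffness):
    """Damping ratio zeta = c / (2 * sqrt(k * m)) of m*x'' + c*x' + k*x = F."""
    return damping / (2.0 * math.sqrt(stiffness * mass))


def critical_damping(mass, stiffness):
    """Damping coefficient c that makes the system critically damped (zeta = 1)."""
    return 2.0 * math.sqrt(stiffness * mass)


def classify(zeta, tol=1e-9):
    """Name the damping regime for a given damping ratio."""
    if abs(zeta - 1.0) <= tol:
        return "critically damped"
    return "underdamped" if zeta < 1.0 else "overdamped"
```

For example, with the arbitrary values mass 1.0 and stiffness 1.0, a damping coefficient of 1.0 gives a ratio of 0.5 (underdamped, so the response oscillates), while the coefficient returned by `critical_damping(1.0, 1.0)` gives a ratio of 1.0.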

Intonation modelling using a muscle model and perceptually weighted matching pursuit

Authors: Garner Philip N., Gerazov Branislav, Gjoreski Aleksandar, Honnet Pierre-Edouard
Publication venue: 'Elsevier BV'
Publication date: 19/12/2017

We propose a physiologically based intonation model using perceptual relevance. Motivated by speech synthesis from a speech-to-speech translation (S2ST) point of view, we aim at a language-independent way of modelling intonation. The model presented in this paper can be seen as a generalisation of the command response (CR) model, albeit with the same modelling power. It is an additive model which decomposes intonation contours into a sum of critically damped system impulse responses. To decompose the intonation contour, we use a weighted correlation-based atom decomposition algorithm (WCAD) built around a matching pursuit framework. The algorithm allows for an arbitrary precision to be reached using an iterative procedure that adds more elementary atoms to the model. Experiments are presented demonstrating that this generalised CR (GCR) model is able to model intonation as would be expected. Experiments also show that the model produces a similar number of parameters or elements as the CR model. We conclude that the GCR model is appropriate as an engineering solution for modelling prosody, and hope that it is a contribution to a deeper scientific understanding of the neurobiological process of intonation.

Infoscience - École polytechnique fédérale de Lausanne
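
The second abstract describes decomposing a contour into a sum of critically damped impulse responses with a matching pursuit procedure. The sketch below shows plain, unweighted matching pursuit over such atoms; it is not the WCAD algorithm itself, since the perceptual weighting is omitted. It assumes atoms of the form t^(order-1)·exp(-t/theta), which is the impulse response shape of a system with `order` identical real poles. All function names, the dictionary layout, and the parameter values are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
import math


def gamma_atom(length, order, theta):
    """Unit-norm samples of t**(order - 1) * exp(-t / theta), t = 0..length-1."""
    h = [t ** (order - 1) * math.exp(-t / theta) for t in range(length)]
    norm = math.sqrt(sum(v * v for v in h))
    return [v / norm for v in h]


def build_dictionary(n, order, thetas):
    """Shift every atom shape to every onset inside a frame of n samples."""
    atoms, params = [], []
    for theta in thetas:
        base = gamma_atom(n, order, theta)
        for onset in range(n):
            atom = [0.0] * onset + base[: n - onset]
            norm = math.sqrt(sum(v * v for v in atom))
            if norm == 0.0:
                continue  # nothing of the atom fits inside the frame
            atoms.append([v / norm for v in atom])  # renormalise after truncation
            params.append((theta, onset))
    return atoms, params


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, atoms, n_atoms):
    """Greedily pick the atom most correlated with the residual, n_atoms times."""
    residual = [float(v) for v in signal]
    picks = []
    for _ in range(n_atoms):
        corrs = [dot(atom, residual) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(corrs[k]))
        gain = corrs[best]
        picks.append((best, gain))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - gain * a for r, a in zip(residual, atoms[best])]
    return picks, residual
```

Each returned pick is an (atom index, amplitude) pair, and `params[index]` gives the atom's time constant and onset. Running more iterations adds more atoms and can only shrink the residual, which mirrors the abstract's remark that precision is refined iteratively.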