On the Use of Automatically Generated Discourse-Level Information in a Concept-to-Speech Synthesis System
This paper describes the latest version of the SOLE concept-to-speech system, which uses linguistic information provided by a natural language generation system to improve the prosody of synthetic speech. We discuss the types of linguistic information that prove most useful and the implications for text-to-speech systems.
Speech synthesis, Speech simulation and speech science
Speech synthesis research has been transformed in recent years through the exploitation of speech corpora - both for statistical modelling and as a source of signals for concatenative synthesis. This revolution in methodology and the new techniques it brings call into question the received wisdom that better computer voice output will come from a better understanding of how humans produce speech. This paper discusses the relationship between this new technology of simulated speech and the traditional aims of speech science. The paper suggests that the goal of speech simulation frees engineers from inadequate linguistic and physiological descriptions of speech. But at the same time, it leaves speech scientists free to return to their proper goal of building a computational model of human speech production.
Developing an enriched natural language grammar for prosodically-improved concept-to-speech synthesis
The need for interacting with machines using spoken natural language is growing,
along with the expectation that synthetic speech in this context sound
natural. Such interaction includes answering questions, where prosody plays an
important role in producing natural English synthetic speech by communicating
the information structure of utterances.
CCG is a theoretical framework that exploits the notion that, in English, information
structure, prosodic structure and syntactic structure are isomorphic.
This provides a way to convert a semantic representation of an utterance into
a prosodically natural spoken utterance. GF is a framework for writing grammars,
where abstract tree structures capture the semantic structure and concrete
grammars render these structures in linearised strings. This research combines
these frameworks to develop a system that converts semantic representations
of utterances into linearised strings of natural language that are marked up to
inform the prosody-generating component of a speech synthesis system.
Computing. M. Sc. (Computing