4 research outputs found

    36th International Symposium on Theoretical Aspects of Computer Science: STACS 2019, March 13-16, 2019, Berlin, Germany

    Get PDF

    Corpus-based unit selection for natural-sounding speech synthesis

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.Includes bibliographical references (p. 179-196).This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Speech synthesis is an automatic encoding process carried out by machine through which symbols conveying linguistic information are converted into an acoustic waveform. In the past decade or so, a recent trend toward a non-parametric, corpus-based approach has focused on using real human speech as source material for producing novel natural-sounding speech. This work proposes a communication-theoretic formulation in which unit selection is a noisy channel through which an input sequence of symbols passes and an output sequence, possibly corrupted due to the coverage limits of the corpus, emerges. The penalty of approximation is quantified by substitution and concatenation costs which grade what unit contexts are interchangeable and where concatenations are not perceivable. These costs are semi-automatically derived from data and are found to agree with acoustic-phonetic knowledge. The implementation is based on a finite-state transducer (FST) representation that has been successfully used in speech and language processing applications including speech recognition. A proposed constraint kernel topology connects all units in the corpus with associated substitution and concatenation costs and enables an efficient Viterbi search that operates with low latency and scales to large corpora. An A* search can be applied in a second, rescoring pass to incorporate finer acoustic modelling. Extensions to this FST-based search include hierarchical and paralinguistic modelling. The search can also be used in an iterative feedback loop to record new utterances to enhance corpus coverage. This speech synthesis framework has been deployed across various domains and languages in many voices, a testament to its flexibility and rapid prototyping capability.(cont.) Experimental subjects completing tasks in a given air travel planning scenario by interacting in real time with a spoken dialogue system over the telephone have found the system "easiest to understand" out of eight competing systems. In more detailed listening evaluations, subjective opinions garnered from human participants are found to be correlated with objective measures calculable by machine.by Jon Rong-Wei Yi.Ph.D

    Evolution from the ground up with Amee – From basic concepts to explorative modeling

    Get PDF
    Evolutionary theory has been the foundation of biological research for about a century now, yet over the past few decades, new discoveries and theoretical advances have rapidly transformed our understanding of the evolutionary process. Foremost among them are evolutionary developmental biology, epigenetic inheritance, and various forms of evolu- tionarily relevant phenotypic plasticity, as well as cultural evolution, which ultimately led to the conceptualization of an extended evolutionary synthesis. Starting from abstract principles rooted in complexity theory, this thesis aims to provide a unified conceptual understanding of any kind of evolution, biological or otherwise. This is used in the second part to develop Amee, an agent-based model that unifies development, niche construction, and phenotypic plasticity with natural selection based on a simulated ecology. Amee is implemented in Utopia, which allows performant, integrated implementation and simulation of arbitrary agent-based models. A phenomenological overview over Amee’s capabilities is provided, ranging from the evolution of ecospecies down to the evolution of metabolic networks and up to beyond-species-level biological organization, all of which emerges autonomously from the basic dynamics. The interaction of development, plasticity, and niche construction has been investigated, and it has been shown that while expected natural phenomena can, in principle, arise, the accessible simulation time and system size are too small to produce natural evo-devo phenomena and –structures. Amee thus can be used to simulate the evolution of a wide variety of processes
    corecore