    Speech Synthesis Based on Hidden Markov Models

    PRESENCE: A human-inspired architecture for speech-based human-machine interaction

    Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction, driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially, and performance appears to be asymptotic to a level that may be inadequate for many real-world applications. This suggests that there may be a fundamental flaw in the underlying architecture of contemporary systems, as well as a failure to capitalize on the combinatorial properties of human spoken language. This paper addresses these issues and presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems. Called PRESENCE ("PREdictive SENsorimotor Control and Emulation"), this new architecture blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure. Cooperative and communicative behavior emerges as a by-product of an architecture that is founded on a model of interaction in which the system has in mind the needs and intentions of a user and a user has in mind the needs and intentions of the system.

    A Multi-scale View of the Emergent Complexity of Life: A Free-energy Proposal

    We review some of the main implications of the free-energy principle (FEP) for the study of the self-organization of living systems – and how the FEP can help us to understand (and model) biotic self-organization across the many temporal and spatial scales over which life exists. In order to maintain its integrity as a bounded system, any biological system - from single cells to complex organisms and societies - has to limit the disorder or dispersion (i.e., the long-run entropy) of its constituent states. We review how this can be achieved by living systems that minimize their variational free energy. Variational free energy is an information theoretic construct, originally introduced into theoretical neuroscience and biology to explain perception, action, and learning. It has since been extended to explain the evolution, development, form, and function of entire organisms, providing a principled model of biotic self-organization and autopoiesis. It has provided insights into biological systems across spatiotemporal scales, ranging from microscales (e.g., sub- and multicellular dynamics), to intermediate scales (e.g., groups of interacting animals and culture), through to macroscale phenomena (the evolution of entire species). A crucial corollary of the FEP is that an organism just is (i.e., embodies or entails) an implicit model of its environment. As such, organisms come to embody causal relationships of their ecological niche, which, in turn, is influenced by their resulting behaviors. Crucially, free-energy minimization can be shown to be equivalent to the maximization of Bayesian model evidence. This allows us to cast natural selection in terms of Bayesian model selection, providing a robust theoretical account of how organisms come to match or accommodate the spatiotemporal complexity of their surrounding niche. 
    In line with the theme of this volume, namely biological complexity and self-organization, this chapter will examine a variational approach to self-organization across multiple dynamical scales.
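
The link between free-energy minimization and Bayesian model evidence mentioned in this abstract can be illustrated with the standard variational identity. This is a sketch, not the chapter's own derivation, and the symbols are assumptions introduced here: $o$ for observations, $s$ for hidden states, $p(o, s)$ for the generative model, and $q(s)$ for an approximate posterior.

```latex
\begin{align}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] - \ln p(o) \\
     &\ge -\ln p(o)
\end{align}
```

Because the KL divergence is non-negative, $-F$ is a lower bound on the log evidence $\ln p(o)$, with equality when $q(s) = p(s \mid o)$. In this reading, minimizing $F$ tightens and raises that bound.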

    Latent tree models

    Latent tree models are graphical models defined on trees, in which only a subset of variables is observed. They were first discussed by Judea Pearl as tree-decomposable distributions to generalise star-decomposable distributions such as the latent class model. Latent tree models, or their submodels, are widely used in phylogenetic analysis, network tomography, computer vision, causal modeling, and data clustering. They also contain other well-known classes of models such as hidden Markov models, the Brownian motion tree model, the Ising model on a tree, and many popular models used in phylogenetics. This article offers a concise introduction to the theory of latent tree models. We emphasise the role of tree metrics in the structural description of this model class, in designing learning algorithms, and in understanding fundamental limits of what can be learned and when.
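
As a small illustration of the tree metrics this abstract refers to, the sketch below checks the classical four-point condition: a distance function is a tree metric exactly when, for every four points, the two largest of the three pairwise sums are equal. The function name `is_tree_metric`, the tolerance, and the example matrices are assumptions made for illustration, not anything taken from the article.

```python
from itertools import combinations


def is_tree_metric(d, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix d.

    Hypothetical helper for illustration: returns True when, for every
    quadruple of points, the two largest of the three pairwise sums agree
    (up to tol).
    """
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([
            d[i][j] + d[k][l],
            d[i][k] + d[j][l],
            d[i][l] + d[j][k],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Leaf distances on the quartet tree ((0,1),(2,3)) with all edge lengths 1.
tree_d = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

# Euclidean distances between the corners of a unit square (not a tree metric).
r2 = 2 ** 0.5
square_d = [
    [0, 1, r2, 1],
    [1, 0, 1, r2],
    [r2, 1, 0, 1],
    [1, r2, 1, 0],
]

print(is_tree_metric(tree_d))    # the quartet distances pass the check
print(is_tree_metric(square_d))  # the square distances fail it
```

The check is brute force over all quadruples, which is fine for small examples; structure-learning methods that rely on tree metrics typically work from such quartet tests or from distance-based reconstruction.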

    17 ways to say yes: Toward nuanced tone of voice in AAC and speech technology

    People with complex communication needs who use speech-generating devices have very little expressive control over their tone of voice. Despite its importance in human interaction, however, the issue of tone of voice remains all but absent from AAC research and development. In this paper, we describe three interdisciplinary projects, past, present and future: the critical design collection Six Speaking Chairs has provoked deeper discussion and inspired a social model of tone of voice; the speculative concept Speech Hedge illustrates challenges and opportunities in designing more expressive user interfaces; the pilot project Tonetable could enable participatory research and seed a research network around tone of voice. We speculate that more radical interactions might expand frontiers of AAC and disrupt speech technology as a whole.

    Symbol Emergence in Robotics: A Survey

    Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can communicate smoothly with human users over the long term are crucially important, and both require an understanding of the dynamics of symbol systems. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER. Comment: submitted to Advanced Robotics