193 research outputs found

    Singing voice resynthesis using concatenative-based techniques

    Doctoral thesis. Informatics Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Analysis on Using Synthesized Singing Techniques in Assistive Interfaces for Visually Impaired to Study Music

    Touch and hearing are the primary senses through which visually impaired people perceive the world, and their interaction with assistive technologies likewise relies mainly on tactile and auditory interfaces. This paper discusses the validity of using the most appropriate singing synthesis techniques as a mediator in assistive technologies built specifically to address their music learning needs involving music scores and lyrics. Music scores with notation and lyrics are considered the main mediators in the musical communication channel between a composer and a performer. Visually impaired music lovers have less opportunity to access this main mediator, since most scores are in visual format. In a music score, the vocal performer’s melody is married to all the pleasant sounds producible in the form of singing. Singing, as a format in the temporal domain, is a better fit than a tactile format in the spatial domain. Therefore, converting the existing visual format to a singing output will be the most appropriate non-lossy transition, as demonstrated by the initial research on an adaptive music score trainer for the visually impaired [1]. To extend this initial research, this study surveys existing singing synthesis techniques and research on auditory interfaces

    A review of differentiable digital signal processing for music and speech synthesis

    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research

    Singing voice resynthesis using concatenative-based techniques

    Dissertation submitted to the Faculdade de Engenharia da Universidade do Porto in partial fulfilment of the requirements for the degree of Doctor in Informatics Engineering. Singing has an important role in our life, and although synthesizers have been trying to replicate every musical instrument for decades, it was only during the last nine years that commercial singing synthesizers started to appear, making it possible to merge music and text, i.e., singing. These solutions may present realistic results in some situations, but they require time-consuming processes and experienced users. The goal of this research work is to develop, create or adapt techniques that allow the resynthesis of the singing voice, i.e., allow the user to directly control a singing voice synthesizer using his/her own voice. The synthesizer should be able to replicate, as closely as possible, the same melody, the same phonetic sequence, and the same musical performance. Initially, some work was developed trying to resynthesize piano recordings with evolutionary approaches, using Genetic Algorithms, where a population of individuals (candidate solutions) representing a sequence of music notes evolves over time to match an original audio stream. Later, the focus returned to the singing voice, exploring techniques such as Hidden Markov Models and Neural Network Self Organized Maps, among others. Finally, a Concatenative Unit Selection approach was chosen as the core of a singing voice resynthesis system. By extracting energy, pitch and phonetic information (MFCC, LPC), and using it within a phonetic similarity Viterbi-based Unit Selection System, a sequence of internal sound library frames is chosen to replicate the original audio performance. Although audio artifacts still exist, preventing its use in professional applications, the concept of a new audio tool was created that presents high potential for future work, not only in singing voice, but in other musical or speech domains. This dissertation had the kind support of FCT (Portuguese Foundation for Science and Technology, an agency of the Portuguese Ministry for Science, Technology and Higher Education) under grant SFRH / BD / 30300 / 2006, and has been articulated with research project PTDC/SAU-BEB/104995/2008 (Assistive Real-Time Technology in Singing) whose objectives include the development of interactive technologies helping the teaching and learning of singing
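
The Viterbi-based unit selection mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `select_units`, the use of plain feature tuples, the Euclidean target and join costs, and the `join_weight` parameter are all assumptions made for illustration. In a real system the feature vectors would presumably carry the energy, pitch and MFCC/LPC values the abstract lists.

```python
from typing import List, Sequence

Vector = Sequence[float]


def select_units(targets: List[Vector], library: List[Vector],
                 join_weight: float = 1.0) -> List[int]:
    """Return one library index per target frame, minimising the summed
    target cost plus weighted join cost with a Viterbi search.

    Both costs are Euclidean distances here, which is an assumption of
    this sketch rather than something stated in the source.
    """
    if not targets or not library:
        return []

    def dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n_units = len(library)
    # cost[j]: cheapest path so far that ends in library unit j
    cost = [dist(targets[0], unit) for unit in library]
    back: List[List[int]] = []  # back[t][j]: best predecessor of unit j

    for target in targets[1:]:
        new_cost, pointers = [], []
        for unit in library:
            # Cheapest predecessor, counting the cost of joining it to `unit`
            best = min(
                range(n_units),
                key=lambda i: cost[i] + join_weight * dist(library[i], unit),
            )
            pointers.append(best)
            new_cost.append(
                cost[best]
                + join_weight * dist(library[best], unit)
                + dist(target, unit)
            )
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final unit to recover the whole sequence
    j = min(range(n_units), key=lambda i: cost[i])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `join_weight=0.0` the search reduces to picking the nearest library frame for each target frame independently. Raising the weight favours smoother sequences, at the price of a looser match to the input.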

    Speech Synthesis Based on Hidden Markov Models
