
    A review of differentiable digital signal processing for music and speech synthesis

    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.
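
    As a minimal sketch of the core technique (not code from the article; the function and parameter names are illustrative), the following PyTorch snippet implements a sinusoidal oscillator with differentiable tensor operations and fits one of its control parameters by backpropagating a waveform loss through the synthesiser:

```python
import torch

def sine_oscillator(f0, amplitude, sample_rate=16000):
    # Integrate instantaneous frequency into phase; cumsum is differentiable,
    # so a loss on the rendered audio backpropagates into f0 and amplitude.
    phase = 2 * torch.pi * torch.cumsum(f0 / sample_rate, dim=-1)
    return amplitude * torch.sin(phase)

n = 16000                                        # one second at 16 kHz
f0 = torch.full((n,), 440.0)                     # fixed 440 Hz pitch track
target = sine_oscillator(f0, torch.tensor(0.5))  # target tone at gain 0.5

gain = torch.tensor(1.0, requires_grad=True)     # learnable synth parameter
opt = torch.optim.Adam([gain], lr=0.01)
for _ in range(200):
    opt.zero_grad()
    loss = torch.mean((sine_oscillator(f0, gain) - target) ** 2)
    loss.backward()                              # gradients flow through the DSP op
    opt.step()
```

    Fitting the gain this way converges easily; fitting the frequency envelope with the same point-wise loss runs into exactly the optimisation pathologies the survey highlights, which is one reason multi-resolution spectral losses are common in DDSP practice.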

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    AI: Limits and Prospects of Artificial Intelligence

    The emergence of artificial intelligence has triggered enthusiasm and promise of boundless opportunities as much as uncertainty about its limits. The contributions to this volume explore the limits of AI, describe the necessary conditions for its functionality, reveal its attendant technical and social problems, and present some existing and potential solutions. At the same time, the contributors highlight the attendant societal and economic hopes and fears, utopias and dystopias, that are associated with the current and future development of artificial intelligence.

    Cross-utterance Conditioned Coherent Speech Editing

    Text-based speech editing systems enable users to modify speech by editing its transcript. Without exception, existing state-of-the-art neural editing systems perform only partial inference: they generate only the new words that need to be replaced or inserted. This usually leaves the prosody of the edited segment inconsistent with the surrounding speech and fails to handle changes in intonation. To address these problems, we propose a cross-utterance conditioned coherent speech editing system, the first to reason over the entire utterance at inference time. The proposed system generates speech by utilizing speaker information, context, acoustic features, and the mel-spectrogram of the original audio. Experiments on subjective and objective metrics demonstrate that our approach outperforms the baseline across various editing operations in terms of naturalness and prosody consistency.
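
    A hypothetical sketch of what this whole-utterance inference could look like (the model interface and all names here are assumptions for illustration, not the paper's code): the edited region of the original mel-spectrogram is masked, and the model regenerates the complete utterance conditioned on the speaker, the cross-utterance context, and the surrounding acoustics.

```python
import torch

def edit_speech(model, mel, edit_start, edit_end, new_text_ids,
                speaker_emb, context_emb):
    """mel: (frames, n_mels) mel-spectrogram of the original utterance;
    edit_start/edit_end: frame bounds of the region being replaced."""
    masked = mel.clone()
    masked[edit_start:edit_end] = 0.0  # hide the region to be regenerated
    # Reasoning over the whole utterance lets the regenerated region inherit
    # prosody from the unmasked frames and the cross-utterance context.
    return model(masked_mel=masked, text=new_text_ids,
                 speaker=speaker_emb, context=context_emb)
```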

    Pronunciation Ambiguities in Japanese Kanji

    Japanese writing is a complex system, and a large part of the complexity resides in the use of kanji. A single kanji character in modern Japanese may have multiple pronunciations, either as native vocabulary or as words borrowed from Chinese. This causes a problem for text-to-speech (TTS) synthesis, because the system must predict which pronunciation of each kanji character is appropriate in context; the problem is called homograph disambiguation. To address it, this research provides a new annotated data set of Japanese single-kanji character pronunciations and describes an experiment using a logistic regression (LR) classifier. A baseline is computed for comparison with the LR classifier's accuracy; the LR classifier improves modeling performance by 16%. This is the first experimental study of Japanese single-kanji homograph disambiguation. The annotated Japanese data is freely released to the public to support further work.
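
    A hedged sketch of this kind of experiment in scikit-learn (the toy data, reading labels, and character n-gram features are illustrative assumptions; the paper's data set and feature design may differ):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy examples: sentences containing the ambiguous kanji 日 and the reading
# it takes in each ("hi" vs. "nichi").
sentences = ["日が昇る", "日曜日に会う", "毎日勉強する", "日の光"]
readings  = ["hi",      "nichi",       "nichi",        "hi"]

clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 2)),  # character n-grams
    LogisticRegression(max_iter=1000),
)
clf.fit(sentences, readings)
print(clf.predict(["日が沈む"]))  # likely "hi", given the shared context 日が
```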

    Teachers' Perspectives On School Bullying: A Phenomenological Qualitative Study

    The purpose of this phenomenological qualitative study was to describe K-12 teachers' perceptions of school bullying, prevention, and effective coping approaches in the Southeastern region. The central phenomenon of the study was K-12 teachers' perspective on bullying. The theory guiding the study was social cognitive theory, as it relates to observing individuals through the lens of lived experiences, social interactions, modeling, and self-efficacy. The study followed a qualitative phenomenological design, which suited this purpose because it enabled stakeholders such as teachers to address the effects (social-emotional, verbal/physical, self-esteem, and poor academic performance) that bullying has had on victims and on those who bully. The study was conducted with 12 participants, each with at least 3 years of teaching experience in grades K-12. Data were triangulated across individual interviews, journal prompts, and letter-writing prompts. Three themes arose from the data analysis: professional development training, isolation, and children's mental health.

    Leveraging audio-visual speech effectively via deep learning

    The rising popularity of neural networks, combined with the recent proliferation of online audio-visual media, has led to a revolution in the way machines encode, recognize, and generate acoustic and visual speech. Despite the ubiquity of naturally paired audio-visual data, only a limited number of works have applied recent advances in deep learning to leverage the duality between audio and video within this domain. This thesis considers the use of neural networks to learn from large unlabelled datasets of audio-visual speech to enable new practical applications. We begin by training a visual speech encoder that predicts latent features extracted from the corresponding audio on a large unlabelled audio-visual corpus. We apply the trained visual encoder to improve performance on lip reading in real-world scenarios. Following this, we extend the idea of video learning from audio by training a model to synthesize raw speech directly from raw video, without the need for text transcriptions. Remarkably, we find that this framework is capable of reconstructing intelligible audio from videos of new, previously unseen speakers. We also experiment with a separate speech reconstruction framework, which leverages recent advances in sequence modeling and spectrogram inversion to improve the realism of the generated speech. We then apply our research in video-to-speech synthesis to advance the state-of-the-art in audio-visual speech enhancement, by proposing a new vocoder-based model that performs particularly well under extremely noisy scenarios. Lastly, we aim to fully realize the potential of paired audio-visual data by proposing two novel frameworks that leverage acoustic and visual speech to train two encoders that learn from each other simultaneously. We leverage these pre-trained encoders for deepfake detection, speech recognition, and lip reading, and find that they consistently yield improvements over training from scratch.
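
    The first stage could look roughly like the following PyTorch sketch, in which a visual encoder is trained to regress latent features extracted from the paired audio; the architecture, shapes, and names are placeholder assumptions, not the thesis's actual models:

```python
import torch
import torch.nn as nn

video_encoder = nn.Sequential(        # stand-in for a real lip-reading network
    nn.Flatten(start_dim=2),          # (batch, time, 64, 64) -> (batch, time, 4096)
    nn.Linear(64 * 64, 256), nn.ReLU(),
    nn.Linear(256, 128),              # predict 128-dim audio latents per frame
)
optimizer = torch.optim.Adam(video_encoder.parameters(), lr=1e-4)

def train_step(frames, audio_features):
    """frames: (batch, time, 64, 64) mouth crops;
    audio_features: (batch, time, 128) targets from a pretrained audio model."""
    pred = video_encoder(frames)
    loss = nn.functional.mse_loss(pred, audio_features)  # regress audio latents
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

    The natural temporal alignment of audio and video supplies the supervision for free, which is what makes large unlabelled corpora usable here.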

    Speech wave-form driven motion synthesis for embodied agents

    The main objective of this thesis is to synthesise motion from speech, especially in conversation. While previous research has investigated various acoustic features and combinations of them, no one has investigated estimating head motion directly from the waveform, the source from which all such features are derived. We therefore study the direct use of the speech waveform to generate head motion. We claim that learning a task-specific feature from the waveform leads to better overall performance than using standard acoustic features, while entirely avoiding the handcrafted feature extraction process. However, applying the raw waveform raises two problems: 1) high dimensionality, since the waveform has far more dimensions than common acoustic features, making model training more difficult; and 2) irrelevant information, since the waveform carries everything in the signal, which can burden neural network training. To resolve these problems, we apply a deep canonically correlated constrained auto-encoder (DCCCAE) to compress the waveform into low-dimensional embedded features that are highly correlated with head motion. The estimated head motion was evaluated both objectively and subjectively. The objective evaluation confirmed that DCCCAE produces a feature more correlated with head motion than a standard auto-encoder and popular spectral features such as MFCC and FBank, and that it can be used to achieve state-of-the-art results in predicting natural head motion. Beyond representation learning, we also explored an LSTM-based regression model for the proposed feature. The LSTM-based models boosted overall performance in the objective evaluation and adapted better to the proposed feature than to MFCC. MUSHRA-like subjective evaluation results suggest that participants preferred the animations generated by models using the proposed feature over the other models, and an A/B test further confirmed that the LSTM-based regression model adapts better to the proposed feature. Finally, we extended the architecture to estimate upper-body motion as well. We submitted our results to the GENEA 2020 challenge, where our model scored higher than the baseline (BA) in both human-likeness and appropriateness according to participants' preferences, suggesting that the highly correlated feature pair and the sequential estimation improved model generalisation.
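
    The DCCCAE objective can be pictured with the sketch below, which combines an auto-encoder reconstruction loss with a correlation term tying the waveform embedding to the head-motion stream. The full deep-CCA constraint involves whitening and an eigendecomposition; a simple per-dimension Pearson correlation stands in for it here, and all architecture details are assumptions, so this illustrates the idea rather than the thesis model.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(1024, 256), nn.Tanh(), nn.Linear(256, 32))
dec = nn.Sequential(nn.Linear(32, 256), nn.Tanh(), nn.Linear(256, 1024))
motion_proj = nn.Linear(6, 32)  # project 6-DoF head pose to the code size

def correlation(a, b, eps=1e-8):
    # Mean per-dimension Pearson correlation between two (batch, dim) tensors.
    a = (a - a.mean(0)) / (a.std(0) + eps)
    b = (b - b.mean(0)) / (b.std(0) + eps)
    return (a * b).mean()

def dcccae_loss(wave_frames, head_motion, alpha=0.5):
    """wave_frames: (batch, 1024) windows of raw waveform;
    head_motion: (batch, 6) synchronised head-pose parameters."""
    z = enc(wave_frames)                          # low-dimensional embedding
    recon = nn.functional.mse_loss(dec(z), wave_frames)
    corr = correlation(z, motion_proj(head_motion))
    return recon - alpha * corr                   # reconstruct AND correlate
```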