Speech coding
Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage that the speech signal was corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. Digital transmission, on the other hand, is relatively immune to noise, cross-talk and distortion, primarily because the digital signal can be faithfully regenerated at each repeater purely on the basis of a binary decision. Hence the end-to-end performance of the digital link becomes essentially independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach. The need to carry digital speech also became extremely important from a service provision point of view. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters obtained by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized from the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding.
This term is more generic in the sense that the coding techniques are equally applicable to any voice signal, whether or not it carries intelligible information, as the term speech implies. Other commonly used terms are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or the storage requirements. In this document the terms speech and voice are used interchangeably.
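The claim that regeneration at each repeater makes a digital link far less sensitive to its length can be illustrated with a toy simulation. This is only a sketch under assumptions that do not come from the abstract: bits are sent as levels of +1 and -1, every hop adds independent Gaussian noise, and the hop count, noise level and function names are all chosen for illustration.

```python
import random

def analog_link(bits, hops, noise_std, rng):
    """Analog-style link: noise added at every hop accumulates, and a
    decision is made only once, at the far end."""
    levels = [1.0 if b else -1.0 for b in bits]
    for _ in range(hops):
        levels = [x + rng.gauss(0.0, noise_std) for x in levels]
    return [1 if x > 0 else 0 for x in levels]

def digital_link(bits, hops, noise_std, rng):
    """Digital link: each repeater makes a binary decision and
    retransmits a clean level, so noise does not accumulate."""
    current = list(bits)
    for _ in range(hops):
        received = [(1.0 if b else -1.0) + rng.gauss(0.0, noise_std)
                    for b in current]
        current = [1 if x > 0 else 0 for x in received]
    return current

def bit_errors(sent, received):
    """Count positions where the received bit differs from the sent bit."""
    return sum(s != r for s, r in zip(sent, received))

# Illustrative settings (assumed, not taken from the source).
rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in range(20000)]
hops, noise_std = 25, 0.2

analog_errors = bit_errors(bits, analog_link(bits, hops, noise_std, rng))
digital_errors = bit_errors(bits, digital_link(bits, hops, noise_std, rng))
print("analog errors:", analog_errors)
print("digital errors:", digital_errors)
```

In this model the regenerated link can still make errors at each hop, so its performance is only approximately independent of length, which is consistent with the "essentially independent" wording above.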
Study of vector quantization algorithms applied to speech signals
Advisor: Fernando José Von Zuben. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Abstract: This work presents a comparative study of three vector quantization algorithms applied to the compression of speech signals: k-means, NG (Neural-Gas) and ARIA. In the compression technique used, the signals are first parameterized and quantized so that they can be stored and/or transmitted. To reconstruct the signal, the quantized vectors are mapped to speech frames, which are then concatenated using a concatenative synthesis technique. This system assumes the existence of a dictionary (codebook) of reference vectors (codevectors), which is used in the coding step, and a dictionary of frames, which is used in the decoding step. These dictionaries are generated by applying a vector quantization algorithm to a training database. In particular, the aim is to evaluate the immune-inspired algorithm called ARIA and its ability to preserve the density of the data distribution. Different sets of parameters are also tested in order to identify the one that produces the best results. Finally, modifications to the ARIA algorithm are proposed, aiming at performance gains in both density preservation and the quality of the synthesized signal.
Degree: Master's, Computer Engineering; Master in Electrical Engineering.
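The codebook-based coding and decoding steps described in the abstract can be sketched with plain k-means, one of the three algorithms it compares. This is a minimal illustration, not the dissertation's implementation: the function names, the toy 2-D vectors standing in for speech parameter vectors, and the iteration count are all assumptions. The frame dictionary and the concatenative synthesis step are omitted, so decoding here simply returns the codevector for each index.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(vector, codebook):
    """Index of the codevector closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

def train_codebook(vectors, size, iterations=20, seed=0):
    """Build a codebook with plain k-means (Lloyd iterations)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest_index(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codevector
                codebook[i] = [sum(col) / len(members)
                               for col in zip(*members)]
    return codebook

def encode(vectors, codebook):
    """Coding step: replace each vector by the index of its codevector."""
    return [nearest_index(v, codebook) for v in vectors]

def decode(indices, codebook):
    """Decoding step (simplified): look each index up in the codebook."""
    return [codebook[i] for i in indices]

# Toy training set: two well-separated groups of 2-D vectors (assumed data).
training_vectors = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                    [10.0, 10.0], [10.2, 9.9], [9.9, 10.1]]
codebook = train_codebook(training_vectors, size=2)
indices = encode(training_vectors, codebook)
reconstructed = decode(indices, codebook)
print(indices)
```

Only the indices need to be stored or transmitted, which is where the compression comes from. In the system the abstract describes, each index would instead select a speech frame from the frame dictionary.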