A Retrospective Assessment of Fuzzy Logic Applications in Voice Communications and Speech Analytics
Voice and speech communication is a major topic that simultaneously covers ‘communication’, ‘control’ (because it often involves control in the coding algorithms), and ‘computing’, from speech analysis and recognition to speech analytics and speech coding over communication channels. While fuzzy logic was specifically conceived to deal with language and reasoning, it has so far seen only limited use in this field. We discuss some of the main current applications from the perspective of half a century since the inception of fuzzy logic.
Internet Telephony : optimizing protocols, packet recovery, and packet size
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (leaves 104-106). By Grant Ho. M.Eng.
Proceedings of the Mobile Satellite Conference
A satellite-based mobile communications system provides voice and data communications to mobile users over a vast geographic area. The technical and service characteristics of mobile satellite systems (MSSs) are presented and form an in-depth view of the current MSS status at the system and subsystem levels. Major emphasis is placed on current and future developments in the following critical MSS technology areas: vehicle antennas, networking, modulation and coding, speech compression, channel characterization, space segment technology, and MSS experiments. The mobile satellite communications needs of government agencies are also addressed, as is the potential of MSS to fulfill them.
Perceptual models in speech quality assessment and coding
The ever-increasing demand for good communications/toll
quality speech has created a renewed interest in the
perceptual impact of rate compression. Two general areas are
investigated in this work, namely speech quality assessment
and speech coding.
In the field of speech quality assessment, a model is
developed which simulates the processing stages of the
peripheral auditory system. At the output of the model a
"running" auditory spectrum is obtained. This represents
the auditory (spectral) equivalent of any acoustic sound such
as speech. Auditory spectra from coded speech segments serve
as inputs to a second model. This model simulates the
information centre in the brain which performs the speech
quality assessment. [Continues.]
Joint estimation of vocal tract and source parameters of a speech production model
This thesis describes algorithms developed to jointly estimate vocal tract shapes and source signals from real speech. The methodology was developed and evaluated using simple articulatory models of the vocal tract, coupled with lumped parametric models of the loss mechanisms in the tract.
The vocal tract is modelled by a five-parameter area function model [Lin, 1990]. Energy losses due to wall vibration and glottal resistance are modelled as a pole-zero filter placed at the glottis. A model described in [Laine, 1982] is used to approximate the lip radiation characteristic.
An articulatory-to-acoustic "linked codebook" of approximately 1600 shapes is generated and exhaustively searched to estimate the vocal tract parameters.
Glottal waveforms (input signals) are obtained by inverse filtering real speech using the estimated vocal tract parameters. The inverse filter is constructed using the estimated area function. A new method is proposed to fit the Liljencrants-Fant glottal flow model [Fant, Liljencrants and Lin, 1985] to the inverse filtered signals. Estimates of the parameters are found from both the inverse filtered signal and its derivative.
The described model successfully estimates articulatory parameters for artificial speech waveforms. Tests on recorded vowels suggest that the technique is applicable to real speech.
The technique has applications in the development of natural sounding speech synthesis, the treatment of speech disorders and the reduction of data bit rates in speech coding.
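The exhaustive search of an articulatory-to-acoustic linked codebook mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which acoustic features or distance measure the thesis uses, so the formant-frequency features, the squared Euclidean distance, the names `Entry`, `squared_distance` and `search_codebook`, and all numeric values below are assumptions chosen purely for illustration.

```python
# Hypothetical sketch of an exhaustive "linked codebook" search.
# Not the thesis's implementation: features, distance and data are assumed.
from typing import List, Sequence, Tuple

# Each entry links an articulatory shape (five model parameters) to the
# acoustic features it produces (assumed here: formants F1-F3 in Hz).
Entry = Tuple[Tuple[float, ...], Tuple[float, ...]]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_codebook(codebook: List[Entry], target: Sequence[float]) -> Entry:
    """Visit every entry and return the one acoustically closest to target."""
    if not codebook:
        raise ValueError("codebook is empty")
    return min(codebook, key=lambda entry: squared_distance(entry[1], target))


# Toy codebook with made-up values; the abstract mentions about 1600 shapes.
codebook = [
    ((1.0, 0.5, 0.2, 0.8, 0.3), (730.0, 1090.0, 2440.0)),
    ((0.4, 0.9, 0.6, 0.2, 0.7), (270.0, 2290.0, 3010.0)),
    ((0.7, 0.3, 0.9, 0.5, 0.1), (300.0, 870.0, 2240.0)),
]

# Acoustic features measured from a speech frame (made-up values).
shape, formants = search_codebook(codebook, (310.0, 900.0, 2200.0))
print(shape)
```

With a codebook of this size, a linear scan over all entries is cheap, which may be one reason an exhaustive search is practical for roughly 1600 shapes.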
A configurable vector processor for accelerating speech coding algorithms
The growing demand for voice-over-packet (VoIP) services and multimedia-rich
applications has made the efficient, real-time implementation of
low-bit-rate speech coders on embedded VLSI platforms increasingly important. Such speech coders are
designed to substantially reduce bandwidth requirements, thus enabling dense multichannel
gateways in a small form factor. This, however, comes at a high computational cost,
which mandates the use of very high-performance embedded processors.
This thesis investigates the potential acceleration of two major ITU-T speech coding
algorithms, namely G.729A and G.723.1, through their efficient implementation on a
configurable extensible vector embedded CPU architecture. New scalar and vector ISAs
were introduced which resulted in up to 80% reduction in the dynamic instruction count
of both workloads. These instructions were subsequently encapsulated into a parametric,
hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research
and implementation of the vector datapath of this vector coprocessor which is tightly-coupled
to a Sparc-V8 compliant CPU, the optimization and simulation methodologies
employed and the use of Electronic System Level (ESL) techniques to rapidly design
SIMD datapaths.