14 research outputs found
Optimisation techniques for low bit rate speech coding
This thesis extends the background theory of speech and major speech coding schemes used in existing networks to an implementation of GSM full-rate speech compression on a RISC DSP and a multirate application for speech coding. Speech coding is the field concerned with obtaining compact digital representations of speech signals for the purpose of efficient transmission. In this thesis, the background of speech compression, characteristics of speech signals and the DSP algorithms used have been examined. The current speech coding schemes and requirements have been studied. The Global System for Mobile communication (GSM) is a digital mobile radio system which is extensively used throughout Europe, and also in many other parts of the world. The algorithm is standardised by the European Telecommunications Standardisation histitute (ETSI). The full-rate and half-rate speech compression of GSM have been analysed. A real time implementation of the full-rate algorithm has been carried out on a RISC processor GEPARD by Austria Mikro Systeme International (AMS). The GEPARD code has been tested with all of the test sequences provided by ETSI and the results are bit-exact. The transcoding delay is lower than the ETSI requirement. A comparison of the half-rate and full-rate compression algorithms is discussed. Both algorithms offer near toll speech quality comparable or better than analogue cellular networks. The half-rate compression requires more computationally intensive operations and therefore a more powerful processor will be needed due to the complexity of the code. Hence the cost of the implementation of half-rate codec will be considerably higher than full-rate. A description of multirate signal processing and its application on speech (SBC) and speech/audio (MPEG) has been given. An investigation into the possibility of combining multirate filtering and GSM fill-rate speech algorithm. The results showed that multirate signal processing cannot be directly applied GSM full-rate speech compression since this method requires more processing power, causing longer coding delay but did not appreciably improve the bit rate. In order to achieve a lower bit rate, the GSM full-rate mathematical algorithm can be used instead of the standardised ETSI recommendation. Some changes including the number of quantisation bits has to be made before the application of multirate signal processing and a new standard will be required
Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)
Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression
A Parametric Approach for Efficient Speech Storage, Flexible Synthesis and Voice Conversion
During the past decades, many areas of speech processing have benefited from the vast increases in the available memory sizes and processing power. For example, speech recognizers can be trained with enormous speech databases and high-quality speech synthesizers can generate new speech sentences by concatenating speech units retrieved from a large inventory of speech data. However, even in today's world of ever-increasing memory sizes and computational resources, there are still lots of embedded application scenarios for speech processing techniques where the memory capacities and the processor speeds are very limited. Thus, there is still a clear demand for solutions that can operate with limited resources, e.g., on low-end mobile devices.
This thesis introduces a new segmental parametric speech codec referred to as the VLBR codec. The novel proprietary sinusoidal speech codec designed for efficient speech storage is capable of achieving relatively good speech quality at compression ratios beyond the ones offered by the standardized speech coding solutions, i.e., at bitrates of approximately 1 kbps and below. The efficiency of the proposed coding approach is based on model simplifications, mode-based segmental processing, and the method of adaptive downsampling and quantization. The coding efficiency is also further improved using a novel flexible multi-mode matrix quantizer structure and enhanced dynamic codebook reordering. The compression is also facilitated using a new perceptual irrelevancy removal method.
The VLBR codec is also applied to text-to-speech synthesis. In particular, the codec is utilized for the compression of unit selection databases and for the parametric concatenation of speech units. It is also shown that the efficiency of the database compression can be further enhanced using speaker-specific retraining of the codec. Moreover, the computational load is significantly decreased using a new compression-motivated scheme for very fast and memory-efficient calculation of concatenation costs, based on techniques and implementations used in the VLBR codec.
Finally, the VLBR codec and the related speech synthesis techniques are complemented with voice conversion methods that allow modifying the perceived speaker identity which in turn enables, e.g., cost-efficient creation of new text-to-speech voices. The VLBR-based voice conversion system combines compression with the popular Gaussian mixture model based conversion approach. Furthermore, a novel method is proposed for converting the prosodic aspects of speech. The performance of the VLBR-based voice conversion system is also enhanced using a new approach for mode selection and through explicit control of the degree of voicing.
The solutions proposed in the thesis together form a complete system that can be utilized in different ways and configurations. The VLBR codec itself can be utilized, e.g., for efficient compression of audio books, and the speech synthesis related methods can be used for reducing the footprint and the computational load of concatenative text-to-speech synthesizers to levels required in some embedded applications. The VLBR-based voice conversion techniques can be used to complement the codec both in storage applications and in connection with speech synthesis. It is also possible to only utilize the voice conversion functionality, e.g., in games or other entertainment applications
A configurable vector processor for accelerating speech coding algorithms
The growing demand for voice-over-packer (VoIP) services and multimedia-rich
applications has made increasingly important the efficient, real-time implementation of
low-bit rates speech coders on embedded VLSI platforms. Such speech coders are
designed to substantially reduce the bandwidth requirements thus enabling dense multichannel
gateways in small form factor. This however comes at a high computational cost
which mandates the use of very high performance embedded processors.
This thesis investigates the potential acceleration of two major ITU-T speech coding
algorithms, namely G.729A and G.723.1, through their efficient implementation on a
configurable extensible vector embedded CPU architecture. New scalar and vector ISAs
were introduced which resulted in up to 80% reduction in the dynamic instruction count
of both workloads. These instructions were subsequently encapsulated into a parametric,
hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research
and implementation of the vector datapath of this vector coprocessor which is tightly-coupled
to a Sparc-V8 compliant CPU, the optimization and simulation methodologies
employed and the use of Electronic System Level (ESL) techniques to rapidly design
SIMD datapaths
Proceedings of the Third International Mobile Satellite Conference (IMSC 1993)
Satellite-based mobile communications systems provide voice and data communications to users over a vast geographic area. The users may communicate via mobile or hand-held terminals, which may also provide access to terrestrial cellular communications services. While the first and second International Mobile Satellite Conferences (IMSC) mostly concentrated on technical advances, this Third IMSC also focuses on the increasing worldwide commercial activities in Mobile Satellite Services. Because of the large service areas provided by such systems, it is important to consider political and regulatory issues in addition to technical and user requirements issues. Topics covered include: the direct broadcast of audio programming from satellites; spacecraft technology; regulatory and policy considerations; advanced system concepts and analysis; propagation; and user requirements and applications
Quality aspects of Internet telephony
Internet telephony has had a tremendous impact on how people communicate.
Many now maintain contact using some form of Internet telephony.
Therefore the motivation for this work has been to address the quality aspects
of real-world Internet telephony for both fixed and wireless telecommunication.
The focus has been on the quality aspects of voice communication,
since poor quality leads often to user dissatisfaction. The scope of the work
has been broad in order to address the main factors within IP-based voice
communication.
The first four chapters of this dissertation constitute the background
material. The first chapter outlines where Internet telephony is deployed
today. It also motivates the topics and techniques used in this research.
The second chapter provides the background on Internet telephony including
signalling, speech coding and voice Internetworking. The third chapter
focuses solely on quality measures for packetised voice systems and finally
the fourth chapter is devoted to the history of voice research.
The appendix of this dissertation constitutes the research contributions.
It includes an examination of the access network, focusing on how calls are
multiplexed in wired and wireless systems. Subsequently in the wireless
case, we consider how to handover calls from 802.11 networks to the cellular
infrastructure. We then consider the Internet backbone where most of our
work is devoted to measurements specifically for Internet telephony. The
applications of these measurements have been estimating telephony arrival
processes, measuring call quality, and quantifying the trend in Internet telephony
quality over several years. We also consider the end systems, since
they are responsible for reconstructing a voice stream given loss and delay
constraints. Finally we estimate voice quality using the ITU proposal PESQ
and the packet loss process.
The main contribution of this work is a systematic examination of Internet
telephony. We describe several methods to enable adaptable solutions
for maintaining consistent voice quality. We have also found that relatively
small technical changes can lead to substantial user quality improvements.
A second contribution of this work is a suite of software tools designed to
ascertain voice quality in IP networks. Some of these tools are in use within
commercial systems today
Effect of water on electrical properties of Refined, Bleached, and Deodorized Palm Oil (RBDPO) as electrical insulating material
This paper describes the properties of refined, bleached, deodorized palm oil (RBDPO) as having the potential to be used as insulating liquid. There are several important properties such as electrical breakdown, dielectric dissipation factor, specific gravity, flash point, viscosity and pour point of RBDPO that was measured and compared to commercial mineral oil which is largely in current use as insulating liquid in power transformers. Experimental results of the electrical properties revealed that the average breakdown voltage of the RBDPO sample, without the addition of water at room temperature, is 13.368 kV. The result also revealed that due to effect of water, the breakdown voltage is lower than that of commercial mineral oil (Hyrax). However, the flash point and the pour point of RBDPO is very high compared to mineral oil thus giving it advantageous possibility to be used safely as insulating liquid. The results showed that RBDPO is greatly influenced by water, causing the breakdown voltage to decrease and the dissipation factor to increase; this is attributable to the high amounts of dissolved water
Cumulative index to NASA Tech Briefs, 1986-1990, volumes 10-14
Tech Briefs are short announcements of new technology derived from the R&D activities of the National Aeronautics and Space Administration. These briefs emphasize information considered likely to be transferrable across industrial, regional, or disciplinary lines and are issued to encourage commercial application. This cumulative index of Tech Briefs contains abstracts and four indexes (subject, personal author, originating center, and Tech Brief number) and covers the period 1986 to 1990. The abstract section is organized by the following subject categories: electronic components and circuits, electronic systems, physical sciences, materials, computer programs, life sciences, mechanics, machinery, fabrication technology, and mathematics and information sciences