113 research outputs found

    Improved compactly computable objective measures for predicting the acceptability of speech communications systems

    Issued as Monthly status reports [1-7], and Final report, Project no. E-21-61

    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]

    Speech spectrum non-stationarity detection based on line spectrum frequencies and related applications

    Ankara: Department of Electrical and Electronics Engineering and The Institute of Engineering and Sciences of Bilkent University, 1998. Thesis (Master's) -- Bilkent University, 1998. Includes bibliographical references (leaves 124-132). In this thesis, two new speech variation measures for speech spectrum non-stationarity detection are proposed. These measures are based on the Line Spectrum Frequencies (LSF) and the spectral values at the LSF locations. They are formulated to be subjectively meaningful, mathematically tractable, and of low computational complexity. To demonstrate the usefulness of the non-stationarity detector, two applications are presented. The first is an implicit speech segmentation system which detects non-stationary regions in the speech signal and obtains the boundaries of the speech segments. The second is a Variable Bit-Rate Mixed Excitation Linear Predictive (VBR-MELP) vocoder utilizing a novel voice activity detector to detect silent regions in the speech. This voice activity detector is designed to be robust to non-stationary background noise and provides efficient coding of silent sections and unvoiced utterances to decrease the bit-rate. Simulation results are also presented. Ertan, Ali Erdem. M.S.

    Optimisation techniques for low bit rate speech coding

    This thesis extends the background theory of speech and major speech coding schemes used in existing networks to an implementation of GSM full-rate speech compression on a RISC DSP and a multirate application for speech coding. Speech coding is the field concerned with obtaining compact digital representations of speech signals for the purpose of efficient transmission. In this thesis, the background of speech compression, the characteristics of speech signals and the DSP algorithms used have been examined. The current speech coding schemes and requirements have been studied. The Global System for Mobile communication (GSM) is a digital mobile radio system which is extensively used throughout Europe, and also in many other parts of the world. The algorithm is standardised by the European Telecommunications Standards Institute (ETSI). The full-rate and half-rate speech compression of GSM have been analysed. A real-time implementation of the full-rate algorithm has been carried out on the RISC processor GEPARD by Austria Mikro Systeme International (AMS). The GEPARD code has been tested with all of the test sequences provided by ETSI and the results are bit-exact. The transcoding delay is lower than the ETSI requirement. A comparison of the half-rate and full-rate compression algorithms is discussed. Both algorithms offer near-toll speech quality comparable to or better than that of analogue cellular networks. The half-rate compression requires more computationally intensive operations, so a more powerful processor is needed because of the complexity of the code. Hence the cost of implementing the half-rate codec will be considerably higher than that of the full-rate codec. A description of multirate signal processing and its application to speech (SBC) and speech/audio (MPEG) has been given. The possibility of combining multirate filtering with the GSM full-rate speech algorithm has also been investigated.
The results showed that multirate signal processing cannot be directly applied to GSM full-rate speech compression, since this method requires more processing power and causes longer coding delay but does not appreciably improve the bit rate. In order to achieve a lower bit rate, the GSM full-rate mathematical algorithm can be used instead of the standardised ETSI recommendation. Some changes, including the number of quantisation bits, have to be made before the application of multirate signal processing, and a new standard will be required.

    Advanced Television and Signal Processing Program

    Contains an introduction and reports on fifteen research projects. Advanced Television Research Program; Adams-Russell Electronics, Inc.; National Science Foundation Fellowship Grant MIP 87-14969; National Science Foundation Fellowship; U.S. Navy - Office of Naval Research Grant N00014-89-J-1489; U.S. Air Force - Electronic Systems Division Contract F1 9628-89-K-004

    The development and implementation of a single-line intelligent digital telephone answering unit on a personal computer

    Thesis. Commercial telephone answering machines are limited to some extent by one or more of the following factors: limited facilities; difficulty of upgrading; nonstandard telephone interfacing; expense; lack of user-friendliness; lack of dialogue and intelligence. The purpose of this study is to design an intelligent digital telephone system which will overcome as many of the above-mentioned problems as possible. The following features are proposed and will be discussed. The design uses a commonly available, but powerful, personal computer processor and memory instead of the elementary and rigid processor and magnetic tape storage units of the commercial telephone answering machine. This allows the quick storage and retrieval of digitized messages, each with its individual name, time and date stamp. Using the personal computer's hardware and not duplicating the processor and memory units allows a more cost-effective system upgrade. Upgrades mainly consist of software changes and minor hardware changes, so an upgrade does not entail a total hardware redesign. Standards as prescribed by the local switching network standards and the Department of Post and Telecommunications apply to this design and are applicable for licensing of the product. The cost of this project and design is kept minimal by not duplicating expensive components like the microprocessor and the memory units, although these are used in the design. In this respect upgrades are software orientated to further limit the costs. The personal computer is equipped with a display which allows the user to make easy selections in order to execute the required instructions or to obtain information by using the help functions. This real-time help function eliminates the need for a user manual.
Dialogue between user and personal computer over the telephone network offers a simple method of delivering information without the need for any extra equipment such as modems, keyboards or display units. The software used on the personal computer is designed in such a way that the system is intelligent and capable of decision making. Communication from the public telephone network is possible by using the telephone keypad and Dual Tone Multifrequency (DTMF) signalling.
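The DTMF signalling mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not describe. The row and column frequency pairs are the standard DTMF keypad assignment; the 8000 Hz sample rate, the 50 ms tone length, the use of the Goertzel algorithm for detection, and all function names are assumptions chosen for illustration.

```python
import math

# Standard DTMF keypad: each key is the sum of one row (low-group)
# tone and one column (high-group) tone, frequencies in Hz.
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_tone(key, sample_rate=8000, duration=0.05):
    """Synthesise the two-tone signal for one key (illustrative helper)."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c >= 0:
            f_low, f_high = ROW_FREQS[r], COL_FREQS[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration)
    return [
        0.5 * math.sin(2.0 * math.pi * f_low * t / sample_rate)
        + 0.5 * math.sin(2.0 * math.pi * f_high * t / sample_rate)
        for t in range(n)
    ]


def goertzel_power(samples, freq, sample_rate):
    """Goertzel algorithm: signal power at one target frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row tone and strongest column tone."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], sample_rate))
    return KEYPAD[row][col]
```

Calling `detect_key(dtmf_tone("5"))` synthesises the tone pair for key 5 and decodes it again. A practical detector would also need energy thresholds and noise handling, which this sketch omits.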

    Comparison of CELP speech coder with a wavelet method

    This thesis compares the speech quality of the Code Excited Linear Predictor (CELP, Federal Standard 1016) speech coder with that of a new wavelet method for compressing speech. The performance of both is compared by means of subjective listening tests. The test signals used are clean signals (i.e. with no background noise), speech signals with room noise, and speech signals with artificial noise added. Results indicate that for clean signals and signals with predominantly voiced components the CELP standard performs better than the wavelet method, but for signals with room noise the wavelet method performs much better than CELP. For signals with artificial noise added, the results are mixed and depend on the noise level: CELP performs better at low noise levels and the wavelet method performs better at higher noise levels.