8 research outputs found

    Perceptual models in speech quality assessment and coding

    The ever-increasing demand for good communications/toll quality speech has created a renewed interest in the perceptual impact of rate compression. Two general areas are investigated in this work, namely speech quality assessment and speech coding. In the field of speech quality assessment, a model is developed which simulates the processing stages of the peripheral auditory system. At the output of the model a "running" auditory spectrum is obtained. This represents the auditory (spectral) equivalent of any acoustic sound such as speech. Auditory spectra from coded speech segments serve as inputs to a second model. This model simulates the information centre in the brain which performs the speech quality assessment. [Continues.]

    Progressive Source-Channel Coding for Multimedia Transmission over Noisy and Lossy Channels with and without Feedback

    Rate-scalable or layered lossy source-coding is useful for progressive transmission of multimedia sources, where the receiver can reconstruct the source incrementally. This thesis considers "joint source-channel" schemes for such a progressive transmission, in the presence of noise or loss, with and without the use of a feedback link. First we design image communication schemes for memoryless and finite state channels using limited and explicitly constrained use of the feedback channel in the form of a variable incremental redundancy Hybrid ARQ protocol. Constraining feedback allows a direct comparison with schemes without feedback. Optimized feedback-based systems are shown to have useful gains. Second, we develop a controlled Markov chain approach for constrained feedback Hybrid ARQ protocol design. The proposed methodology allows the protocol to be chosen from a collection of signal flow graphs, and also allows explicit control over the tradeoffs in throughput, reliability and complexity. Next we consider progressive image transmission in the absence of feedback. We assign unequal error protection to the bits of a rate-scalable source-coder using rate compatible channel codes. We show that, under the framework, the source and channel bits can be "scheduled" in a single bitstream in such a way that operational optimality is retained for different transmission budgets, creating a rate-scalable joint source-channel coder. Next we undertake the design of a joint source-channel decoder that uses "distortion aware" ACK/NACK feedback generation. For memoryless channels, and Type-I HARQ, the design of optimal ACK/NACK generation and decoding by packet combining is cast and solved as a sequential decision problem. We obtain dynamic programming based optimal solutions and also propose suboptimal, lower complexity distortion-aware decoders and feedback generation rules which outperform conventional BER based rules such as CRC-check. Finally we design operational rate-distortion optimal ACK/NACK feedback generation rules for transmitting a tree structured quantizer over a memoryless channel. We show that the optimal feedback generation rules are embedded, that is, they allow incremental switching to higher rates during the transmission. Also, we obtain the structure of the feedback generation rules in terms of a feedback threshold function that simplifies the implementation.

    FeedNetBack - D03.02 - Control Subject to Transmission Constraints, With Transmission Errors

    This is a Deliverable Report for the FeedNetBack project (www.feednetback.eu). It describes the research performed within Work Package 3, Task 3.2 (Control Subject to Transmission Constraints, with Transmission Errors), in the first 36 months of the project. It targets the issue of control subject to transmission constraints with transmission errors. This research concerns problems arising from the presence of a noisy communication channel (specified and modeled at the physical layer) within the control loop. The resulting constraints include finite capacities in the transmission of the sensor and/or actuator signals and transmission errors. Our focus is on designing new compression and coding techniques to support networked control in this scenario. This Deliverable extends the analysis provided in the companion Deliverable D03.01 to deal with the effects of noise in the communication channel. The quantization schemes described in D03.01, in particular the adaptive ones, might be very sensitive to the presence of even a few errors. Indeed, error-correction coding for estimation or control purposes cannot simply exploit classical coding theory and practice, where vanishing error probability is obtained only in the limit of infinite block-length. A first contribution reported in this Deliverable is the construction of families of codes having the any-time property required in this setting, and the analysis of the trade-off between code complexity and performance. Our results consider the binary erasure channel, and can be extended to more general binary-input output-symmetric memoryless channels. The second and third contributions reported in this Deliverable deal with the problem of remotely stabilizing linear time invariant (LTI) systems over Gaussian channels. Specifically, in the second contribution we consider a single LTI system which has to be stabilized by a remote controller using a network of sensors having average transmit power constraints. We study basic sensor network topologies and provide necessary and sufficient conditions for mean square stabilization. Then in the third contribution, we extend our study to two LTI systems which are to be simultaneously stabilized. In this regard, we study the setups of joint and separate sensing and control. By joint sensing we mean that there exists a common sensor node to simultaneously transmit the sensed state processes of the two plants, and by joint control we mean that there is a common controller for both plants. We name these setups as: i) control over multiple-access channel (separate sensors, joint controller setup), ii) control over broadcast channel (common sensor, separate controllers setup), and iii) control over interference channel (separate sensors, separate controllers). We propose to use delay-free linear schemes for these setups and thus obtain sufficient conditions for mean square stabilization. Then, we discuss the joint design of the encoder and the controller. We propose an iterative design procedure for a joint design of the sensor measurement quantization, channel error protection, and controller actuation, with the objective to minimize the expected linear quadratic cost over a finite horizon. Finally, as in the noiseless case, we address the issues that arise when not only one plant and one controller are communicating through a channel, but there is a whole network of sensors and actuators. We consider the effects of digital noisy channels on the consensus algorithm, and we present an algorithm which exploits the any-time codes discussed above.

    The Telecommunications and Data Acquisition Report

    This publication, one of a series formerly titled The Deep Space Network Progress Report, documents DSN progress in flight project support, tracking and data acquisition research and technology, network engineering, hardware and software implementation, and operations. In addition, developments in Earth-based radio technology as applied to geodynamics, astrophysics and the radio search for extraterrestrial intelligence are reported.

    Support Vector Machines for Speech Recognition

    Hidden Markov models (HMM) with Gaussian mixture observation densities are the dominant approach in speech recognition. These systems typically use a representational model for acoustic modeling which can often be prone to overfitting and does not translate to improved discrimination. We propose a new paradigm centered on principles of structural risk minimization using a discriminative framework for speech recognition based on support vector machines (SVMs). SVMs have the ability to simultaneously optimize the representational and discriminative ability of the acoustic classifiers. We have developed the first SVM-based large vocabulary speech recognition system that improves performance over traditional HMM-based systems. This hybrid system achieves a state-of-the-art word error rate of 10.6% on a continuous alphadigit task, a 10% improvement relative to an HMM system. On SWITCHBOARD, a large vocabulary task, the system improves performance over a traditional HMM system from 41.6% word error rate to 40.6%. This dissertation discusses several practical issues that arise when SVMs are incorporated into the hybrid system.
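
The abstract above mixes a relative figure (the 10% improvement on alphadigits) with absolute word error rates (SWITCHBOARD). As a minimal sketch of how the two relate, the Python snippet below computes the relative reduction for the SWITCHBOARD numbers. The function name `relative_wer_reduction` is hypothetical. The implied alphadigit baseline is an inference that assumes "10% improvement" means a 10% relative reduction in word error rate; the abstract does not state that baseline.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# SWITCHBOARD figures quoted in the abstract: 41.6% -> 40.6% WER.
swb_relative = relative_wer_reduction(41.6, 40.6)
print(f"SWITCHBOARD: {41.6 - 40.6:.1f} points absolute, {swb_relative:.1%} relative")

# Alphadigits: 10.6% WER, described as a 10% relative improvement.
# ASSUMPTION: the HMM baseline is not given; this back-computes it
# under the reading that the 10% is a relative WER reduction.
implied_baseline = 10.6 / (1 - 0.10)
print(f"Implied alphadigit HMM baseline: {implied_baseline:.1f}% WER")
```

Under these assumptions the one-point absolute gain on SWITCHBOARD corresponds to roughly a 2.4% relative reduction, noticeably smaller than the relative gain reported on the alphadigit task.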

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes.