28 research outputs found

    How does a dictation machine recognize speech?

    There is magic (or is it witchcraft?) in a speech recognizer that transcribes continuous radio speech into text, even with a word accuracy of no more than 50%. The extreme difficulty of this task, though, is usually not perceived by the general public. This is because we are almost deaf to the infinite acoustic variations that accompany the production of vocal sounds, which arise from physiological constraints (co-articulation), but also from the acoustic environment (additive or convolutional noise, Lombard effect), or from the emotional state of the speaker (voice quality, speaking rate, hesitations, etc.). Our consciousness of speech is indeed not stimulated until after it has been processed by our brain to make it appear as a sequence of meaningful units: phonemes and words. In this chapter we will see how statistical pattern recognition and statistical sequence recognition techniques are currently used to try to mimic this extraordinary faculty of our mind (Section 4.1). We will follow, in Section 4.2, with a MATLAB-based proof of concept of word-based automatic speech recognition (ASR) based on Hidden Markov Models (HMMs), using a bigram model to capture (syntactic-semantic) language constraints.
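
The bigram language model mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the chapter's MATLAB, and the function names, the add-one smoothing, and the toy corpus are all assumptions made for illustration, not details taken from the chapter.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    contexts, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        contexts.update(tokens[:-1])                 # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    # Words that can follow a context: all corpus words plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one smoothing (an assumed smoothing choice)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

# Toy corpus, invented for illustration only.
contexts, bigrams, vocab = train_bigram([["the", "cat"], ["the", "dog"]])
p_cat = bigram_prob("the", "cat", contexts, bigrams, vocab)
total = sum(bigram_prob("the", w, contexts, bigrams, vocab) for w in vocab)
```

In a recognizer, probabilities of this kind would typically weight the transitions between word models, so that word sequences the language model finds unlikely are penalized during decoding.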

    Acoustic and device feature fusion for load recognition

    Appliance-specific Load Monitoring (LM) offers a possible solution to the problem of energy conservation, which is becoming increasingly challenging due to growing energy demands within offices and residential spaces. Automatic appliance recognition and monitoring is essential for optimal resource utilization. In this paper, we study the use of non-intrusive LM methods that rely on steady-state appliance signatures for classifying the most commonly used office appliances, and we demonstrate their limitation in accurately discerning low-power devices, whose load signatures overlap. We propose a multi-layer decision architecture that uses audio features derived from device sounds and fuses them with load signatures acquired from an energy meter. For the recognition of device sounds, we perform feature set selection by evaluating combinations of time-domain and FFT-based audio features on state-of-the-art machine learning algorithms. Further, we demonstrate that our proposed feature set, which is a concatenation of the device audio features and the load signature, significantly improves device recognition accuracy in comparison to the use of steady-state load signatures only.
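
The feature concatenation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the nearest-neighbour classifier stands in for whichever learning algorithms the paper evaluates, and the device names, feature values, and function names are all made up.

```python
import math

def fuse(audio_features, load_signature):
    """Feature-level fusion: concatenate the two vectors into one."""
    return list(audio_features) + list(load_signature)

def nearest_label(query, examples):
    """Return the label of the closest training vector (stand-in classifier)."""
    closest = min(examples, key=lambda ex: math.dist(query, ex[0]))
    return closest[1]

# Made-up toy values: two low-power devices whose load signatures are
# identical (0.05) but whose audio features differ.
train = [
    (fuse([0.10, 0.80], [0.05]), "monitor"),
    (fuse([0.70, 0.20], [0.05]), "printer"),
]
query = fuse([0.65, 0.25], [0.05])
label = nearest_label(query, train)
```

With the load signature alone the two toy devices would be indistinguishable; the appended audio dimensions are what separate them, which is the intuition behind the fused feature set.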

    Continuous Speech Recognition Using Dynamic Bayesian Networks : A Fast Decoding Algorithm

    Contribution to a book. State-of-the-art automatic speech recognition systems are based on probabilistic modeling of the speech signal using Hidden Markov Models (HMMs). Recent work has focused on using the dynamic Bayesian network (DBN) framework to construct new acoustic models that overcome the limitations of HMM-based systems. In this line of research, we proposed a methodology to learn the conditional independence assertions of acoustic models based on structural learning of DBNs. In previous work, we evaluated this approach on simple isolated and connected digit recognition tasks. In this paper we evaluate our approach on a more complex task: continuous phoneme recognition. For this purpose, we propose a new decoding algorithm based on dynamic programming. The proposed algorithm decreases the computational complexity of decoding and hence enables the application of the approach to complex speech recognition tasks.
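
For orientation, dynamic-programming decoding can be illustrated with the textbook Viterbi algorithm for a plain HMM. This is not the DBN decoding algorithm proposed in the paper, whose details the abstract does not give; the two-state model and all of its probabilities below are invented toy values.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Most likely state path for an HMM, computed in the log domain."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

# Invented toy model: state A mostly emits "x", state B mostly emits "y".
lg = math.log
states = ["A", "B"]
log_start = {"A": lg(0.5), "B": lg(0.5)}
log_trans = {"A": {"A": lg(0.9), "B": lg(0.1)}, "B": {"A": lg(0.1), "B": lg(0.9)}}
log_emit = {"A": {"x": lg(0.9), "y": lg(0.1)}, "B": {"x": lg(0.1), "y": lg(0.9)}}
path, score = viterbi(["x", "x", "y", "y"], states, log_start, log_trans, log_emit)
```

The table `best` is what makes this dynamic programming: each time step reuses the best scores of the previous step instead of enumerating every possible state sequence, so the cost grows linearly with the length of the observation sequence.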