456 research outputs found

    A REVIEW ON VOICE ACTIVITY DETECTION AND MEL-FREQUENCY CEPSTRAL COEFFICIENTS FOR SPEAKER RECOGNITION (TREND ANALYSIS)

    Get PDF
    ABSTRACTObjective: The objective of this review article is to give a complete review of various techniques that are used for speech recognition purposes overtwo decades.Methods: VAD-Voice Activity Detection, SAD-Speech Activity Detection techniques are discussed that are used to distinguish voiced from unvoicedsignals and MFCC- Mel Frequency Cepstral Coefficient technique is discussed which detects specific features.Results: The review results show that research in MFCC has been dominant in signal processing in comparison to VAD and other existing techniques.Conclusion: A comparison of different speaker recognition techniques that were used previously were discussed and those in current research werealso discussed and a clear idea of the better technique was identified through the review of multiple literature for over two decades.Keywords: Cepstral analysis, Mel-frequency cepstral coefficients, signal processing, speaker recognition, voice activity detection

    Design of automatic speech recognition in noisy environments enhancement and modification

    Get PDF
    Recurrent neural networks (RNN) and feed-forward multi-layer perceptron’s have been proposed for determining the absence and presence of speech in continuous voice signals when there is a variety of background noise levels present. The Aurora2 and Aurora3 were used to conduct detailed performance evaluations on vocal activity detection. When a Recurrent neural network feeds on automatic speech recognition particular features and acoustic features, the best outcomes can be achieved, according to this study. Aurora2 and the French, Romanian and Norway portions of the Aurora3 corpus are also proposed for detailed studies of ASR. When noise presence probability is utilized to change for encoding speech, phone subsequent probabilities are employed; the WER is reduced by 10.3 percent

    Spread spectrum-based video watermarking algorithms for copyright protection

    Get PDF
    Merged with duplicate record 10026.1/2263 on 14.03.2017 by CS (TIS)Digital technologies know an unprecedented expansion in the last years. The consumer can now benefit from hardware and software which was considered state-of-the-art several years ago. The advantages offered by the digital technologies are major but the same digital technology opens the door for unlimited piracy. Copying an analogue VCR tape was certainly possible and relatively easy, in spite of various forms of protection, but due to the analogue environment, the subsequent copies had an inherent loss in quality. This was a natural way of limiting the multiple copying of a video material. With digital technology, this barrier disappears, being possible to make as many copies as desired, without any loss in quality whatsoever. Digital watermarking is one of the best available tools for fighting this threat. The aim of the present work was to develop a digital watermarking system compliant with the recommendations drawn by the EBU, for video broadcast monitoring. Since the watermark can be inserted in either spatial domain or transform domain, this aspect was investigated and led to the conclusion that wavelet transform is one of the best solutions available. Since watermarking is not an easy task, especially considering the robustness under various attacks several techniques were employed in order to increase the capacity/robustness of the system: spread-spectrum and modulation techniques to cast the watermark, powerful error correction to protect the mark, human visual models to insert a robust mark and to ensure its invisibility. The combination of these methods led to a major improvement, but yet the system wasn't robust to several important geometrical attacks. In order to achieve this last milestone, the system uses two distinct watermarks: a spatial domain reference watermark and the main watermark embedded in the wavelet domain. By using this reference watermark and techniques specific to image registration, the system is able to determine the parameters of the attack and revert it. Once the attack was reverted, the main watermark is recovered. The final result is a high capacity, blind DWr-based video watermarking system, robust to a wide range of attacks.BBC Research & Developmen
    • …
    corecore