117 research outputs found

    Speech recognition on DSP: algorithm optimization and performance analysis.

    Yuan Meng. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 85-91). Abstracts in English and Chinese.
    Chapter 1 --- Introduction
    Chapter 1.1 --- History of ASR development
    Chapter 1.2 --- Fundamentals of automatic speech recognition
    Chapter 1.2.1 --- Classification of ASR systems
    Chapter 1.2.2 --- Automatic speech recognition process
    Chapter 1.3 --- Performance measurements of ASR
    Chapter 1.3.1 --- Recognition accuracy
    Chapter 1.3.2 --- Complexity
    Chapter 1.3.3 --- Robustness
    Chapter 1.4 --- Motivation and goal of this work
    Chapter 1.5 --- Thesis outline
    Chapter 2 --- Signal processing techniques for front-end
    Chapter 2.1 --- Basic feature extraction principles
    Chapter 2.1.1 --- Pre-emphasis
    Chapter 2.1.2 --- Frame blocking and windowing
    Chapter 2.1.3 --- Discrete Fourier Transform (DFT) computation
    Chapter 2.1.4 --- Spectral magnitudes
    Chapter 2.1.5 --- Mel-frequency filterbank
    Chapter 2.1.6 --- Logarithm of filter energies
    Chapter 2.1.7 --- Discrete Cosine Transformation (DCT)
    Chapter 2.1.8 --- Cepstral Weighting
    Chapter 2.1.9 --- Dynamic featuring
    Chapter 2.2 --- Practical issues
    Chapter 2.2.1 --- Review of practical problems and solutions in ASR applications
    Chapter 2.2.2 --- Model of environment
    Chapter 2.2.3 --- End-point detection (EPD)
    Chapter 2.2.4 --- Spectral subtraction (SS)
    Chapter 3 --- HMM-based Acoustic Modeling
    Chapter 3.1 --- HMMs for ASR
    Chapter 3.2 --- Output probabilities
    Chapter 3.3 --- Viterbi search engine
    Chapter 3.4 --- Isolated word recognition (IWR) & Connected word recognition (CWR)
    Chapter 3.4.1 --- Isolated word recognition
    Chapter 3.4.2 --- Connected word recognition (CWR)
    Chapter 4 --- DSP for embedded applications
    Chapter 4.1 --- Classification of embedded systems (DSP, ASIC, FPGA, etc.)
    Chapter 4.2 --- Description of hardware platform
    Chapter 4.3 --- I/O operation for real-time processing
    Chapter 4.4 --- Fixed point algorithm on DSP
    Chapter 5 --- ASR algorithm optimization
    Chapter 5.1 --- Methodology
    Chapter 5.2 --- Floating-point to fixed-point conversion
    Chapter 5.3 --- Computational complexity consideration
    Chapter 5.3.1 --- Feature extraction techniques
    Chapter 5.3.2 --- Viterbi search module
    Chapter 5.4 --- Memory requirements consideration
    Chapter 6 --- Experimental results and performance analysis
    Chapter 6.1 --- Cantonese isolated word recognition (IWR)
    Chapter 6.1.1 --- Execution time
    Chapter 6.1.2 --- Memory requirements
    Chapter 6.1.3 --- Recognition performance
    Chapter 6.2 --- Connected word recognition (CWR)
    Chapter 6.2.1 --- Execution time consideration
    Chapter 6.2.2 --- Recognition performance
    Chapter 6.3 --- Summary & discussion
    Chapter 7 --- Implementation of practical techniques
    Chapter 7.1 --- End-point detection (EPD)
    Chapter 7.2 --- Spectral subtraction (SS)
    Chapter 7.3 --- Experimental results
    Chapter 7.3.1 --- Isolated word recognition (IWR)
    Chapter 7.3.2 --- Connected word recognition (CWR)
    Chapter 7.4 --- Results
    Chapter 8 --- Conclusions and future work
    Chapter 8.1 --- Summary and Conclusions
    Chapter 8.2 --- Suggestions for future research
    Appendices
    Chapter A --- Interpolation of data entries without floating point, divides or conditional branches
    Chapter B --- Vocabulary for Cantonese isolated word recognition task
    Bibliography

    Security and privacy problems in voice assistant applications: A survey

    Voice assistant applications have become ubiquitous. Two models provide the two most important functions for real-life applications (e.g., Google Home, Amazon Alexa, Siri): Automatic Speech Recognition (ASR) models and Speaker Identification (SI) models. According to recent studies, security and privacy threats have also emerged with the rapid development of the Internet of Things (IoT). The security issues researched include attack techniques against machine learning models and other hardware components widely used in voice assistant applications. The privacy issues include technical information theft and policy-level privacy breaches. Voice assistant applications take a steadily growing market share every year, but their privacy and security issues continue to cause huge economic losses and endanger users' sensitive personal information. Thus, it is important to have a comprehensive survey that outlines the categorization of current research on the security and privacy problems of voice assistant applications. This paper summarizes and assesses five kinds of security attacks and three types of privacy threats in papers published in top-tier conferences in the cyber security and voice domains.

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes.

    Machine Learning for Microcontroller-Class Hardware -- A Review

    Advancements in machine learning have opened a new opportunity to bring intelligence to low-end Internet-of-Things nodes such as microcontrollers. Conventional machine learning deployment has a high memory and compute footprint, hindering its direct use on ultra-resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure the compute and latency budget is within the device limits while still maintaining the desired performance. We characterize a closed-loop, widely applicable workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into different stages of model development by showcasing several use cases. Finally, we identify the open research challenges and unsolved questions demanding careful consideration moving forward. Comment: Accepted for publication at IEEE Sensors Journal.

    Secure mobile radio communication over narrowband RF channel.

    by Wong Chun Kau, Jolly. Thesis (M.Phil.)--Chinese University of Hong Kong, 1992. Includes bibliographical references (leaves 84-88).
    ABSTRACT
    ACKNOWLEDGEMENT
    Chapter 1. --- INTRODUCTION
    Chapter 1.1 --- Land Mobile Radio (LMR) Communications
    Chapter 1.2 --- Paramilitary Communications Security
    Chapter 1.3 --- Voice Scrambling Methods
    Chapter 1.4 --- Digital Voice Encryption
    Chapter 1.5 --- Digital Secure LMR
    Chapter 2. --- DESIGN GOALS
    Chapter 2.1 --- System Concept and Configuration
    Chapter 2.2 --- Operational Requirements
    Chapter 2.2.1 --- Operating conditions
    Chapter 2.2.2 --- Intelligibility and speech quality
    Chapter 2.2.3 --- Field coverage and transmission delay
    Chapter 2.2.4 --- Reliability and maintenance
    Chapter 2.3 --- Functional Requirements
    Chapter 2.3.1 --- Major system features
    Chapter 2.3.2 --- Cryptographic features
    Chapter 2.3.3 --- Phone patch facility
    Chapter 2.3.4 --- Mobile data capability
    Chapter 2.4 --- Bandwidth Requirements
    Chapter 2.5 --- Bit Error Rate Requirements
    Chapter 3. --- VOICE CODERS
    Chapter 3.1 --- Digital Speech Coding Methods
    Chapter 3.1.1 --- Waveform coding
    Chapter 3.1.2 --- Linear predictive coding
    Chapter 3.1.3 --- Sub-band coding
    Chapter 3.1.4 --- Vocoders
    Chapter 3.2 --- Performance Evaluation
    Chapter 4. --- CRYPTOGRAPHIC CONCERNS
    Chapter 4.1 --- Basic Concepts and Cryptoanalysis
    Chapter 4.2 --- Digital Encryption Techniques
    Chapter 4.3 --- Crypto Synchronization
    Chapter 4.3.1 --- Auto synchronization
    Chapter 4.3.2 --- Initial synchronization
    Chapter 4.3.3 --- Continuous synchronization
    Chapter 4.3.4 --- Hybrid synchronization
    Chapter 5. --- DIGITAL MODULATION
    Chapter 5.1 --- Narrowband Channel Requirements
    Chapter 5.2 --- Narrowband Digital FM
    Chapter 5.3 --- Performance Evaluation
    Chapter 6. --- SYSTEM IMPLEMENTATION
    Chapter 6.1 --- Potential EMC Problems
    Chapter 6.2 --- Frequency Planning
    Chapter 6.3 --- Key Management
    Chapter 6.4 --- Potential Electromagnetic Compatibility (EMC) Problems
    Chapter 7. --- CONCLUSION
    LIST OF ILLUSTRATIONS
    REFERENCES
    APPENDICES
    Chapter I. --- Path Propagation Loss (L) Vs Distance (d)
    Chapter II. --- Speech Quality Assessment Tests performed by Special Duties Unit (SDU)

    Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)

    Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression.

    The Design and Application of an Acoustic Front-End for Use in Speech Interfaces

    This thesis describes the design, implementation, and application of an acoustic front-end. Such front-ends constitute the core of automatic speech recognition systems. The front-end whose development is reported here has been designed for speaker-independent large vocabulary recognition. The emphasis of this thesis is more on design than on application. This work exploits the current state of the art in speech recognition research, for example, the use of Hidden Markov Models. It describes the steps taken to build a speaker-independent large vocabulary system from signal processing, through pattern matching, to language modelling. An acoustic front-end can be considered as a multi-stage process, each stage of which requires the specification of many parameters. Some parameters have fundamental consequences for the ultimate application of the front-end. Therefore, a major part of this thesis is concerned with their analysis and specification. Experiments were carried out to determine the characteristics of individual parameters, the results of which were then used to motivate particular parameter settings. The thesis concludes with some applications that point out not only the power of the resulting acoustic front-end, but also its limitations.