5 research outputs found
Acoustic Beam forming and Speech Recognition using Microphone Array
This report presents work on array signal processing for microphone-array beamforming and its use with the NI PCI 4461 data acquisition system. Microphone arrays have great potential in practical speech-processing applications because they provide both noise robustness and hands-free signal acquisition. Sound and vibration analysis requires a data acquisition system consisting of sensors, DAQ hardware, and a processor with programmable software; here, the NI PCI 4461 system is used to study sound with two microphones. The report also presents work on a basic speech recognition process in which the speaker is verified through a training phase and a testing phase.
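The abstract does not say which beamformer the report uses. As an illustration only, the sketch below shows delay-and-sum beamforming, the simplest common choice, for two microphones. The function names, the 343 m/s speed of sound, the 16 kHz sample rate, and the synthetic tone are all assumptions made for this example, and delays are restricted to non-negative whole samples.

```python
import math

def steering_delay_samples(spacing_m, angle_deg, fs_hz, c=343.0):
    """Inter-microphone delay, in whole samples, for a far-field plane wave.

    Hypothetical helper: assumes a two-element array and sound speed c in m/s.
    """
    return round(spacing_m * math.sin(math.radians(angle_deg)) / c * fs_hz)

def delay_and_sum(channels, delays):
    """Advance each channel by its (non-negative) sample delay, then average."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(n)
    ]

# Synthetic check: a 500 Hz tone reaches mic 1 three samples after mic 0.
fs = 16000
tone = [math.sin(2 * math.pi * 500 * t / fs) for t in range(400)]
mic0 = tone
mic1 = [0.0, 0.0, 0.0] + tone[:-3]

# Steering with the matching delays re-aligns the channels before averaging.
output = delay_and_sum([mic0, mic1], [0, 3])
```

When the steering delays match the true arrival delays, the two channels add coherently and the output reproduces the source; signals arriving from other directions are misaligned and partially cancel.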
The Challenge of Spoken Language Systems: Research Directions for the Nineties
A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support, and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.
STATISTICAL MODELS FOR CONSTANT FALSE-ALARM RATE THRESHOLD ESTIMATION IN SOUND SOURCE DETECTION SYSTEMS
Constant False Alarm Rate (CFAR) Processors are important for applications where thousands of detection tests are made per second, such as in radar. This thesis introduces a new method for CFAR threshold estimation that is particularly applicable to sound source detection with distributed microphone systems. The novel CFAR Processor exploits the near symmetry about 0 for the acoustic pixel values created by steered-response coherent power in conjunction with a partial whitening preprocessor to estimate thresholds for positive values, which represent potential targets.
To remove the low-frequency components responsible for degrading CFAR performance, fixed and adaptive high-pass filters are applied. A relation between the minimum high-pass cut-off frequency and the microphone geometry is proposed and tested.
Experimental results for linear, perimeter, and planar arrays show that, for desired false-alarm (FA) probabilities ranging from 10^-1 to 10^-6, good CFAR performance can be achieved by modeling the coherent power with chi-square and Weibull distributions, and that the ratio of desired to experimental FA probabilities can be kept within an order of magnitude.
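The abstract does not give the thesis's estimator, so the following is only a rough sketch of the idea it describes: if pixel values are nearly symmetric about 0, the negative values (noise only) can be mirrored to model the positive noise tail, and a threshold can be read off a fitted Weibull distribution. The function names, the least-squares fit on a Weibull plot, and the synthetic data are assumptions for illustration; the steered-response coherent power computation and the whitening step are not shown.

```python
import math
import random

def fit_weibull(samples):
    """Least-squares fit of Weibull shape k and scale lam on a Weibull plot.

    Uses ln(-ln(1 - F)) = k*ln(x) - k*ln(lam) with plotting positions
    F = (i + 0.5) / n. All samples must be strictly positive.
    """
    xs = sorted(samples)
    n = len(xs)
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - (i + 0.5) / n)) for i in range(n)]
    mu, mv = sum(u) / n, sum(v) / n
    k = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / sum((a - mu) ** 2 for a in u)
    lam = math.exp(mu - mv / k)
    return k, lam

def cfar_threshold(pixels, p_fa):
    """Threshold T with P(noise > T) ~= p_fa, estimated from mirrored negatives."""
    mirrored = [-p for p in pixels if p < 0]
    k, lam = fit_weibull(mirrored)
    # Weibull survival function: exp(-(T / lam) ** k) = p_fa
    return lam * (-math.log(p_fa)) ** (1.0 / k)

# Synthetic noise-only image: Weibull magnitudes (scale 1, shape 2), random sign.
random.seed(0)
pixels = [random.choice((-1, 1)) * random.weibullvariate(1.0, 2.0)
          for _ in range(20000)]

threshold = cfar_threshold(pixels, p_fa=1e-2)
positives = [p for p in pixels if p > 0]
observed_fa = sum(p > threshold for p in positives) / len(positives)
```

On noise-only data, `observed_fa` should land near the requested `p_fa`; comparing the two is the same desired-over-experimental ratio the abstract reports.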
Studies on noise robust automatic speech recognition
Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK