20 research outputs found

    An interference-free representation of group delay for periodic signals

    Abstract: This article introduces a new group delay representation for periodic signals. The proposed method yields a group delay representation that is free from interference caused by repetitive excitation. A power-spectrum-weighted average of the group delay, computed from shifted copies of the weighted group delay separated by half the fundamental frequency, is proven to have the desired property.
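
For readers unfamiliar with the quantity being represented, the following is a minimal sketch of the standard textbook group delay of a discrete signal, computed per DFT bin as Re(DFT(n·x[n]) / DFT(x[n])). It is not the interference-free method proposed in the article, whose weighting and shifting details are not given in the abstract. The function names `dft` and `group_delay` and the `eps` guard are assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(N^2)), fine for short examples."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_delay(x, eps=1e-12):
    """Group delay in samples at each DFT bin.

    Uses the identity tau = -d(phase)/d(omega) = Re(DFT(n * x[n]) / DFT(x[n])).
    Bins where the spectrum is (near) zero are returned as NaN, because the
    phase is undefined there. `eps` is an illustrative threshold.
    """
    spectrum = dft(x)
    ramped = dft([t * v for t, v in enumerate(x)])
    delays = []
    for a, b in zip(spectrum, ramped):
        if abs(a) < eps:
            delays.append(float("nan"))
        else:
            delays.append((b / a).real)
    return delays


# A unit impulse delayed by 3 samples has a constant group delay of 3 samples.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
delays = group_delay(impulse)
```

For a periodic signal, this plain definition is disturbed by the repetitive excitation, which is the interference the article sets out to remove.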

    Operating a Public Spoken Guidance System in Real Environment

    INTERSPEECH2005: the 9th European Conference on Speech Communication and Technology, September 4-8, 2005, Lisbon, Portugal. The Takemaru-kun system is a practical speech-oriented guidance system developed to examine spoken interfaces through long-term operation in a public place, where it collected natural human-machine interaction data. In 2004, the following advances were introduced to improve the reliability of the system, and they contributed to an increase in user access: (1) rejection of unintended speech based on Gaussian Mixture Models (GMMs); (2) removal of short, unnecessary inputs of impulsive noise; (3) child or adult user discrimination; (4) Web-based monitoring mechanisms. This paper summarizes the Takemaru-kun system and an analysis of 177,789 inputs collected over two years of actual operation. Experiments with the collected data showed that a combination of GMM-based verification and short input removal can excise 85% of the invalid inputs, including laughter, incomprehensible utterances, and even some background utterances.
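
The combination of short input removal and GMM-based verification described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Takemaru-kun implementation: the abstract does not specify the features, model sizes, or thresholds, so the diagonal-covariance GMMs, the log-likelihood comparison between a "valid" and an "invalid" model, and the parameters `min_frames` and `margin` are all hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood; gmm is a list of (weight, mean, var)."""
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)
        # log-sum-exp over mixture components for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def accept_input(frames, valid_gmm, invalid_gmm, min_frames=30, margin=0.0):
    """Return True if the input should be passed on to recognition.

    Step 1 (short input removal): reject inputs with too few frames.
    Step 2 (GMM verification): accept only if the valid-speech model scores
    higher than the invalid-input model by more than `margin`.
    Both thresholds are hypothetical values chosen for this sketch.
    """
    if len(frames) < min_frames:
        return False
    score = gmm_loglik(frames, valid_gmm) - gmm_loglik(frames, invalid_gmm)
    return score > margin


# Toy one-dimensional models: "valid" centred at 0, "invalid" centred at 5.
valid_gmm = [(1.0, [0.0], [1.0])]
invalid_gmm = [(1.0, [5.0], [1.0])]

accepted = accept_input([[0.1]] * 40, valid_gmm, invalid_gmm)
rejected_noise = accept_input([[5.0]] * 40, valid_gmm, invalid_gmm)
rejected_short = accept_input([[0.1]] * 5, valid_gmm, invalid_gmm)
```

Checking the length first keeps impulsive noise from reaching the likelihood test at all, which is one plausible reason to combine the two steps.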

    Insights Gained from Development and Long-Term Operation of a Real-Environment Speech-Oriented Guidance System

    ICASSP2007: IEEE International Conference on Acoustics, Speech, and Signal Processing, April 15-20, 2007, Honolulu, Hawaii, USA. This paper presents insights gained from operating a public speech-oriented guidance system. A real-environment speech database (300 hours) collected with the system over four years is described and analyzed with regard to usage frequency, content, and diversity. Because the first two years of the data are completely transcribed, it is possible to simulate system development and to evaluate system performance over time. The database is employed for acoustic and language modeling as well as for the construction of a question and answer database. Since the system input is not text but speech, the database also enables research on open-domain speech-based information access. In addition, research on unsupervised acoustic modeling, language modeling, and system portability can be carried out. A performance evaluation of the system at an early stage and at a late stage, when two years of real-environment data are used to construct all system components, shows the relative importance of developing each system component. The system's response accuracy is 83% for adults and 68% for children.

    Development of Speech Input Method for Interactive VoiceWeb Systems

    HCI International 2009: 13th International Conference on Human-Computer Interaction, July 19-24, 2009, San Diego, CA, USA. We have developed a speech input method called “w3voice” for building practical and handy voice-enabled Web applications. It is constructed using a simple Java applet and CGI programs comprising free software. On our website (http://w3voice.jp/), we have released automatic speech recognition and spoken dialogue applications that are suitable for practical use. The voice-based interaction mechanism relies on transmitting the raw audio signal via the HTTP POST method and on the HTTP redirection response. The system also aims at organizing a voice database collected from home and office environments over the Internet. The purpose of the work is to observe actual human-machine and human-human voice interactions. We have succeeded in acquiring 8,412 inputs (47.9 inputs per day) captured with ordinary PCs over a period of seven months. The experiments confirmed the user-friendliness of our system in human-machine dialogues with trial users.
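
The POST-then-redirect pattern mentioned above can be sketched as follows. This is not the w3voice code, which uses a Java applet and CGI programs; Python's standard `http.server` is swapped in here purely for illustration. The `/result/<id>` path, the in-memory `RECORDINGS` store, the helper `store_audio`, and the choice of status 303 are assumptions, since the abstract does not say which redirect status or URL scheme the system uses.

```python
import uuid
from http.server import BaseHTTPRequestHandler

# Hypothetical in-memory store; a real system would write audio to disk
# and hand it to a recognizer.
RECORDINGS = {}


def store_audio(body: bytes) -> str:
    """Save the raw audio bytes and return the URL of its result page."""
    rec_id = uuid.uuid4().hex
    RECORDINGS[rec_id] = body
    return f"/result/{rec_id}"


class UploadHandler(BaseHTTPRequestHandler):
    """Accepts raw audio via POST and answers with an HTTP redirect."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        location = store_audio(body)
        # 303 See Other tells the client to fetch the result page with GET.
        self.send_response(303)
        self.send_header("Location", location)
        self.end_headers()


# The storage step can be exercised without starting a server.
location = store_audio(b"\x00\x01\x02\x03")
```

To try the handler, it could be served with `http.server.HTTPServer(("localhost", 8000), UploadHandler).serve_forever()`; the redirect lets the browser display the result page without any special client-side logic.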

    Speech-to-text input method for Web system using Javascript

    SLT2008: IEEE Workshop on Spoken Language Technology, December 15-19, 2008, Goa, India. We have developed a speech-to-text input method for web systems. The system is provided as a JavaScript library that includes an Ajax-like mechanism based on a Java applet, CGI programs, and dynamic HTML documents. It allows users to access voice-enabled web pages without requiring special browsers. Web developers can embed it in their web page by inserting only one line in the header field of an HTML document. This study also aims at observing natural spoken interactions in personal environments. We have succeeded in collecting 4,003 inputs during a period of seven months via our public Japanese ASR server. To cover out-of-vocabulary words such as proper nouns, a web page for registering new words in the language model was developed. As a result, we obtained an improvement of 0.8% in recognition accuracy. With regard to the acoustical conditions, an SNR of 25.3 dB was observed.
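
As a reminder of what the reported SNR figure expresses, here is a minimal sketch of the usual power-ratio definition, 10·log10(P_signal / P_noise). The abstract does not say how the speech and noise segments were obtained or how the 25.3 dB value was measured, so the function `snr_db` and its two pre-segmented inputs are assumptions for illustration.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from mean sample power of each segment.

    `signal` and `noise` are sequences of samples that are assumed to have
    been separated beforehand (for example, speech frames versus leading
    silence); that segmentation is outside the scope of this sketch.
    """
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(signal_power / noise_power)


# Amplitude ratio of 10 corresponds to a power ratio of 100, i.e. 20 dB.
example = snr_db([1.0, -1.0] * 50, [0.1, -0.1] * 50)
```

On this scale, 25.3 dB corresponds to the signal power being roughly 340 times the noise power.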