20 research outputs found

Humanoid with Interaction Ability Using Vision and Speech Information

Author: Junichi Ido
Ryuichi Nisimura
Tsukasa Ogasawara
Yoshio Matsumoto
Publication venue: 'IntechOpen'
Publication date: 01/11/2008

IntechOpen

Crossref

An interference-free representation of group delay for periodic signals

Author: Hideki Kawahara
Masanori Morise
Ryuichi Nisimura
Toshio Irino
Publication date: 24/04/2020

This article introduces a new group delay representation for periodic signals. The proposed method yields a group delay representation that is free from interference caused by repetitive excitation. The power spectrum-weighted average group delay, computed using shifted copies of the weighted group delay separated by half the fundamental frequency, is proven to have the desired property.

CiteSeerX

Public Speech-Oriented Guidance System with Adult and Child Discrimination Capability

Author: Akinobu Lee
Hiroshi Saruwatari
Kiyohiro Shikano
Ryuichi Nisimura
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/03/2023

Institutional Repositories DataBase (IRDB)

Insights Gained from Development and Long-Term Operation of a Real-Environment Speech-Oriented Guidance System

Author: Cincarek Tobias
Lee Akinobu
Nisimura Ryuichi
Shikano Kiyohiro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/03/2023

Institutional Repositories DataBase (IRDB)

Operating a Public Spoken Guidance System in Real Environment

Author: Akinobu Lee
Kiyohiro Shikano
Masashi Yamada
Ryuichi Nisimura
Publication date: 01/09/2005

INTERSPEECH2005: the 9th European Conference on Speech Communication and Technology, September 4-8, 2005, Lisbon, Portugal. The Takemaru-kun system is a practical speech-oriented guidance system developed to examine spoken interfaces through long-term operation in a public place, where it collected natural human-machine interaction data. In 2004, the following advances were introduced to improve the reliability of the system, which led to an increase in user access: (1) rejection of unintended speech based on Gaussian Mixture Models (GMMs); (2) removal of short, unnecessary inputs of impulsive noise; (3) child or adult user discrimination; (4) web-based monitoring mechanisms. This paper summarizes the Takemaru-kun system and an analysis of 177,789 data samples collected over two years of actual operation. Experiments with the collected data showed that a combination of GMM-based verification and short input removal can reject 85% of the invalid inputs, including laughter, incomprehensible utterances, and even some background utterances.

NAIST Academic Repository

Insights Gained from Development and Long-Term Operation of a Real-Environment Speech-Oriented Guidance System

Author: Cincarek Tobias
Lee Akinobu
Nisimura Ryuichi
Shikano Kiyohiro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2007

ICASSP2007: IEEE International Conference on Acoustics, Speech, and Signal Processing, April 15-20, 2007, Honolulu, Hawaii, USA. This paper presents insights gained from operating a public speech-oriented guidance system. A real-environment speech database (300 hours) collected with the system over four years is described and analyzed with regard to usage frequency, content, and diversity. Because the first two years of the data are completely transcribed, it is possible to simulate system development and evaluate system performance over time. The database is employed for acoustic and language modeling as well as for constructing a question and answer database. Since the system input is not text but speech, the database also enables research on open-domain speech-based information access. Apart from that, research on unsupervised acoustic modeling, language modeling, and system portability can be carried out. A performance evaluation of the system at an early stage as well as at a late stage, when two years of real-environment data are used to construct all system components, shows the relative importance of developing each system component. The system's response accuracy is 83% for adults and 68% for children.

NAIST Academic Repository

Controlling linguistic information and filtered sound identity for a new cross-synthesis vocoder

Author: Hideki Kawahara
Ryuichi Nisimura
Taiki Nishi
Toshio Irino
Publication venue: 'Acoustical Society of Japan'
Publication date: 01/01/2013

Crossref

Development of Speech Input Method for Interactive VoiceWeb Systems

Author: Hideki Kawahara
Jumpei Miyake
Ryuichi Nisimura
Toshio Irino
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2009

HCI International 2009: 13th International Conference on Human-Computer Interaction, July 19-24, 2009, San Diego, CA, USA. We have developed a speech input method called “w3voice” to build practical and handy voice-enabled Web applications. It is constructed using a simple Java applet and CGI programs comprising free software. On our website (http://w3voice.jp/), we have released automatic speech recognition and spoken dialogue applications that are suitable for practical use. The voice-based interaction mechanism is built on raw audio signal transmission via the HTTP POST method and the HTTP redirection response. The system also aims at organizing a voice database collected from home and office environments over the Internet. The purpose of the work is to observe actual human-machine and human-human voice interactions. We have succeeded in acquiring 8,412 inputs (47.9 inputs per day) captured using normal PCs over a period of seven months. The experiments confirmed the user-friendliness of our system in human-machine dialogues with trial users.

NAIST Academic Repository

Speech-to-text input method for Web system using Javascript

Author: Hideki Kawahara
Jumpei Miyake
Ryuichi Nisimura
Toshio Irino
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2008

SLT2008: IEEE Workshop on Spoken Language Technology, December 15-19, 2008, Goa, India. We have developed a speech-to-text input method for web systems. The system is provided as a JavaScript library including an Ajax-like mechanism based on a Java applet, CGI programs, and dynamic HTML documents. It allows users to access voice-enabled web pages without requiring special browsers. Web developers can embed it in their web pages by inserting only one line in the header field of an HTML document. This study also aims at observing natural spoken interactions in personal environments. We have succeeded in collecting 4,003 inputs over a period of seven months via our public Japanese ASR server. To cover out-of-vocabulary words such as proper nouns, a web page for registering new words in the language model was developed. As a result, we obtained an improvement of 0.8% in recognition accuracy. With regard to the acoustical conditions, an SNR of 25.3 dB was observed.

NAIST Academic Repository

Crossref

Operating a Public Spoken Guidance System in Real Environment

Author: Akinobu Lee
Kiyohiro Shikano
Masashi Yamada
Ryuichi Nisimura
Publication date: 02/03/2023

Institutional Repositories DataBase (IRDB)