Spontaneous speech resources in Japan

Abstract

National Institute for Japanese Language and LinguisticsNational Institute of InformaticsLREC 2018 Special Speech Sessions "Speech Resources Collection in Real-World Situations"; Phoenix Seagaia Conference Center, Miyazaki; 2018-05-09In this paper, we introduce representative corpora of spontaneous speech, which have been provided publically in Japan. A large amount of spontaneous speech data is required for research on various themes in speech studies such as speech analysis, speech recognition systems, and natural language processing in recent years. However, it is difficult to collect spontaneous speech data, and few corpora of spontaneous speech are available. Considering the diversity of speech in real-world situations, the data remain insufficient. We show the characteristics of spontaneous Japanese speech corpora gathered and distributed by two organizations: the Speech Resources Consortium at the National Institute of Informatics, and the National Institute for Japanese Language and Linguistics. Then, we describe prospects for the development of spontaneous speech resources

    Similar works