What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis

Author: Asami Taichi
Ashihara Takanori
Delcroix Marc
Ijima Yusuke
Matsuura Kohei
Moriya Takafumi
Publication date: 31/01/2024

Self-supervised learning (SSL) has attracted increased attention for learning meaningful speech representations. Speech SSL models, such as WavLM, employ masked prediction training to encode general-purpose representations. In contrast, speaker SSL models, exemplified by DINO-based models, adopt utterance-level training objectives primarily for speaker representation. Understanding how these models represent information is essential for refining model efficiency and effectiveness. Unlike the various analyses of speech SSL, there has been limited investigation into what information speaker SSL captures and how its representation differs from speech SSL or other fully-supervised speaker models. This paper addresses these fundamental questions. We explore the capacity to capture various speech properties by applying SUPERB evaluation probing tasks to speech and speaker SSL models. We also examine which layers are predominantly utilized for each task to identify differences in how speech is represented. Furthermore, we conduct direct comparisons to measure the similarities between layers within and across models. Our analysis unveils that 1) the capacity to represent content information is somewhat unrelated to enhanced speaker representation, 2) specific layers of speech SSL models would be partly specialized in capturing linguistic information, and 3) speaker SSL models tend to disregard linguistic information but exhibit more sophisticated speaker representation. Comment: Accepted at ICASSP 202

arXiv.org e-Print Archive

Hierarchical Latent Words Language Models for Robust Modeling to Out-Of Domain Tasks

Author: Akinori Ito
Hirokazu Masataki
Ryo Masumura
Sumitaka Sakauchi
Taichi Asami
Takanobu Oba
Publication date: 24/04/2020

This paper focuses on language modeling with adequate robustness to support different domain tasks. To this end, we propose a hierarchical latent word language model (h-LWLM). The proposed model can be regarded as a generalized form of the standard LWLMs. The key advance is introducing a multiple latent variable space with hierarchical structure. The structure can flexibly take account of linguistic phenomena not present in the training data. This paper details the definition as well as a training method based on layer-wise inference and a practical usage in natural language processing tasks with an approximation technique. Experiments on speech recognition show the effectiveness of h-LWLM in out-of-domain tasks.

CiteSeerX

CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice

Author: Fujihara Yoshitaka
Ikawa Masahito
Isotani Ayako
Kim Yeon Joo
Matsumura Takafumi
Miyata Haruhiko
Muto Masanaga
Noda Taichi
Nozawa Kaori
Oji Asami
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/08/2016

Oji, A., Noda, T., Fujihara, Y. et al. CRISPR/Cas9 mediated genome editing in ES cells and its application for chimeric analysis in mice. Sci Rep 6, 31666 (2016). https://doi.org/10.1038/srep3166

Osaka University Knowledge Archive

Spermatozoa lacking Fertilization Influencing Membrane Protein (FIMP) fail to fuse with oocytes in mice

Author: Fujihara Yoshitaka
Ikawa Masahito
Kojima-Kita Kanako
Larasati Tamara
Lu Yonggang
Matzuk Martin M.
Matzuk Ryan M.
Noda Taichi
Oji Asami
Yu Zhifeng
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 28/04/2020

Fujihara, Y., Lu, Y., Noda, T., Oji, A., Larasati, T., Kojima-Kita, K., . . . Ikawa, M. (2020). Spermatozoa lacking fertilization influencing membrane protein (FIMP) fail to fuse with oocytes in mice. Proceedings of the National Academy of Sciences of the United States of America, 117(17), 9393-9400. doi:10.1073/pnas.191706011

Osaka University Knowledge Archive

Identification of multiple male reproductive tract-specific proteins that regulate sperm migration through the oviduct in mice

Author: Fujihara Yoshitaka
Ikawa Masahito
Kobayashi Kiyonori
Kobayashi Sumire
Kojima-Kita Kanako
Larasati Tamara
Matsumura Takafumi
Matzuk Martin M.
Noda Taichi
Oji Asami
Oura Seiya
Yu Zhifeng
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 10/09/2019

Fujihara, Y., Noda, T., Kobayashi, K., Oji, A., Kobayashi, S., Matsumura, T., . . . Ikawa, M. (2019). Identification of multiple male reproductive tract-specific proteins that regulate sperm migration through the oviduct in mice. Proceedings of the National Academy of Sciences of the United States of America, 116(37), 18498-18506. doi:10.1073/pnas.190873611

Osaka University Knowledge Archive

A Study on Reliability Improvement of Speech Recognizer's Outputs for Spoken Document Processing

Author: Asami Taichi
浅見太一
Publication date: 20/05/2019

Institutional Repositories DataBase (IRDB)

Noise-Robust Speaker Verification Using F0 Features

Author: Koji Iwano
Sadaoki Furui
Taichi Asami

This paper proposes a noise-robust speaker verification method augmented by fundamental frequency (F0). The paper first describes a noise-robust F0 extraction method using the Hough transform. Then, it proposes a robust speaker verification method using multi-stream HMMs which fuse the extracted F0 and cepstral features. Experiments are conducted using four-connected-digit utterances of Japanese by 37 male speakers recorded at five sessions over a half-year period. The utterances are contaminated with white noise at various SNR levels. Experimental results show that the F0 features improve the verification performance in all SNR conditions.

CiteSeerX

An Improved Approximation Algorithm for Wage Determination and Online Task Allocation in Crowd-Sourcing

Author: Akagi Yasunori
Asami Taichi
Hikima Yuya
Kim Hideaki
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 26/06/2023

Crowd-sourcing has attracted much attention due to its growing importance to society, and numerous studies have been conducted on task allocation and wage determination. Recent works have focused on optimizing task allocation and workers' wages simultaneously. However, existing methods do not provide good solutions for real-world crowd-sourcing platforms due to the low approximation ratio or myopic problem settings. We tackle an optimization problem for wage determination and online task allocation in crowd-sourcing and propose a fast 1-1/(k+3)^(1/2)-approximation algorithm, where k is the minimum of tasks' budgets (numbers of possible assignments). This approximation ratio is greater than or equal to that of the existing method. The proposed method reduces the tackled problem to a non-convex multi-period continuous optimization problem by approximating the objective function. Then, the method transforms the reduced problem into a minimum convex cost flow problem, which is a well-known combinatorial optimization problem, and solves it by the capacity scaling algorithm. Synthetic experiments and simulation experiments using real crowd-sourcing data show that the proposed method solves the problem faster and outputs higher objective values than existing methods.
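
To give a feel for the approximation ratio quoted in the abstract above, the short sketch below evaluates 1-1/(k+3)^(1/2) for a few values of k. It only computes the stated formula and is not the authors' algorithm; the function name `approximation_ratio` and the check that k is at least 1 are assumptions made for illustration.

```python
import math


def approximation_ratio(k: int) -> float:
    """Evaluate 1 - 1/sqrt(k + 3), the ratio stated in the abstract.

    `k` stands for the minimum of the tasks' budgets. Requiring k >= 1
    is an assumption of this sketch, not something the abstract states.
    """
    if k < 1:
        raise ValueError("this sketch assumes k is a positive integer")
    return 1.0 - 1.0 / math.sqrt(k + 3)


if __name__ == "__main__":
    # Print the guaranteed fraction of the optimum for a few budgets.
    for k in (1, 6, 13, 97):
        print(f"k={k:3d}  ratio={approximation_ratio(k):.4f}")
```

Under this formula the guarantee is 0.5 when k = 1 and moves toward 1 as the minimum budget grows.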

Association for the Advancement of Artificial Intelligence: AAAI Publications

Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition

Author: Hirokazu Masataki
Ryo Masumura
Sumitaka Sakauchi
Taichi Asami
Takanobu Oba
Publication venue: 'Information Processing Society of Japan'
Publication date: 01/01/2019

Crossref

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition

Author: Akinori ITO
Ryo MASUMURA
Sumitaka SAKAUCHI
Taichi ASAMI
Takanobu OBA
Publication venue: 'Institute of Electronics, Information and Communications Engineers (IEICE)'

Crossref