35,107 research outputs found
An Empirical Evaluation of Zero Resource Acoustic Unit Discovery
Acoustic unit discovery (AUD) is a process of automatically identifying a
categorical acoustic unit inventory from speech and producing corresponding
acoustic unit tokenizations. AUD provides an important avenue for unsupervised
acoustic model training in a zero resource setting where expert-provided
linguistic knowledge and transcribed speech are unavailable. Therefore, to
further facilitate zero-resource AUD process, in this paper, we demonstrate
acoustic feature representations can be significantly improved by (i)
performing linear discriminant analysis (LDA) in an unsupervised self-trained
fashion, and (ii) leveraging resources of other languages through building a
multilingual bottleneck (BN) feature extractor to give effective cross-lingual
generalization. Moreover, we perform comprehensive evaluations of AUD efficacy
on multiple downstream speech applications, and their correlated performance
suggests that AUD evaluations are feasible using different alternative language
resources when only a subset of these evaluation resources can be available in
typical zero resource applications.Comment: 5 pages, 1 figure; Accepted for publication at ICASSP 201
A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition
We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.5 page(s
Unsupervised Spoken Term Detection with Spoken Queries by Multi-level Acoustic Patterns with Varying Model Granularity
This paper presents a new approach for unsupervised Spoken Term Detection
with spoken queries using multiple sets of acoustic patterns automatically
discovered from the target corpus. The different pattern HMM
configurations(number of states per model, number of distinct models, number of
Gaussians per state)form a three-dimensional model granularity space. Different
sets of acoustic patterns automatically discovered on different points properly
distributed over this three-dimensional space are complementary to one another,
thus can jointly capture the characteristics of the spoken terms. By
representing the spoken content and spoken query as sequences of acoustic
patterns, a series of approaches for matching the pattern index sequences while
considering the signal variations are developed. In this way, not only the
on-line computation load can be reduced, but the signal distributions caused by
different speakers and acoustic conditions can be reasonably taken care of. The
results indicate that this approach significantly outperformed the unsupervised
feature-based DTW baseline by 16.16\% in mean average precision on the TIMIT
corpus.Comment: Accepted by ICASSP 201
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop
We summarize the accomplishments of a multi-disciplinary workshop exploring
the computational and scientific issues surrounding the discovery of linguistic
units (subwords and words) in a language without orthography. We study the
replacement of orthographic transcriptions by images and/or translated text in
a well-resourced language to help unsupervised discovery from raw speech.Comment: Accepted to ICASSP 201
- …