1 research outputs found
Double Articulation Analyzer with Prosody for Unsupervised Word and Phoneme Discovery
Infants acquire words and phonemes from unsegmented speech signals using
segmentation cues, such as distributional, prosodic, and co-occurrence cues.
Many pre-existing computational models that represent the process tend to focus
on distributional or prosodic cues. This paper proposes a nonparametric
Bayesian probabilistic generative model called the prosodic hierarchical
Dirichlet process-hidden language model (Prosodic HDP-HLM). Prosodic HDP-HLM,
an extension of HDP-HLM, considers both prosodic and distributional cues within
a single integrative generative model. We conducted three experiments on
different types of datasets, and demonstrate the validity of the proposed
method. The results show that the Prosodic DAA successfully uses prosodic cues
and outperforms a method that solely uses distributional cues. The main
contributions of this study are as follows: 1) We develop a probabilistic
generative model for time series data including prosody that potentially has a
double articulation structure; 2) We propose the Prosodic DAA by deriving the
inference procedure for Prosodic HDP-HLM and show that Prosodic DAA can
discover words directly from continuous human speech signals using statistical
information and prosodic information in an unsupervised manner; 3) We show that
prosodic cues contribute to word segmentation more in naturally distributed
case words, i.e., they follow Zipf's law.Comment: 11 pages, Submitted to IEEE Transactions on Cognitive and
Developmental System