Bayesian Models for Unit Discovery on a Very Low Resource Language
Developing speech technologies for low-resource languages has become a very
active research field over the last decade. Among others, Bayesian models have
shown some promising results on artificial examples but still lack in situ
experiments. Our work applies state-of-the-art Bayesian models to unsupervised
Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also
show that Bayesian models can naturally integrate information from other
resourceful languages by means of an informative prior, leading to more consistent
discovered units. Finally, discovered acoustic units are used, either as the
1-best sequence or as a lattice, to perform word segmentation. Word
segmentation results show that this Bayesian approach clearly outperforms a
Segmental-DTW baseline on the same corpus. Comment: Accepted to ICASSP 201
Bayesian models for unit discovery on a very low resource language
Accepted to ICASSP 2018. Developing speech technologies for low-resource languages has become a very active research field over the last decade. Among others, Bayesian models have shown some promising results on artificial examples but still lack in situ experiments. Our work applies state-of-the-art Bayesian models to unsupervised Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also show that Bayesian models can naturally integrate information from other resourceful languages by means of an informative prior, leading to more consistent discovered units. Finally, discovered acoustic units are used, either as the 1-best sequence or as a lattice, to perform word segmentation. Word segmentation results show that this Bayesian approach clearly outperforms a Segmental-DTW baseline on the same corpus.
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop
We summarize the accomplishments of a multi-disciplinary workshop exploring
the computational and scientific issues surrounding the discovery of linguistic
units (subwords and words) in a language without orthography. We study the
replacement of orthographic transcriptions by images and/or translated text in
a well-resourced language to help unsupervised discovery from raw speech. Comment: Accepted to ICASSP 201