3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement

Author: Chen Qian
Chen Yafeng
Cheng Luyao
Wang Hui
Zheng Siqi
Publication date: 27/06/2023

Disentangling uncorrelated information in speech utterances is a crucial research topic within the speech community. Different speech-related tasks focus on extracting distinct speech representations while minimizing the effects of other uncorrelated information. We present a large-scale speech corpus to facilitate research on speech representation disentanglement. 3D-Speaker contains over 10,000 speakers, each of whom is simultaneously recorded by multiple Devices located at different Distances, and some of whom speak multiple Dialects. The controlled combinations of multi-dimensional audio data yield a matrix of diverse speech representation entanglements, thereby motivating intriguing methods to untangle them. The multi-domain nature of 3D-Speaker also makes it a suitable resource for evaluating large universal speech models and for experimenting with methods of out-of-domain learning and self-supervised learning. https://3dspeaker.github.io

arXiv.org e-Print Archive

Correlating cepstra with formant frequencies: implications for phonetically-informed forensic voice comparison

Author: Clermont Frantz
Harrison Philip
Hughes Vincent
Publication venue: International Speech Communication Association
Publication date: 23/10/2020

Crossref

White Rose Research Online

Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech

Author: Carr Ewan
Cummins Nicholas
Dineley Judith
Dobson Richard
Downs Johnny
Matcham Faith
Quatieri Thomas F.
Publication date: 15/08/2023

King's Research Portal