The EASR Corpora of European Portuguese, French, Hungarian and Polish elderly speech
Currently available speech recognisers do not usually work well with elderly speech. This is because several characteristics of speech
(e.g. fundamental frequency, jitter, shimmer and harmonics-to-noise ratio) change with age and because the acoustic models used by speech
recognisers are typically trained with speech collected from younger adults only. To develop speech-driven applications capable of
successfully recognising elderly speech, this type of speech data is needed for training acoustic models from scratch or for adapting
acoustic models trained with younger adults’ speech. However, the availability of suitable elderly speech corpora is still very limited.
This paper describes an ongoing project to design, collect, transcribe and annotate large elderly speech corpora for four European languages: Portuguese, French, Hungarian and Polish. The Portuguese, French and Polish corpora contain read speech only, whereas the
Hungarian corpus also contains spontaneous command-and-control speech. Depending on the language in question, the corpora
contain 76 to 205 hours of speech collected from 328 to 986 speakers aged 60 and over. The final corpora will come with manually
verified orthographic transcriptions, as well as annotations for filled pauses, noises and damaged words.