22 research outputs found

Development of a large spontaneous speech database of agglutinative Hungarian language

Author: Beke András
Gráczi Tekla Etelka
Gyarmathy Dorottya
Gósy Mária
Horváth Viktória
Neuberger Tilda
Publication date: 01/01/2014

Repository of the Academy's Library

The realization of silent pauses with regard to age in conversations and narratives

Author: Gyarmathy Dorottya
Huszár Anna
Publication date: 08/12/2019

Repository of the Academy's Library

Pausing strategies with regard to speech style

Author: Gyarmathy Dorottya
Horváth Viktória
Publication venue: Eotvos Lorand University (ELTE)
Publication date: 01/01/2019

Speech is occasionally interrupted by silent and filled pauses of various lengths. Pauses serve many different functions in spontaneous speech (e.g. breathing, marking syntactic boundaries, signaling speech planning difficulties, and providing time for self-repair). The study first analyzed the interrelation between the temporal pattern and the syntactic position of silent pauses (SPs). It then analyzed filled pauses (FPs) according to their phonetic realization, as well as the combinations of SPs and FPs. The effect of speech style on pausing strategies was also analyzed. A narrative recording and a conversational recording from 10 speakers (aged between 20 and 35 years; 5 male, 5 female) were selected from the Hungarian Spontaneous Speech Database for the study. The material was manually annotated, silent pauses were categorized, and the durations of the pauses were then extracted. Results showed that the position of silent and filled pauses affects their duration. Speech style did not influence the frequency of pauses; however, silent and filled pauses were longer in narratives than in conversations. The results suggest that pausing strategies are similar in general, but the timing patterns of pauses may depend on various factors, e.g. speech style.

Crossref

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

GABI – Hungarian Child Language and Speech Database and Information Repository

Author: Bóna Judit
Vakula Tímea
Váradi Viola
Publication date: 01/01/2019

ELTE Digital Institutional Repository (EDIT)

Dynamic changes of pausing in triadic conversations

Author: Gyarmathy Dorottya
Horváth Viktória
Huszár Anna
Krepsz Valéria
Publication date: 01/01/2021

Repository of the Academy's Library

Forensics-based speaker profiling (Kriminalisztikai alapú beszélőiprofil-alkotás)

Author: Beke András
Publication date: 01/01/2018

From a speaker's voice we can not only recognize known persons but also build a profile of unknown persons, that is, estimate general information such as sex, age, body build, or the speaker's mood. Previous research has confirmed a strong relationship between vocal tract length and the speaker's physical characteristics, such as age, sex, and body height. Based on this relationship, we assume that the acoustic features of human speech encode cues to the speaker's physical build. In the present study we examine the validity of this relationship using learning algorithms. We analyze how successfully the speaker's sex, age, body weight, and body mass can be estimated automatically from speech. To estimate the physical properties we use acoustic features extracted from speech: prosodic, voice-quality-based, and spectral features. The results show that the estimation of sex, body mass, and body weight is highly accurate, whereas the estimation of age is less so.

University of Szeged

Effects of language mismatch in automatic forensic voice comparison using deep learning embeddings

Author: Fejes Attila
Sztahó Dávid
Publication venue: Wiley
Publication date: 26/09/2022

In forensic voice comparison, speaker embeddings have become widely popular in the last 10 years. Most pretrained speaker embeddings are trained on English corpora, because these are easily accessible. Thus, language dependency can be an important factor in automatic forensic voice comparison, especially when the target language is linguistically very different. Numerous commercial systems are available, but their models are mainly trained on a language (mostly English) other than the target language. For a low-resource language, developing a corpus for forensic purposes that contains enough speakers to train deep learning models is costly. This study investigates whether a model pre-trained on an English corpus can be used on a low-resource target language (here, Hungarian) that differs from the language the model was trained on. In addition, multiple samples from the offender (unknown speaker) are often not available. Therefore, samples are compared pairwise, with and without speaker enrollment for suspect (known) speakers. Two corpora developed especially for forensic purposes are used, along with a third intended for traditional speaker verification. Two deep-learning-based methods for extracting speaker embedding vectors are used: the x-vector and ECAPA-TDNN. Speaker verification was evaluated in the likelihood-ratio (LR) framework. A comparison is made between the language combinations (modeling, LR calibration, evaluation). The results were evaluated with the minCllr and EER metrics. It was found that a model pre-trained on a different language, but on a corpus with a very large number of speakers, performs well on samples with language mismatch. The effects of sample duration and speaking style were also examined. It was found that the longer the sample in question, the better the performance. Speaking style made no real difference.

arXiv.org e-Print Archive

Characteristics of the spontaneous speech of young schoolchildren (Kisiskolás gyermekek spontán beszédének jellemzői)

Author: Horváth Viktória
Publication venue: Magyar Tudományos Akadémia
Publication date: 01/01/2016

Repository of the Academy's Library

F0 characteristics in reading aloud and in spontaneous speech (Az f0-jellemzők felolvasásban és spontán beszédben)

Author: Gráczi Tekla Etelka
Huszár Anna
Krepsz Valéria
Markó Alexandra
Száraz Bettina
Publication venue: Pannon Egyetem
Publication date: 01/01/2019

A large number of studies have investigated the differences in f0 characteristics between reading aloud (RA) and spontaneous speech (SpS) in various languages. Their basic assumption was that the different production strategies lead to differences in prosodic features; however, their results were not consistent as to which speech style was realized with a higher mean f0 or a larger f0 range. Hungarian data have been analyzed only on small numbers of speakers. Therefore, our goals are: (i) to provide a comparison of the f0 characteristics of RA and SpS in Hungarian based on a large sample (82 subjects), and (ii) to analyze the individual differences behind the general tendencies. Mean f0 and the pitch range of the interpausal units were higher in RA, while the f0 range was larger in SpS. Interspeaker differences played an important role in the mean f0 results: no difference characteristic of speech style was found in women, while such a difference was apparent in men.

Repository of the Academy's Library