Search CORE

52 research outputs found

Applying DNN Adaptation to Reduce the Session Dependency of Ultrasound Tongue Imaging-based Silent Speech Interfaces

Author: Csapó Tamás Gábor
Gosztolya Gábor
Grósz Tamás
Markó Alexandra
Tóth László
Publication venue: 'Obuda University'
Publication date: 01/01/2020
Field of study

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

GMM-free training of deep neural network speech recognizers

Author: Gosztolya Gábor
Grósz Tamás
Tóth László
Publication venue
Publication date: 01/01/2017
Field of study

In the past few years, deep neural networks (DNNs) have replaced the Gaussian mixture model (GMM) component of the hidden Markov models (HMMs) used in speech recognizers. However, these new hybrid HMM/DNN recognizers have inherited a number of algorithms that were originally developed for GMM-based systems, so their optimality in the new setting is not guaranteed. "GMM-free" training of HMM/DNN models requires new solutions to two subtasks. The first is that deep networks need time-aligned training labels; the second is the creation of context-dependent states, for which the classical solution is a GMM-based clustering algorithm. Although sequence-discriminative training algorithms that operate on whole utterances exist for HMM/DNN hybrids, they are typically applied only in the final phase of training, to fine-tune the models, while the start of training relies on labels generated and aligned with HMM/GMM models. In this paper, however, we show that with sufficient care sequence training algorithms can be used from the very beginning of training. We have previously proposed a GMM-free solution for the state clustering step, so solving the label alignment task gives us a fully GMM-free training scheme. Our experimental results show that the proposed solution is not only faster than the conventional training method but also yields slightly better recognition accuracy.

University of Szeged

Experimental comparison of deep neural network state-tying algorithms for speech recognizers

Author: Gosztolya Gábor
Grósz Tamás
Tóth László
Publication venue: Szegedi Tudományegyetem Informatikai Tanszékcsoport
Publication date: 01/01/2018
Field of study

University of Szeged

Repository of the Academy's Library

GMM-free training of deep neural network speech recognizers

Author: Gosztolya Gábor
Grósz Tamás
Tóth László
Publication venue: Szegedi Tudományegyetem Informatikai Tanszékcsoport
Publication date: 01/01/2017
Field of study

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

University of Szeged

Repository of the Academy's Library

General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical & Self-Assessed Affect and Heart Beats

Author: Gosztolya Gábor
Grósz Tamás
Tóth László
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2018
Field of study

Crossref

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Repository of the Academy's Library

Investigation of the effects of biological water quality parameters on the Holt-Körös at Szarvas

Author: Grósz János
Halupka Gábor Ernő
Waltner István
Publication venue
Publication date: 01/01/2023
Field of study

Water is one of Hungary's most important natural resources, and it faces many threats to both its quantity and its quality. Under the Water Framework Directive, European Union member states were required to bring all surface water and groundwater into good status by the end of 2015 wherever this was possible, and thereafter to maintain that status and prevent deterioration. For these reasons, the protection of surface water and groundwater is a priority. The main objective of the present study is to determine which parameters have the greatest influence on the horizontal and vertical distribution patterns of phytoplankton in the water body. The study area is the Holt-Körös at Szarvas, which plays an important ecological, social, economic and recreational role in the region. In the field and laboratory tests, the following physical, chemical and biological water quality parameters were investigated: water temperature, UV radiation index, Secchi depth, underwater light conditions, dissolved oxygen content, suspended solids content, chlorophyll-a content, Fe, NO₂⁻, NO₃⁻, NH₄⁺, PO₄³⁻, Na⁺, K⁺, Mg²⁺, and pH.

Repository of the Academy's Library

Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder

Author: Al-Radhi Mohammed Salah
Csapó Tamás Gábor
Gosztolya Gábor
Grósz Tamás
Markó Alexandra
Németh Géza
Tóth László
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2019
Field of study

Recently it was shown that within the Silent Speech Interface (SSI) field, the prediction of F0 is possible from Ultrasound Tongue Images (UTI) as the articulatory input, using Deep Neural Networks for articulatory-to-acoustic mapping. Moreover, text-to-speech synthesizers were shown to produce higher quality speech when using a continuous pitch estimate, which takes non-zero pitch values even when voicing is not present. Therefore, in this paper on UTI-based SSI, we use a simple continuous F0 tracker which does not apply a strict voiced / unvoiced decision. Continuous vocoder parameters (ContF0, Maximum Voiced Frequency and Mel-Generalized Cepstrum) are predicted using a convolutional neural network, with UTI as input. The results demonstrate that during the articulatory-to-acoustic mapping experiments, the continuous F0 is predicted with lower error, and the continuous vocoder produces slightly more natural synthesized speech than the baseline vocoder using standard discontinuous F0. Comment: 5 pages, 3 figures, accepted for publication at Interspeech 201

arXiv.org e-Print Archive

Crossref

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

Repository of the Academy's Library

A new approach to the determination of the uncertainty in neutron diffraction experiments with isotopic substitution method

Author: Bakó Imre
Bálint Szabolcs
Grósz Tamás
Pálinkás Gábor
Radnai Tamás
Tóth Gergely
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

Neutron diffraction experiments with isotopically substituted substances are a powerful approach claimed to yield unambiguous information about the local atomic structure in disordered materials. This information is expressed in the partial structure factors, and extracting them from a series of measurements requires solving a set of linear equations that is affected by experimental errors. In this article, we suggest a method for the determination of the optimal set of H/D compositions with or without taking into account the experimental error. For the case of water, our investigations show that the selection of the isotope concentrations and the distribution of measurement time among the various samples play a critical role if one wants to utilize the limited neutron beam time efficiently. It is well known that measurements of pure H₂O introduce fairly large errors in the partial structure factors due to its very strong incoherent scattering. Using water and methanol as examples, we investigated the propagation of random errors to the partial structure factors using partial pair-correlation functions from molecular dynamics simulations. It is shown on the example of water that it is not worthwhile measuring pure H₂O

Crossref

Repository of the Academy's Library