6 research outputs found

    Voiceless Stop Consonant Modelling and Synthesis Framework Based on MISO Dynamic System

    No full text
    A modelling and synthesis framework for voiceless stop consonant phonemes is proposed, in which the phoneme is modelled separately in the low-frequency and high-frequency ranges. The phoneme signal is decomposed into a sum of simpler basic components and described as the output of a linear multiple-input single-output (MISO) system, in which the impulse response of each channel is a third-order quasi-polynomial. Within this framework, the boundary between the two frequency ranges is determined; a new three-step algorithm for finding this boundary point is given in the paper. The input of the low-frequency component is equal to one, so the impulse response generates the whole component, while the high-frequency component appears when the system is excited by semi-periodic impulses. The filter impulse response of the high-frequency model spans a single period and decays after three periods. Application of the proposed modelling framework to voiceless stop consonant phonemes has shown that the quality of the model is sufficiently good.
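The two-band structure described above can be illustrated with a minimal sketch. The abstract does not give the model's coefficients, excitation rate, or frequencies, so every numeric value below (polynomial coefficients, decay rates, carrier frequencies, the 250 Hz impulse spacing) is an assumed placeholder; only the overall structure — quasi-polynomial channel responses, a single-impulse low-frequency component, and an impulse-train-driven high-frequency component — follows the abstract.

```python
import numpy as np

def quasi_poly_response(t, coeffs, decay, freq, phase=0.0):
    """Third-order quasi-polynomial impulse response:
    (c0 + c1*t + c2*t^2 + c3*t^3) * exp(-decay*t) * cos(2*pi*freq*t + phase)."""
    poly = sum(c * t**k for k, c in enumerate(coeffs))
    return poly * np.exp(-decay * t) * np.cos(2 * np.pi * freq * t + phase)

def synthesize_phoneme(duration=0.05, fs=44100):
    t = np.arange(int(duration * fs)) / fs
    # Low-frequency component: the input is a single unit impulse, so the
    # channel's impulse response generates the whole component by itself.
    low = quasi_poly_response(t, (0.0, 40.0, -300.0, 500.0),
                              decay=90.0, freq=300.0)
    # High-frequency component: the channel is excited by a semi-periodic
    # impulse train; the fast decay makes each response die out within a
    # few periods of the carrier, as the framework requires.
    h = quasi_poly_response(t, (1.0, 20.0, 0.0, 0.0),
                            decay=900.0, freq=3000.0)
    high = np.zeros_like(t)
    for onset in np.arange(0.0, duration, 0.004):  # ~250 Hz train (assumed)
        k = int(onset * fs)
        high[k:] += h[:len(t) - k]
    return low + high
```

Summing the two components mimics the MISO structure, where each band is one channel of the system.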

    Overview of speech synthesis using LSTM neural networks

    No full text
    Currently, the most popular speech synthesis systems are based on unit-selection and decision-tree algorithms. In the literature, new speech synthesis methods based on Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks have been proposed. This paper gives an overview of speech synthesis and its LSTM-based realizations. Directions for further investigation are highlighted.


    Recording, parameterization and classification of allophones employing a bimodal approach

    No full text
    The paper concerns the recording and parameterization of English allophones using two modalities. In the research, utterances of English speakers whose language proficiency corresponds to that of a native speaker were recorded. In the next stage, allophones were isolated from the audio recordings together with the corresponding visual signals. Separate parameterization systems were used for each modality when creating the feature vectors: for the audio signal, typical descriptors used in the area of speech and music recognition were chosen, while for the motion-capture recordings custom solutions were proposed. For the purpose of allophone classification, neural networks and a support vector machine were used in both uni- and bimodal approaches. It was found that recognition efficiency increases when more than one modality is used.
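A bimodal pipeline of the kind the abstract describes can be sketched as follows. The paper does not specify its fusion scheme, feature dimensions, or SVM configuration, so the early (concatenation) fusion, the synthetic data, and the RBF kernel here are all illustrative assumptions; the real features would be speech/music descriptors and motion-capture parameters.

```python
import numpy as np
from sklearn.svm import SVC

def fuse_features(audio_vec, video_vec):
    """Early fusion: concatenate per-allophone audio and motion-capture
    feature vectors into one bimodal vector (assumed fusion strategy)."""
    return np.concatenate([audio_vec, video_vec])

# Toy illustration with synthetic data standing in for real descriptors.
rng = np.random.default_rng(0)
n = 40
audio = rng.normal(size=(n, 12))   # e.g. MFCC-like audio descriptors
video = rng.normal(size=(n, 6))    # e.g. lip-marker trajectory features
labels = np.repeat([0, 1], n // 2)
audio[labels == 1] += 2.0          # make the two allophone classes separable

X = np.array([fuse_features(a, v) for a, v in zip(audio, video)])
clf = SVC(kernel="rbf").fit(X, labels)
acc = clf.score(X, labels)
```

Training the same classifier on the audio columns alone and comparing scores is the basic experiment behind the paper's finding that adding a second modality helps.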

    Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

    No full text
    Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, the advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG in their full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models, such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.
