17 research outputs found

    RELEVANCE OF THE TYPES AND THE STATISTICAL PROPERTIES OF FEATURES IN THE RECOGNITION OF BASIC EMOTIONS IN SPEECH

    Due to the advance of speech technologies and their increasing usage in various applications, automatic recognition of emotions in speech represents one of the emerging fields in human-computer interaction. This paper deals with several topics related to automatic emotional speech recognition, most notably the improvement of recognition accuracy through lowering the dimensionality of the feature space, and the evaluation of the relevance of particular feature types. The research focuses on the classification of emotional speech into five basic emotional classes (anger, joy, fear, sadness and neutral speech) using a recorded corpus of emotional speech in Serbian.
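The dimensionality-reduction idea above can be illustrated with a minimal sketch (not the paper's actual method): rank features by a Fisher-style discriminant score over the five emotion classes and keep only the top-scoring ones. All data and dimensions below are synthetic.

```python
import numpy as np

def fisher_scores(X, y):
    """Rank features by Fisher score: between-class variance of the
    per-class means over within-class variance, computed per feature."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)

def select_top_k(X, y, k):
    """Keep only the k features with the highest Fisher scores."""
    idx = np.argsort(fisher_scores(X, y))[::-1][:k]
    return X[:, idx], idx

# Toy data: 5 emotion classes, 5 features, only 2 of them informative
rng = np.random.default_rng(0)
y = np.repeat(np.arange(5), 20)
X = rng.normal(size=(100, 5))
X[:, 0] += y        # informative feature
X[:, 3] += 2 * y    # strongly informative feature
X_red, kept = select_top_k(X, y, 2)
```

In a real system the classifier would then be trained on `X_red` only, trading a small amount of information for a lower-dimensional, less noisy feature space.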

    Word recognition in speech audiometry

    It was noticed that the standard set of words used for speech audiometry contained some archaic words, as well as words which are much more difficult to understand out of context. The first aim of this paper is to determine the words which are significantly easier or more difficult to recognize than the rest in speech audiometry at the ENT Clinic in Novi Sad (we have dedicated more attention to the incorrectly recognized words), as well as their distribution across the sets of 10 words which are used during one measurement. The second aim of the paper is to account for the errors from the point of view of linguistics and medicine. The results that we have analyzed belong to different intensity levels (5-80 dB and 25-40 dB). The research participants were 66 patients suffering from multiple sclerosis. The study has shown that there are 14 words (out of 160) whose recognition accuracy is significantly worse than that of the other words in their 10-word group. Most of the poorly recognized words constitute minimal pairs with some other words, and most of these words contain plosives. Even though consonants cause a higher number of errors, hearing-impaired patients sometimes misunderstand, and therefore mispronounce, vowel segments as well, e.g. the vowel /i/ is replaced with the vowel /u/. Another important factor which influences perception is the part of speech – nouns, adjectives and adverbs are identified more easily than other parts of speech.
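One way to flag words whose recognition is "significantly worse than the rest of their group", as the study does, is an exact binomial test of each word's correct-recognition count against the group-average rate. The sketch below is a hypothetical reconstruction of such a test (the actual statistical procedure used in the paper is not specified here); the word counts are invented, while the 66 patients and 10-word groups match the study design.

```python
from math import comb

def binom_tail_le(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): exact lower-tail probability."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def flag_hard_words(counts, n_patients, alpha=0.05):
    """counts: {word: number of correct recognitions}. A word is flagged
    if its correct count is significantly below the group-average rate."""
    group_rate = sum(counts.values()) / (len(counts) * n_patients)
    return [w for w, k in counts.items()
            if binom_tail_le(k, n_patients, group_rate) < alpha]

# Hypothetical 10-word group, 66 patients as in the study
counts = {f"word{i}": 60 for i in range(9)}
counts["word9"] = 35   # one clearly harder word
hard = flag_hard_words(counts, 66)
```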

    Cross-Lingual Neural Network Speech Synthesis Based on Multiple Embeddings

    The paper presents a novel architecture and method for speech synthesis in multiple languages, in the voices of multiple speakers and in multiple speaking styles, even in cases when speech from a particular speaker in the target language was not present in the training data. The method is based on the application of neural network embeddings to combinations of speaker and style IDs, as well as to phones in particular phonetic contexts, without any prior linguistic knowledge of their phonetic properties. This enables the network not only to efficiently capture similarities and differences between speakers and speaking styles, but also to establish appropriate relationships between phones belonging to different languages, and ultimately to produce synthetic speech in the voice of a certain speaker in a language that he/she has never spoken. The validity of the proposed approach has been confirmed through experiments with models trained on speech corpora of American English and Mexican Spanish. It has also been shown that the proposed approach supports the use of neural vocoders, i.e. that they are able to produce synthesized speech of good quality even in languages that they were not trained on.
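The embedding combination described above can be sketched as follows. This is an illustrative toy, not the paper's network: lookup tables stand in for learned embeddings, the inventory sizes and dimensions are invented, and a shared phone table is what lets a speaker embedding condition synthesis in a language that speaker never recorded.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inventory sizes and embedding dimension
N_SPEAKERS, N_STYLES, N_PHONES, DIM = 4, 3, 40, 16

# Learned lookup tables (randomly initialized here for illustration);
# the phone table is shared across languages
speaker_emb = rng.normal(size=(N_SPEAKERS, DIM))
style_emb = rng.normal(size=(N_STYLES, DIM))
phone_emb = rng.normal(size=(N_PHONES, DIM))

def encoder_input(phone_ids, speaker_id, style_id):
    """Concatenate per-phone embeddings with the utterance-level speaker
    and style embeddings, broadcast over the phone sequence."""
    phones = phone_emb[phone_ids]                       # (T, DIM)
    cond = np.concatenate([speaker_emb[speaker_id],
                           style_emb[style_id]])        # (2*DIM,)
    cond = np.broadcast_to(cond, (len(phone_ids), 2 * DIM))
    return np.concatenate([phones, cond], axis=1)       # (T, 3*DIM)

# Speaker 2 "speaking" an arbitrary phone sequence, in style 1
x = encoder_input(phone_ids=[5, 12, 7], speaker_id=2, style_id=1)
```

Because speaker, style and phone identity enter as independent embeddings, any speaker/style/phone combination can be formed at synthesis time, including combinations unseen in training.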

    USER-AWARENESS AND ADAPTATION IN CONVERSATIONAL AGENTS

    This paper considers the research question of developing user-aware and adaptive conversational agents. A conversational agent is user-aware to the extent that it recognizes the user's identity and those of his/her emotional states that are relevant in a given interaction domain. It is user-adaptive to the extent that it dynamically adapts its dialogue behavior according to the user and his/her emotional state. The paper summarizes some aspects of our previous work and presents work in progress in the field of speech-based human-machine interaction. It focuses particularly on the development of speech recognition modules in cooperation with modules for emotion recognition and speaker recognition, as well as the dialogue management module. Finally, it proposes an architecture for a conversational agent that integrates these modules and improves each of them based on the synergies among them.
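The module integration described above can be sketched as a pipeline in which the recognizers update a shared user state that the dialogue manager then conditions on. This is a hypothetical skeleton, not the proposed architecture itself; the class names, fields and the stub recognizers are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class UserState:
    """What the agent currently knows about the user (hypothetical fields)."""
    speaker_id: str = "unknown"
    emotion: str = "neutral"

@dataclass
class Agent:
    """User-aware agent: recognizer modules update the shared user state,
    and the dialogue manager picks a strategy conditioned on that state."""
    state: UserState = field(default_factory=UserState)

    def process(self, utterance):
        # Stand-ins for the real ASR / speaker-ID / emotion modules
        text = self.recognize_speech(utterance)
        self.state.speaker_id = self.recognize_speaker(utterance)
        self.state.emotion = self.recognize_emotion(utterance)
        return self.dialogue_policy(text)

    def recognize_speech(self, u):  return u.get("text", "")
    def recognize_speaker(self, u): return u.get("speaker", "unknown")
    def recognize_emotion(self, u): return u.get("emotion", "neutral")

    def dialogue_policy(self, text):
        # User-adaptation: a frustrated user triggers a different strategy
        if self.state.emotion == "anger":
            return "Stay calm, I will transfer you to an operator."
        return f"Understood: {text}"

agent = Agent()
reply = agent.process({"text": "cancel my order", "emotion": "anger"})
```

The synergy mentioned in the abstract shows up naturally in such a design: the same front-end audio features can feed all three recognizers, and each recognizer's output constrains the others' search spaces.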

    Influence of genotype, year and locations on yield, oil and protein content of soybean - Glycine max (L.) Merr.

    During 2017 and 2018, field studies were carried out on the impact of genotype, year, location and the genotype x location and genotype x year interactions on the yield and the oil and protein content of soybean seed. The experiment included the twenty most common soybean genotypes of different maturity groups, which account for 75% of the sowing area. The experiment was set up at the Osijek and Kutjevo locations, in two replications, in a randomized block design. In 2018, higher average seed yield, oil content and protein content were achieved, primarily due to a favourable distribution of rainfall. The Osijek location had higher average seed yield, oil content and protein content in both years of the research. According to the analysis of variance, statistically highly significant differences (P<0.01) in seed yield were obtained for genotype and for the genotype x location and genotype x year interactions. For genotype and the genotype x year interaction, statistically significant differences (P<0.05) were obtained for the oil and protein content. The results of the research will contribute to the proper selection of genotypes depending on the purpose of production, in order to exploit the genetic potential of the genotype most suitable for a particular location.
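The significance tests reported above come from an analysis of variance. As a minimal sketch of the idea (a one-way ANOVA on the genotype factor only, with entirely synthetic yield values, not the study's data or its full factorial model):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(1)

# Hypothetical seed-yield samples (t/ha) for three genotypes,
# four plots each (e.g. two locations x two replications)
genotype_a = rng.normal(3.2, 0.1, size=4)
genotype_b = rng.normal(3.8, 0.1, size=4)
genotype_c = rng.normal(4.4, 0.1, size=4)

# One-way ANOVA: does mean yield differ across genotypes?
f_stat, p_value = f_oneway(genotype_a, genotype_b, genotype_c)
highly_significant = p_value < 0.01   # the P<0.01 threshold used in the paper
```

The study itself additionally partitions variance into location, year and interaction terms, which requires a multi-factor ANOVA rather than this one-way version.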

    Humanoid robot Marko - an assistant in therapy for children

    This paper reports on work in progress towards the development of a robot to be used as assistive technology in the treatment of children with developmental disorders (cerebral palsy). This work integrates two activities. The first is the mechanical design of a humanoid robot with capabilities sufficient for demonstrating therapeutic exercises for the habilitation of gross and fine motor functions and for acquiring spatial relationships. The second is the design of appropriate communication capabilities for the robot. The basic therapeutic role of the robot is to motivate children to practice their therapy harder and longer. To achieve this, the robot must fulfil two requirements: it must have an appropriate appearance, so that the child can form an affective attachment to it, and it must be able to communicate with children both verbally (speech recognition and synthesis) and non-verbally (facial expressions, gestures, etc.). Conversational abilities are thus unavoidable and among its most important capabilities. In short, the robot should be able to manage a three-party natural language conversation – between the child, the therapist and the robot – in clinical settings.
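Three-party dialogue requires the robot to track who holds the conversational floor. The toy turn manager below is a hypothetical illustration of that one sub-problem (it is not the paper's dialogue manager): the floor passes to an explicitly addressed participant, otherwise it rotates.

```python
# Hypothetical three-party turn management for the child/therapist/robot
# setting: explicit addressing overrides a default round-robin order.
TURN_ORDER = ["therapist", "robot", "child"]

class TurnManager:
    def __init__(self):
        self.floor = "therapist"   # assume the therapist opens the session

    def next_speaker(self, addressee=None):
        """Hand the floor to the addressee if one was named,
        otherwise rotate through the fixed turn order."""
        if addressee in TURN_ORDER:
            self.floor = addressee
        else:
            i = TURN_ORDER.index(self.floor)
            self.floor = TURN_ORDER[(i + 1) % len(TURN_ORDER)]
        return self.floor

tm = TurnManager()
first = tm.next_speaker()            # no addressee: therapist -> robot
second = tm.next_speaker("child")    # robot explicitly addresses the child
```

A clinical system would infer the addressee from gaze, speech direction and wording rather than receive it as an argument, but the floor-tracking state is the same.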

    Automatic Emotion Recognition in Speech: Possibilities and Significance

    Automatic speech recognition and spoken language understanding are crucial steps towards natural human-machine interaction. The main task of the speech communication process is the recognition of the word sequence, but the recognition of prosody, emotion and stress tags may be of particular importance as well. This paper discusses the possibilities of recognizing emotion from the speech signal in order to improve ASR, and also provides an analysis of the acoustic features that can be used for the detection of a speaker's emotion and stress. The paper also provides a short overview of emotion and stress classification techniques. The importance and place of emotional speech recognition are shown in the domain of human-computer interactive systems and the transaction communication model. Directions for future work are given at the end of the paper.
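Two of the simplest acoustic cues examined in emotion and stress detection are short-time energy and zero-crossing rate. The sketch below computes both on a synthetic signal; it illustrates the kind of frame-level features meant above, not the specific feature set the paper analyzes.

```python
import numpy as np

def frame_features(signal, frame_len=256, hop=128):
    """Per-frame short-time energy and zero-crossing rate,
    two simple acoustic cues for emotion/stress detection."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.mean(frame ** 2))
        # each sign change contributes |diff| = 2, hence the division
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)
        feats.append((energy, zcr))
    return np.array(feats)

# Toy signal: a quiet low-frequency part followed by a loud noisy part
t = np.linspace(0, 1, 4096, endpoint=False)
quiet = 0.1 * np.sin(2 * np.pi * 5 * t[:2048])
loud = np.random.default_rng(0).normal(0.0, 1.0, 2048)
feats = frame_features(np.concatenate([quiet, loud]))
```

Aroused or stressed speech typically shows raised energy and pitch; a classifier would consume statistics of such frame-level features over an utterance.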

    QoS Testing in a Live Private IP MPLS Network with CoS Implemented (UDC 621.391:004.4, DOI: 10.2298/CSIS090710007B)

    This paper describes testing conducted on a private IP/MPLS network of a Telecom operator during service introduction. We applied DiffServ and E-LSP policies for bandwidth allocation to predefined classes of service (voice, video, data and VPN). We used a traffic generator to create the worst possible situations during the testing, and measured QoS for the individual services. UML considerations about the NGN structure and packet-network traffic testing are also presented, using deployment, class and state diagrams. The testing results are given in tabular and graphical form, and the conclusions derived will subsequently be used as a basis for defining a stochastic traffic generator/simulator.
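Per-class bandwidth allocation of the kind tested above can be sketched as a simple admission cap: each class of service receives a configured share of the link, and traffic beyond that share is queued or dropped. The link speed and share values below are invented for illustration; real DiffServ/E-LSP policing also involves queueing, marking and scheduling, which this sketch omits.

```python
# Hypothetical per-class shares on a 100 Mb/s link for the four
# tested classes of service (values are illustrative only).
LINK_MBPS = 100
SHARES = {"voice": 0.20, "video": 0.30, "data": 0.30, "vpn": 0.20}

def admitted(demand_mbps):
    """Cap each class at its configured share of the link; the excess
    of an over-subscribed class is queued or dropped."""
    return {cls: min(demand, SHARES[cls] * LINK_MBPS)
            for cls, demand in demand_mbps.items()}

# Worst-case style load: every class over-subscribed at once
load = {"voice": 30, "video": 50, "data": 40, "vpn": 25}
out = admitted(load)
```

A test like the paper's would then measure per-class delay, jitter and loss while a traffic generator holds the network in such an over-subscribed state.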

    A comparison of multi-style DNN-based TTS approaches using small datasets

    Studies have shown that people already perceive interaction with computers, robots and media in the same way as they perceive social communication with other people. For that reason, it is critical for a high-quality text-to-speech (TTS) system to sound as human-like as possible. However, a major obstacle in creating expressive TTS voices is that the amount of style-specific speech needed for training such a system is often insufficient. This paper presents a comparison between different approaches to multi-style TTS, with a focus on cases when only a small dataset per style is available. The described approaches were originally proposed for efficient modelling of multiple speakers with a limited amount of data per speaker. Among them, the approach based on style codes has emerged as the best, regardless of the target speech style.
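The style-code idea can be illustrated in a few lines: a one-hot (or learned) style code is appended to every input frame, so a single shared model is trained on the pooled data of all styles instead of one fragile model per small per-style dataset. The dimensions below are invented for illustration.

```python
import numpy as np

# Hypothetical setup: 3 speech styles, linguistic feature dimension 10
N_STYLES, FEAT_DIM = 3, 10

def with_style_code(features, style_id):
    """Append a one-hot style code to every input frame, so one shared
    model can be trained on the pooled data of all styles."""
    code = np.eye(N_STYLES)[style_id]
    return np.hstack([features, np.tile(code, (len(features), 1))])

# Five frames of (zeroed) linguistic features, conditioned on style 2
frames = np.zeros((5, FEAT_DIM))
x = with_style_code(frames, style_id=2)
```

The benefit for small datasets is that everything style-independent (phonetics, duration, basic prosody) is learned from all the data at once, while the code carries only the style-specific residual.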