13 research outputs found

Statistical Language Models for Croatian Weather-domain Corpus

Author: Ipšić Ivo
Martinčić-Ipšić Sanda
Načinović Lucia
Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2009

Statistical language modelling estimates the regularities in natural languages. Language models are used in speech recognition, machine translation and other speech and language technology applications. This paper presents a procedure for building language models for the Croatian weather-domain corpus. Different types of statistical n-gram language models and smoothing methods are presented, and the models are compared in terms of their estimated perplexity.
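To make the comparison criterion concrete, the following is a minimal sketch of a bigram model scored by perplexity. It is not the paper's procedure: add-one (Laplace) smoothing stands in for whichever smoothing methods the authors evaluated, and the function names and the toy sentences are made up for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences with boundary markers."""
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        histories.update(padded[:-1])               # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))
    # Words the model can predict: all training words plus the end marker.
    vocab = {w for tokens in sentences for w in tokens} | {"</s>"}
    return histories, bigrams, vocab

def bigram_prob(prev, word, histories, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev) (an assumed smoothing choice)."""
    return (bigrams[(prev, word)] + 1) / (histories[prev] + vocab_size)

def perplexity(sentences, histories, bigrams, vocab):
    """Perplexity = 2 ** (-average log2 probability per predicted token).

    Assumes every test word is in the training vocabulary; a real system
    would map unseen words to an unknown-word token first.
    """
    log_sum, n = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded[:-1], padded[1:]):
            log_sum += math.log2(bigram_prob(prev, word, histories, bigrams, len(vocab)))
            n += 1
    return 2 ** (-log_sum / n)

# Toy, invented weather-style sentences (not taken from the corpus).
train = [["sutra", "kiša"], ["sutra", "sunčano"], ["danas", "kiša"]]
histories, bigrams, vocab = train_bigram(train)
print(perplexity([["sutra", "kiša"]], histories, bigrams, vocab))
```

Lower perplexity means the model finds the test text less surprising, which is why it serves as a common yardstick when comparing n-gram orders and smoothing methods.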

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Automatsko raspoznavanje hrvatskoga govora velikoga vokabulara

Author: Ivo Ipšić
Miran Pobar
Sanda Martinčić-Ipšić
Publication venue: KoREMA - Croatian Society for Communications, Computing, Electronics, Measurement and Control
Publication date: 01/01/2011

This paper presents the procedures used to develop a Croatian large vocabulary automatic speech recognition (LVASR) system. The proposed acoustic model is based on context-dependent triphone hidden Markov models and Croatian phonetic rules. Different acoustic and language models, developed using a large collection of Croatian speech, are discussed and compared. The paper proposes the feature vectors and acoustic modeling procedures that achieve the lowest word error rates for Croatian speech. In addition, Croatian language modeling procedures are evaluated and adopted for speaker-independent spontaneous speech recognition. The experiments show that the proposed approach, which combines context-dependent acoustic modeling based on Croatian phonetic rules with a parameter tying procedure, can be used for efficient Croatian large vocabulary speech recognition with word error rates below 5%.
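For readers unfamiliar with the reported metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard edit-distance recurrence. The function name and the example strings are illustrative assumptions; the scoring tool actually used in the paper is not stated in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

# Invented example: one substitution and one deletion against four reference words.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, the rate can exceed 100% when the hypothesis is much longer than the reference.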

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Automatic Intonation Event Detection Using Tilt Model for Croatian Speech Synthesis

Author: Ipšić Ivo
Martinčić-Ipšić Sanda
Načinović Lucia
Pobar Miran
Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2011

Text-to-speech (TTS) systems convert text into speech. Synthesized speech without prosody sounds unnatural and monotonous, so prosodic elements have to be implemented for it to sound natural. Generating prosodic elements directly from text is a demanding task. Our final goals are to build a complete prosodic model for Croatian and to implement it in our TTS system. In this work, we present one of the steps in adding prosody to TTS systems: detection of intonation events using the Tilt intonation model. We propose a training procedure composed of several subtasks. First, we hand-labelled a set of utterances and marked four types of prosodic events within each of them. Then we trained HMMs and used them to mark prosodic events on a larger set of utterances. We estimated parameters for each intonation event and generated f0 contours from those parameters. Finally, we evaluated the obtained f0 contours.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Digitalni arhiv Filozofskog fakulteta u Zagrebu

TEXT-TO-SPEECH SYNTHESIS: A PROTOTYPE SYSTEM FOR CROATIAN LANGUAGE

Author: Ivo IPŠIĆ
Miran POBAR
Sanda MARTINČIĆ-IPŠIĆ
Publication venue: Faculty of Engineering/Faculty of Civil Engineering, University of Rijeka
Publication date: 01/01/2008

This paper presents the development of a Croatian text-to-speech system capable of synthesizing speech from arbitrary input text. The input text, which must be in normalized form, is first transcribed into a phoneme string (grapheme-to-phoneme conversion), and the audio is then generated from that phonetic string. The synthesis method concatenates small acoustic units of speech, diphones, using TD-PSOLA. A Croatian diphone database was built for the system, and a procedure for automatic selection of diphones from a spoken corpus is proposed. The quality of the approach was assessed through a listener survey: participants gave subjective ratings of the quality of the synthesized speech, which also served to check its intelligibility.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

A Croatian Weather Domain Spoken Dialog System Prototype

Author: Ana Meštrović
Ivo Ipšić
Luka Bernić
Miran Pobar
Sanda Martinčić-Ipšić
Publication venue: University of Zagreb - University Computing Centre
Publication date: 01/01/2010

Speech and language technologies have been in use in IT for some time. Because of their great impact and fast growth, these technologies need to be introduced for the Croatian language as well. In this paper we propose a solution for developing a domain-oriented spoken dialog system for Croatian. We chose the weather domain because it has a limited vocabulary, its data are easily accessible and it is highly applicable. The Croatian weather dialog system provides information about the weather in different regions of Croatia. The modules of the spoken dialog system perform automatic word recognition, semantic analysis, dialog management, response generation and text-to-speech synthesis. This is a first attempt to develop such a system for Croatian, and some new approaches are presented.

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

KORPUS HRVATSKOGA GOVORA

Author: Ivo Ipšić
Mihaela Matešić
Sanda Martinčić-Ipšić
Publication venue: Phonetics Section of the Croatian Philological Association
Publication date: 01/01/2004

This paper presents the Croatian speech corpus of the Department of Informatics, Faculty of Philosophy in Rijeka. The corpus consists of three parts: radio weather forecasts, telephone weather forecasts and television news. Speech from 250 different speakers was collected, with a total duration of almost 20 hours. Both read and spontaneous speech were collected. The structure of the corpus, its organization and its basic statistical parameters are presented. The speech recording and transcription procedures are described. The paper also presents the tools used (CSLU SpeechView, Transcriber and HTK), the dictionary, which contains all the words of the speech corpus with their phonetic transcriptions, and the procedure for validating the speech corpus. The concluding part presents the results of automatic segmentation at the phonetic level.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

An Overview of the Slovenian Spoken Dialog System

Author: Ivo Ipšić
Nikola Pavešić
Publication venue: University of Zagreb - University Computing Centre
Publication date: 01/01/2002

In this paper we present the modules of the Slovenian spoken dialog system, developed within the joint project on multilingual speech recognition and understanding “Spoken Queries in European Languages” (SQEL-Copernicus-1634). The system can handle spontaneous speech and provide the user with correct information in the domain of air flight information retrieval. The major modules of the system perform word recognition, linguistic analysis, dialog management and speech synthesis. Results for word accuracy, semantic accuracy and dialog success rate are also given.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Mogućnost primjene govora u računalnim igrama temeljenim na lokaciji

Author: Goran Paulin
Ivo Ipšić
Marina Ivašić‐Kos
Publication venue: Croatian Philological Association, Phonetics Section
Publication date: 01/01/2020

Although computer-synthesized speech became part of computer games as early as 1978, its application, especially in the genre of location-based games, has been poorly researched. This paper presents an overview of the implementation of speech in computer games created between 1978 and 2018 and of the experience gained from its use so far. The focus is on analyzing the possibilities of using speech technologies in location-based games. The conclusion answers the question of whether, given the specifics of the genre and the current state of technology, it makes sense to introduce speech technologies into location-based games, and what the prerequisites for doing so are.

Hrčak - Portal of scientific journals of Croatia