21 research outputs found

    Proceedings

    Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources. Editors: Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard, Eiríkur Rögnvaldsson and Koenraad de Smedt. NEALT Proceedings Series, Vol. 5 (2009), v+45 pp. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9207

    Terminology Integration in Statistical Machine Translation

    The electronic version does not contain the appendices. The doctoral thesis describes methods and tools researched and developed by the author for integrating bilingual terminology into statistical machine translation (SMT) systems. The author presents novel methods for terminology integration in SMT systems during training (through static integration) and during translation (through dynamic integration). The work focusses not only on the SMT integration techniques, but also on methods for acquiring the linguistic resources needed for the different tasks involved in workflows for terminology integration in SMT systems. The proposed methods have been evaluated using automatic and manual evaluation methods. The results show that both static and dynamic integration methods significantly improve translation quality. The results described in the thesis have been validated in several research projects and deployed in practical solutions. Keywords: statistical machine translation, terminology, cross-lingual information extraction

    Põhiemotsioonid eestikeelses etteloetud kõnes: akustiline analüüs ja modelleerimine (Basic emotions in read Estonian speech: acoustic analysis and modelling)

    The electronic version of the dissertation does not include the publications. The present doctoral dissertation had two major purposes: (a) to find out and describe the acoustic expression of three basic emotions (joy, sadness and anger) in read Estonian speech, and (b) to create, based on the resulting description, acoustic models of emotional speech designed to help a parametric Estonian speech synthesiser express these emotions recognisably. Since synthetic speech has many applications in different fields, such as human-machine interaction, multimedia, or aids for the disabled, it is vital that it should sound natural, that is, as human-like as possible. One way to make synthetic speech more natural is to add emotions to it by means of models that supply the synthesiser with the combinations of acoustic parameter values needed for emotional expression. In order to create such models of emotional speech, it is first necessary to know in detail how emotions are expressed vocally in human speech. For that purpose I had to investigate whether, to what extent, and in what direction emotions influence the values of acoustic parameters of speech (e.g., fundamental frequency, intensity and speech rate), and which parameters enable discrimination of emotions from each other and from neutral speech. The results provided material for creating acoustic models of emotions*, which were presented to evaluators, who were asked to decide which of the models produced synthetic speech with recognisable emotions. The experiment confirmed that with models based on the acoustic results, an Estonian speech synthesiser can satisfactorily express sadness and anger, while joy was not so well recognised by listeners. This doctoral dissertation describes one of the possible ways in which joy, sadness and anger are expressed vocally in Estonian speech and presents models for adding emotions to Estonian synthetic speech. The study serves as a starting point for the future development of acoustic models for Estonian emotional synthetic speech.
    * Recorded examples of emotional speech synthesised using the test models can be accessed at https://www.eki.ee/heli/index.php?option=com_content&view=article&id=7&Itemid=494

    Keskusteluavustimen kehittäminen kuulovammaisia varten automaattista puheentunnistusta käyttäen (Developing a conversation assistant for the hearing impaired using automatic speech recognition)

    Understanding and participating in conversations has been reported as one of the biggest challenges hearing impaired people face in their daily lives. These communication problems have been shown to have wide-ranging negative consequences, affecting their quality of life and the opportunities available to them in education and employment. A conversational assistance application was investigated to alleviate these problems. The application uses automatic speech recognition technology to provide real-time speech-to-text transcriptions to the user, with the goal of helping deaf and hard of hearing persons in conversational situations. To validate the method and investigate its usefulness, a prototype application was developed for testing purposes using open-source software. A user test was designed and performed with test participants representing the target user group. The results indicate that the Conversation Assistant method is valid, meaning it can help the hearing impaired to follow and participate in conversational situations. Speech recognition accuracy, especially in noisy environments, was identified as the primary target for further development to increase the usefulness of the application. Recognition speed, by contrast, was deemed sufficient, already clearly surpassing the transcription speed of human transcribers.

    CLARIN

    The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure (CLARIN) for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium.

    CLARIN. The infrastructure for language resources

    CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future. The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU)


    Handbook of Easy Languages in Europe

    The Handbook of Easy Languages in Europe describes what Easy Language is and how it is used in European countries. It demonstrates the great diversity of actors, instruments and outcomes related to Easy Language throughout Europe. All people, despite their limitations, have an equal right to information, inclusion, and social participation. This results in requirements for understandable language. The notion of Easy Language refers to modified forms of standard languages that aim to facilitate reading and language comprehension. This handbook describes the historical background, the principles and the practices of Easy Language in 21 European countries. Its topics include terminological definitions, legal status, stakeholders, target groups, guidelines, practical outcomes, education, research, and a reflection on future perspectives related to Easy Language in each country. Written in an academic yet interesting and understandable style, this Handbook of Easy Languages in Europe aims to find a wide audience
