Macedonian Speech Synthesis for Assistive Technology Applications
Speech technology is becoming ever more ubiquitous with the advance of
speech-enabled devices and services. The use of speech synthesis in Augmentative
and Alternative Communication tools has facilitated inclusion of individuals with
speech impediments allowing them to communicate with their surroundings using
speech. Although there are numerous speech synthesis systems for the most
spoken world languages, there is still a limited offer for smaller languages.
We propose and compare three models built using parametric and deep learning
techniques for Macedonian trained on a newly recorded corpus. We target
low-resource edge deployment for Augmentative and Alternative Communication and
assistive technologies, such as communication boards and screen readers. The
listening test results show that parametric speech synthesis performs on par
with the more advanced deep learning models. Since it also requires fewer
resources and offers full speech rate and pitch control, it is the preferred
choice for building a Macedonian TTS system for this application scenario.
Comment: 5 pages, 1 figure, EUSIPCO conference 202
Natural language processing for similar languages, varieties, and dialects: A survey
There has been a lot of recent interest in the natural language processing (NLP) community in the computational processing of language varieties and dialects, with the aim to improve the performance of applications such as machine translation, speech recognition, and dialogue systems. Here, we attempt to survey this growing field of research, with focus on computational methods for processing similar languages, varieties, and dialects. In particular, we discuss the most important challenges when dealing with diatopic language variation, and we present some of the available datasets, the process of data collection, and the most common data collection strategies used to compile datasets for similar languages, varieties, and dialects. We further present a number of studies on computational methods developed and/or adapted for preprocessing, normalization, part-of-speech tagging, and parsing similar languages, language varieties, and dialects. Finally, we discuss relevant applications such as language and dialect identification and machine translation for closely related languages, language varieties, and dialects.
Non peer reviewed
Spraying Religion: (Anti-)Religious Graffiti of the Post-Socialist Transition
This article discusses graffiti and street art concerning religion, part of the author's much broader and continuous research on contemporary political graffiti and street art in post-socialist Central and Eastern Europe, from the Baltics to the Balkans, from Prague to Moscow, comprising over 20 years of systematic fieldwork
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
We introduce the project MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages, funded by the Connecting Europe Facility, which is aimed at building monolingual and parallel corpora for under-resourced European languages. The approach followed consists of crawling large amounts of textual data from selected top-level domains of the Internet, and then applying a curation and enrichment pipeline. In addition to corpora, the project will release the free/open-source web crawling and curation software used.