146 research outputs found

    Comparison between rule-based and data-driven natural language processing algorithms for Brazilian Portuguese speech synthesis

    Due to the exponential growth in the use of computers, personal digital assistants and smartphones, Text-to-Speech (TTS) systems have been in high demand in recent years. An important part of these systems is the Text Analysis block, which converts the input text into the linguistic specifications used to generate the final speech waveform. The Natural Language Processing (NLP) algorithms in this block are crucial to the quality of the speech generated by synthesizers. They are responsible for important tasks such as Grapheme-to-Phoneme Conversion, Syllabification and Stress Determination. For Brazilian Portuguese (BP), solutions for the algorithms in the Text Analysis block have focused on rule-based approaches. These algorithms perform well for BP but have many disadvantages. On the other hand, no research has yet evaluated and analyzed the performance of data-driven approaches, which reach state-of-the-art results for complex languages such as English. In this work, we therefore compare different data-driven and rule-based approaches for the NLP algorithms in a TTS system. Moreover, we propose, as a novel application, the use of Sequence-to-Sequence models as a solution to the Syllabification and Stress Determination problems. In summary, we show that data-driven algorithms can achieve state-of-the-art performance for the NLP algorithms in the Text Analysis block of a BP TTS system.

    Weight gradience and stress in Portuguese

    This paper examines the role of weight in stress assignment in the Portuguese lexicon, and proposes a probabilistic approach to stress. I show that weight effects are gradient and weaken monotonically as we move away from the right edge of the word. Such effects depend on the position of a syllable in the word as well as on the number of segments the syllable contains. The probabilistic model proposed in this paper is based on a single predictor, namely weight, and yields more accurate results than a categorical analysis in which weight is treated as binary. Finally, I discuss implications for the grammar of Portuguese.

    SEA_AP: una herramienta de segmentación y etiquetado para el análisis prosódico

    This paper introduces a tool that segments and labels sound strings into phones, syllables and/or words, starting from a sound signal and its corresponding orthographic transcription. It also integrates acoustic analysis scripts that run on the Praat program, with the aim of reducing the time spent on the analysis, correction, smoothing and plotting of the melodic curve. The tool is implemented for Galician, Spanish and Brazilian Portuguese. Our goal is for this application to help automate some of the tasks of segmentation, labelling and prosodic analysis, since these tasks require a large investment of time and human resources. This work would not have been possible without the help of the Spanish Government (Project ‘SpeechTech4All’ TEC2012-38939-C03-01), the European Regional Development Fund (ERDF), the Government of the Autonomous Community of Galicia (GRC2014/024, “Consolidación de Unidades de Investigación: Proyecto AtlantTIC” CN2012/160) and the “Red de Investigación TecAnDAli” from the Council of Culture, Education and University Planning, Xunta de Galicia.

    Re-imagining Brazilian Portuguese IPA: A practical guide utilizing Paulo Maron’s new opera Lampião

    How often have North American singers considered singing art songs or opera arias in Brazilian Portuguese? How many Brazilian opera composers do voice students and faculty outside of Brazil know? Lack of familiarity with Brazilian Portuguese is a barrier to the accessibility and performance of Brazilian vocal music, and the challenging learning curve may contribute to the lack of interest non-native speakers have in Brazilian classical music. To help address this problem, the author decided to promote the accessibility of the Brazilian Portuguese vocal repertoire by re-imagining and simplifying sections of the Brazilian Portuguese IPA table. This simplified table draws together phonemes from the Italian, French, and North American English IPA tables and diction, while significantly reducing the number of symbols compared with the Brazilian Portuguese IPA established in 2005. This reflective guide applies the concepts and rules of the simplified table in practice through a transcription of the one-act opera Lampião by Brazilian composer Paulo Maron. To contextualize Brazilian Portuguese vocal music and Maron’s opera, a brief overview of Brazilian music, language and culture introduces North American anglophone singers to elements that are important to the performance of Brazilian vocal music and the Brazilian Portuguese texts it employs. This guide is universally applicable and is directed to anyone working with music students, or to the students themselves for private study. It is the author’s hope that this re-imagined, simplified Brazilian Portuguese IPA table will facilitate engagement with and performance of Brazilian art song and, hopefully, the production of Brazilian operas outside of Brazil.

    Casa de la Lhéngua: A set of language resources and natural language processing tools for Mirandese

    This paper describes the construction of language resources and NLP tools for Mirandese, a minority language spoken in North-eastern Portugal, now available on a community-led portal, Casa de la Lhéngua. The resources were developed as part of a collaborative citizenship project led by Microsoft, in connection with the creation of the first TTS system for Mirandese. Development efforts encompassed the compilation of a corpus of over 1M tokens and the construction of a GTP system and of syllable-division, inflection and Part-of-Speech (POS) tagging modules, leading to the creation of an inflected lexicon of about 200,000 entries with phonetic transcription, detailed POS tagging, syllable division, and stress mark-up. Alongside these tasks, which were made easier by adapting and reusing existing tools for closely related languages, a casting call for voice talent was held among the speaker community, and the first speech database for speech synthesis was recorded for Mirandese. These resources were combined to fulfil the requirements of a well-tested statistical parameter synthesis model, leading to an intelligible voice font. The language resources are freely available at Casa de la Lhéngua, with the aim of promoting the development of real-life applications and fostering linguistic research on Mirandese.

    a Northern Nambikwara language and its cultural context

    Wetzels, W.L.M. [Promotor]; Adelaar, W.F.H. [Copromotor]

    Elements, Government, and Licensing: Developments in phonology

    Elements, Government, and Licensing brings together new theoretical and empirical developments in phonology. It covers three principal domains of phonological representation: melody and segmental structure; tone, prosody and prosodic structure; and phonological relations, empty categories, and vowel-zero alternations. Theoretical topics covered include the formalisation of Element Theory, the hotly debated topic of structural recursion in phonology, and the empirical status of government. In addition, a wealth of new analyses and empirical evidence sheds new light on empty categories in phonology, the analysis of certain consonantal sequences, phonological and non-phonological alternation, the elemental composition of segments, and much more. Taking up long-standing empirical and theoretical issues informed by Government Phonology and Element Theory, this book provides theoretical advances while also bringing to light new empirical evidence and analyses that challenge previous generalisations. The insights offered here will be equally exciting for phonologists working on related issues inside and outside the Principles & Parameters programme, such as researchers working in Optimality Theory or classical rule-based phonology.