186 research outputs found

    DNN-based Speech Synthesis for Indian Languages from ASCII text

    Text-to-Speech synthesis in Indian languages has seen a lot of progress over the past decade, partly due to the annual Blizzard challenges. These systems assume the text to be written in Devanagari or Dravidian scripts, which are nearly phonemic orthographies. However, the most common form of computer interaction among Indians is transliterated text written in ASCII. Such text is generally noisy, with many spelling variations for the same word. In this paper we evaluate three approaches to synthesizing speech from such noisy ASCII text: a naive Uni-Grapheme approach, a Multi-Grapheme approach, and a supervised Grapheme-to-Phoneme (G2P) approach. These methods first convert the ASCII text to a phonetic script and then train a Deep Neural Network to synthesize speech from it. We train and test our models on Blizzard Challenge datasets that were transliterated to ASCII using crowdsourcing. Our experiments on Hindi, Tamil and Telugu demonstrate that our models generate speech from ASCII text of competitive quality compared to speech synthesized from the native scripts. All the accompanying transliterated datasets are released for public access. Comment: 6 pages, 5 figures. Accepted at the 9th ISCA Speech Synthesis Workshop.

    Unsupervised learning for text-to-speech synthesis

    This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data that are not annotated. The method therefore aids the building of systems for languages in which conventional linguistic resources are scarce, but is not restricted to these languages. The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects. This space is then partitioned during the training of acoustic models for synthesis, so that the models generalise over objects' surface forms in a way that is acoustically relevant. The method is applied to three levels of textual analysis: to the characterisation of sub-syllabic units, word units and utterances. Entire systems for three languages (English, Finnish and Romanian) are built with no reliance on manually labelled data or language-specific expertise. Results of a subjective evaluation are presented

    Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embedding

    The neurobiology of cortical music representations

    Music is undeniably one of humanity’s defining traits: it has been documented since the earliest days of mankind, is present in all known cultures, and is perceived by all humans in nearly the same way. Intrigued by its omnipresence, researchers of all disciplines began investigating music’s mystical relationship with, and tremendous significance to, humankind several hundred years ago. Comparatively recently, the immense advancement of neuroscientific methods has also enabled the examination of the cognitive processes involved in processing music. Within this neuroscience of music, the vast majority of research has focused on how music, as an auditory stimulus, reaches the brain and how it is initially processed, as well as on the tremendous effects it has on, and can evoke through, the human brain. However, the intermediate steps, that is, how the human brain transforms incoming signals into a seemingly specialized and abstract representation of music, have received less attention. Aiming to address this gap, this thesis targeted these transformations, their possible underlying processes, and how both could potentially be explained through computational models. To this end, four projects were conducted. The first two comprised the creation and implementation of two open source toolboxes to, first, tackle problems inherent to auditory neuroscience, and thus also affecting neuroscientific music research, and second, provide the basis for further advancements through standardization and automation. More precisely, these problems were the deteriorated hearing thresholds and abilities in MRI settings and the difficult localization and parcellation of the human auditory cortex, the core structure involved in auditory processing.
The third project focused on the human brain’s apparent tuning to music by investigating functional and organizational principles of the auditory cortex and network with regard to the processing of different auditory categories of comparable social importance, more precisely whether the perception of music evokes a distinct and specialized pattern. To provide an in-depth characterization of the respective patterns, both the segregation and the integration of auditory cortex regions were examined. In the fourth and final project, a highly multimodal approach that included fMRI, EEG, behavior and models of varying complexity was used to evaluate how the aforementioned music representations are generated along the cortical hierarchy of auditory processing and how they are influenced by bottom-up and top-down processes. The results of projects 1 and 2 demonstrated the need for further advancement of MRI settings and for the definition of working models of the auditory cortex, as hearing thresholds and abilities seem to vary as a function of the data acquisition protocol used, and the localization and parcellation of the human auditory cortex diverge drastically depending on the approach used. Project 3 revealed that the human brain is apparently indeed tuned for music by means of a specialized representation, as music evoked a bilateral network with a right-hemispheric weighting that was not observed for the other included categories. The result of this specialized and hierarchical recruitment of anterior and posterior auditory cortex regions was an abstract music component that is situated in anterior regions of the superior temporal gyrus and preferentially encodes music, regardless of whether it is sung or instrumental.
The outcomes of project 4 indicated that even though the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex processing of music, anterior regions in particular yielded an abstract representation that varied excessively over time and could not be sufficiently explained by any of the tested models. The specialized and abstract properties of this representation were further underlined by the predictive ability of the tested models, as models based either on high-level features such as behavioral representations and concepts or on complex acoustic features always outperformed models based on single or simpler acoustic features. Additionally, factors known to influence auditory and thus music processing, such as musical training, apparently did not alter the observed representations. Together, the results of the projects suggest that the specialized and stable cortical representation of music is the outcome of sophisticated transformations of incoming sound signals along the cortical hierarchy of auditory processing that generate a music component in anterior regions of the superior temporal gyrus by means of top-down processes that interact with acoustic features, guiding their processing.
Music is undeniably one of the defining traits of humankind. Documented since the earliest days of humanity and present in all known cultures, it is perceived by all humans in nearly the same way. Fascinated by its omnipresence, scientists of all disciplines began several hundred years ago to investigate the mystical relationship between music and humans, as well as its enormous significance for them. Only comparatively recently has the immense progress of neuroscientific methods also made it possible to examine the cognitive processes involved in the processing of music.
Within this neuroscience of music, a large part of the research has concentrated on how music, as an auditory stimulus, reaches the human brain and how it is initially processed, as well as on the colossal effects it has on the brain and can bring about through it. However, the intermediate steps, that is, how the human brain converts incoming signals into a seemingly specialized and abstract representation of music, have received comparatively little attention. To address the resulting gap, this dissertation investigated these processes, and how they can be explained by models, in four projects. The first two projects comprised the creation and implementation of two toolboxes in order to, first, mitigate problems inherent to auditory neuroscience, and therefore also to neuroscientific studies of music, and second, create a basis for further progress through standardization and automation. More precisely, this covered the severely impaired hearing thresholds and abilities in MRI examinations and the difficult localization and parcellation of the human auditory cortex as the core structure of auditory processing. The third project dealt with the apparent specialization for music in the human brain by investigating functional and organizational principles of the auditory cortex and network with regard to the processing of different auditory categories of comparable social importance, more precisely whether the perception of music evokes a distinct and specialized neural pattern. To enable a detailed characterization of the corresponding neural patterns, the segregation and integration of the regions of the auditory cortex were examined.
In the fourth and final project, a highly multimodal approach comprising fMRI, EEG, behavior and models of varying complexity was used to evaluate how the aforementioned representations of music are generated along the cortical hierarchy of auditory processing and how they may be influenced by bottom-up and top-down approaches. The results of projects 1 and 2 demonstrated the need for further improvements to MRI examinations and for the definition of a working model of the auditory cortex, since hearing thresholds and abilities varied strongly depending on the data acquisition protocol used, and the localization and parcellation of the human auditory cortex diverge drastically depending on the underlying approach. Project 3 showed that the human brain does indeed contain a specialized representation of music, since music was the only auditory category to evoke a bilateral network with a right-hemispheric weighting. This network, which involved the recruitment of anterior and posterior parts of the auditory cortex, gave rise to a seemingly abstract representation of music in anterior regions of the superior temporal gyrus that preferentially encodes music, regardless of whether it is sung or instrumental. The results of project 4 indicate that the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex processing of music, but that anterior regions in particular give rise to the aforementioned abstract representation, which changes excessively over the duration of perception and cannot be sufficiently explained by any of the tested models.
The specialized and abstract properties of these representations were further underlined by the predictive abilities of the tested models, since models based either on higher-level properties such as behavioral representations and mental concepts or on complex acoustic properties always outperformed models based on lower-level attributes such as simple acoustic properties. In addition, no effect could be demonstrated for factors such as musical training that are known to influence auditory, and therefore music, processing. In summary, the results of the projects indicate that the specialized and stable cortical representation of music is the result of complex processes that convert incoming signals, along the cortical hierarchy of auditory processing, into an abstract representation of music within anterior regions of the superior temporal gyrus by means of top-down processes that interact with acoustic properties and guide their processing.

    Powers, inequalities and vulnerabilities

    This research addresses a gap that is present in both missiology and family and youth ministry. Missiology does not focus specifically on children and youth, even though they form the largest population group in the developing world. Family and youth ministry, on the other hand, takes a more pastoral than missional approach and does not always take cognisance of contexts like globalisation. Thus, the purpose of the book is to address the sometimes unintended and unnoticed influence of globalisation on the mission of the church, with a specific focus on children, youth and family. For this purpose, the International Association for Mission Studies study group for children, youth and families, whose members come from different parts of the world, decided to describe the powers, inequalities and vulnerabilities of children, youth and families in a globalised world from their specific contexts. Although the most prominent research methodology was critical literature study, autoethnographic and empirical methods were also used. No single research method was prescribed for this publication. The publication can be viewed as both interdisciplinary and intra-disciplinary, because it deals with the social sciences, anthropology, psychology, missiology, systematic theology and practical theology.

    Seventh Biennial Report: June 2003 - March 2005

    No full text