186 research outputs found
DNN-based Speech Synthesis for Indian Languages from ASCII text
Text-to-Speech synthesis in Indian languages has seen a lot of progress over
the past decade, partly due to the annual Blizzard challenges. These systems assume
the text to be written in Devanagari or Dravidian scripts, which have nearly
phonemic orthographies. However, the most common form of computer
interaction among Indians is transliterated text written in ASCII. Such text is
generally noisy with many variations in spelling for the same word. In this
paper we evaluate three approaches to synthesize speech from such noisy ASCII
text: a naive Uni-Grapheme approach, a Multi-Grapheme approach, and a
supervised Grapheme-to-Phoneme (G2P) approach. These methods first convert the
ASCII text to a phonetic script, and then train a Deep Neural Network to
synthesize speech from it. We train and test our models on Blizzard Challenge
datasets that were transliterated to ASCII using crowdsourcing. Our experiments
on Hindi, Tamil and Telugu demonstrate that our models generate speech of
competitive quality from ASCII text compared to the speech synthesized from the
native scripts. All the accompanying transliterated datasets are released for
public access.
Comment: 6 pages, 5 figures. Accepted at the 9th ISCA Speech Synthesis Workshop
Unsupervised learning for text-to-speech synthesis
This thesis introduces a general method for incorporating the distributional analysis
of textual and linguistic objects into text-to-speech (TTS) conversion systems.
Conventional TTS conversion uses intermediate layers of representation to bridge
the gap between text and speech. Collecting the annotated data needed to produce
these intermediate layers is a far from trivial task, possibly prohibitively so
for languages in which no such resources are in existence. Distributional analysis,
in contrast, proceeds in an unsupervised manner, and so enables the creation of
systems using textual data that are not annotated. The method therefore aids
the building of systems for languages in which conventional linguistic resources
are scarce, but is not restricted to these languages.
The distributional analysis proposed here places the textual objects analysed
in a continuous-valued space, rather than specifying a hard categorisation of those
objects. This space is then partitioned during the training of acoustic models for
synthesis, so that the models generalise over objects' surface forms in a way that
is acoustically relevant.
The method is applied to three levels of textual analysis: to the characterisation
of sub-syllabic units, word units and utterances. Entire systems for three
languages (English, Finnish and Romanian) are built with no reliance on manually
labelled data or language-specific expertise. Results of a subjective evaluation
are presented
Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution
Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embedding
The neurobiology of cortical music representations
Music is undeniably one of humanity's defining traits, as it has been documented since the earliest
days of mankind, is present in all known cultures and is perceivable by all humans nearly alike.
Intrigued by its omnipresence, researchers of all disciplines started investigating music's
mystical relationship and tremendous significance to humankind several hundred
years ago. More recently, the immense advancement of neuroscientific methods
has also enabled the examination of cognitive processes related to the processing of music. Within
this neuroscience of music, the vast majority of research has focused on how music, as an auditory
stimulus, reaches the brain and how it is initially processed, as well as on the tremendous
effects it has on and can evoke through the human brain. However, the intermediate steps, that is,
how the human brain transforms incoming signals into a seemingly specialized
and abstract representation of music, have received less attention. Aiming to address this gap,
this thesis targeted these transformations, their possibly underlying processes
and how both could potentially be explained through computational models. To this end, four
projects were conducted. The first two comprised the creation and implementation of two
open source toolboxes to, first, tackle problems inherent to auditory neuroscience, and thus also affecting
neuroscientific music research, and second, provide the basis for further advancements
through standardization and automation. More precisely, these problems are deteriorated hearing
thresholds and abilities in MRI settings and the difficult localization and parcellation of the
human auditory cortex as the core structure involved in auditory processing. The third project
focused on the human brain's apparent tuning to music by investigating functional and organizational
principles of the auditory cortex and network with regard to the processing of different
auditory categories of comparable social importance, more precisely whether the perception of music
evokes a distinct and specialized pattern. In order to provide an in-depth characterization
of the respective patterns, both the segregation and integration of auditory cortex regions were
examined. In the fourth and final project, a highly multimodal approach that included fMRI,
EEG, behavior and models of varying complexity was utilized to evaluate how the aforementioned
music representations are generated along the cortical hierarchy of auditory processing
and how they are influenced by bottom-up and top-down processes. The results of projects 1
and 2 demonstrated the necessity for the further advancement of MRI settings and definition
of working models of the auditory cortex, as hearing thresholds and abilities seem to vary as
a function of the data acquisition protocol used, and the localization and parcellation of the
human auditory cortex diverge drastically depending on the underlying approach. Project 3
revealed that the human brain apparently is indeed tuned for music by means of a specialized
representation, as it evoked a bilateral network with a right hemispheric weight that was not
observed for the other included categories. The result of this specialized and hierarchical recruitment
of anterior and posterior auditory cortex regions was an abstract music component
that is situated in anterior regions of the superior temporal gyrus and preferably encodes music,
regardless of whether it is sung or instrumental. The outcomes of project 4 indicated that even though
the entire auditory cortex, again with a right hemispheric weight, is involved in the complex
processing of music in particular, anterior regions yielded an abstract representation that varied
extensively over time and could not be sufficiently explained by any of the tested models. The
specialized and abstract properties of this representation were furthermore underlined by the
predictive ability of the tested models, as models that were either based on high level features
such as behavioral representations and concepts or complex acoustic features always outperformed
models based on single or simpler acoustic features. Additionally, factors known to influence
auditory and thus music processing, such as musical training, apparently did not alter the
observed representations. Together, the results of the projects suggest that the specialized and
stable cortical representation of music is the outcome of sophisticated transformations of incoming
sound signals along the cortical hierarchy of auditory processing that generate a music
component in anterior regions of the superior temporal gyrus by means of top-down processes
that interact with acoustic features, guiding their processing.
Music is undeniably one of the defining traits of humankind. Documented
since the earliest days of humanity and present in all known cultures,
it is perceivable by all humans nearly alike. Fascinated by its omnipresence,
researchers of all disciplines began several hundred years ago to investigate the mystical
relationship between music and humans, as well as its enormous significance for them.
For a comparatively short time, the immense progress
of neuroscientific methods has also made it possible to investigate the cognitive processes involved in the
processing of music. Within this neuroscience of
music, the majority of research has concentrated on how music, as an auditory
stimulus, reaches the human brain and how it is initially processed, as well as on
the colossal effects it has on the brain and can evoke through it. However,
the intermediate steps, that is, how the human brain transforms incoming signals into a seemingly
specialized and abstract representation of music, have received comparatively little attention.
To address this gap, this
dissertation investigated these processes, and how they can be explained by models, in
four projects. The first two projects comprised the creation and implementation
of two toolboxes in order to, first, alleviate problems inherent to auditory neuroscience,
and hence to neuroscientific investigations of music,
and second, create a basis for further progress through standardization and automation.
More precisely, this covered the strongly impaired hearing thresholds and
abilities in MRI examinations and the hampered localization and parcellation of the
human auditory cortex as the core structure of auditory processing. The third project
dealt with the apparent specialization for music in the human brain
by investigating functional and organizational principles of the auditory
cortex and network with regard to the processing of different auditory categories of comparable
social importance, more precisely whether the perception of music evokes a distinct
and specialized neural pattern. To enable a detailed characterization
of the respective neural patterns, the segregation and integration
of the regions of the auditory cortex were examined. In the fourth and final project,
a highly multimodal approach that included fMRI, EEG, behavior and models of varying complexity
was used to evaluate how the aforementioned representations of
music are generated along the cortical hierarchy of auditory processing and how they
may be influenced by bottom-up and top-down approaches. The results
of projects 1 and 2 demonstrated the necessity for further improvements of MRI examinations
and for the definition of a functional model of the auditory cortex, as hearing
thresholds and abilities varied strongly depending on the data acquisition protocols used,
and the localization and parcellation of the human auditory cortex
diverge drastically depending on the underlying approaches. Project 3 showed that the
human brain indeed contains a specialized representation of music, as music
was the only auditory category to evoke a bilateral network with a right-hemispheric weighting.
This network, which involved the recruitment of anterior and posterior
parts of the auditory cortex, resulted in a seemingly abstract representation
of music in anterior regions of the superior temporal gyrus, which preferentially encodes music,
regardless of whether it is sung or instrumental. The results of project 4 indicate
that the entire auditory cortex, again with a right-hemispheric weighting, is involved in
the complex processing of music, but that anterior regions in particular give rise to the
aforementioned abstract representation, which changes excessively over the duration
of perception and cannot be sufficiently explained by any of the tested models.
The specialized and abstract properties of these representations were
further underlined by the predictive abilities of the tested models, as models
based either on higher-level properties such as behavioral representations and mental
concepts or on complex acoustic properties always outperformed models
based on lower-level attributes such as simple acoustic properties. In addition,
no effect could be demonstrated for factors, such as musical training, that are known
to influence auditory and hence music processing.
In summary, the results of the projects suggest that the specialized and
stable cortical representation of music is the result of complex processes that transform incoming
signals along the cortical hierarchy of auditory processing into an abstract
representation of music within anterior regions of the superior temporal gyrus by means of
top-down processes that interact with acoustic properties and guide their
processing
Powers, inequalities and vulnerabilities
This research addresses the gap that is present in both missiology and family and youth ministry. Missiology does not focus on children and youth specifically, although they form the largest population group in the developing world. On the other hand, family and youth ministry has a more pastoral than missional approach, not always taking cognisance of contexts like globalisation. Thus, the purpose of the book is to address the sometimes unintended and unnoticed influence of globalisation on the mission of the church, with a specific focus on children, youth and family. For this purpose, the International Association for Mission Studies study group for children, youth and families, whose members come from different parts of the world, decided to describe the powers, inequalities and vulnerabilities of children, youth and families in a globalised world from their specific contexts. Although the most prominent research methodology was the critical literature study, autoethnographic and empirical methods were also used. No specific research method was prescribed for this publication. This publication can be viewed as interdisciplinary and intra-disciplinary, because it deals with social sciences, anthropology, psychology, missiology, systematic theology and practical theology