649 research outputs found

    Parsing for prosody: What a text-to-speech system needs from syntax

    The authors describe an experimental text-to-speech system that uses a syntactic parser and prosody rules to determine prosodic phrasing for synthesized speech. It is shown that many aspects of sentence analysis that are required for other parsing applications, e.g., machine translation and question answering, become unnecessary in parsing for text-to-speech. It is possible to generate natural-sounding prosodic phrasing by relying on information about syntactic category type, partial constituency, and length; information about clausal and verb phrase constituency, predicate-argument relations, and prepositional phrase attachment can be bypassed.

    Neurophysiological correlates of musical and prosodic phrasing: shared processing mechanisms and effects of musical expertise

    The processing of prosodic phrase boundaries in language is immediately reflected by a specific event-related potential component called the Closure Positive Shift (CPS). A component somewhat reminiscent of the CPS in language has also been reported for musical phrases (i.e., the so-called ‘music-CPS’). However, in previous studies the quantification of the music-CPS as well as its morphology and timing differed substantially from the characteristics of the language-CPS. Therefore, the degree of correspondence between cognitive mechanisms of phrasing in music and in language has remained questionable. Here, we probed the shared nature of mechanisms underlying musical and prosodic phrasing by (1) investigating whether the music-CPS is present at phrase boundary positions where the language-CPS has been originally reported (i.e., at the onset of the pause between phrases), and (2) comparing the CPS in music and in language in non-musicians and professional musicians. For the first time, we report a positive shift at the onset of musical phrase boundaries that strongly resembles the language-CPS and argue that the post-boundary ‘music-CPS’ of previous studies may be an entirely distinct ERP component. Moreover, the language-CPS in musicians was found to be less prominent than in non-musicians, suggesting more efficient processing of prosodic phrases in language as a result of higher musical expertise.

    Perception of phrasal prosody in the acquisition of European Portuguese

    A central issue in language acquisition is the segmentation of speech into linguistic units and structures. This thesis examines the role played by phrasal prosody in speech segmentation in the acquisition of European Portuguese (EP), both in the processing of globally ambiguous sentences by 4- and 5-year-old children and in early word segmentation by 12-month-old infants. Past studies have shown that phrasal prosody is used by adults in ambiguity resolution, for example to disambiguate syntactically ambiguous sentences involving a low or high attachment interpretation of a given phrase (e.g., ‘Hide the rabbit with a cloth’). In a first exploratory experiment, and given previous unclear findings in the literature on European Portuguese, we investigated whether prosodic phrasing might guide speech chunking and interpretation of these globally ambiguous sentences by adult listeners. In an eye-tracking experiment, which also included a pointing task, we found that EP adult speakers were not able to use phrasal prosody to disambiguate the structures tested. Both the results from eye gaze and the pointing task indicated the presence of a high attachment preference in the language, regardless of phrasal prosody. These findings required a better understanding of adult interpretation of these utterances before a productive study could be conducted with young children. Building on the lessons learned from this exploratory study, we conducted two new experiments examining young children's (and adults') abilities to use prosody in a different sort of globally ambiguous utterance, where differences in phrasal prosody were triggered by the syntax-prosody interface and were part of the common, default prosody of the sentences (i.e., in compound word versus list reading structures, like ‘guarda-chuva e pato’, ‘umbrella and duck’, vs. ‘guarda, chuva e pato’, ‘guard, rain and duck’). 
An eye-tracking paradigm (along the lines of De Carvalho, Dautriche, & Christophe, 2016a) was used to monitor the use of phrasal prosody, namely the contrast between a Prosodic Word boundary (PW) in the compound word interpretation and an Intonational Phrase boundary (IP) in the list interpretation, during auditory sentence processing. An offline pointing task was also included. Results showed a clear developmental trend in the use of phrasal prosody to guide sentence interpretation, from a general inability at age 4 to a still developing ability at age 5, when local prosodic cues were still not enough and the support of distal cues was necessary to achieve disambiguation, unlike for adults. While the previous experiments investigated the ability to use prosody to constrain lexical and syntactic analysis, thus looking into the combination of lexical, syntactic and prosodic knowledge at a young age, in a final set of experiments we asked whether phrasal prosody is exploited by infants to chunk the speech signal into words, in the absence of prior lexical knowledge. Using a modified version of the visual habituation paradigm (Altvater-Mackensen & Mani, 2013), we tested 12-month-olds' use of phrasal prosody in early word segmentation beyond the utterance edge factor, by examining the effects of two prosodic boundaries in utterance-internal position, namely the IP boundary (in the absence of pause) and the PW boundary. Our findings showed that early segmentation abilities are constrained by phrasal prosody, since they crucially depended on the location of the target word in the prosodic structure of the utterance. Implications of the findings in this thesis were discussed in the context of prosodic differences across languages, taking advantage of the atypical combination of prosodic properties that characterizes EP.

    Computational Approaches to the Syntax–Prosody Interface: Using Prosody to Improve Parsing

    Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing, hence there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing. Chapter 2 examines the effect of syntactic phrase length on prosody in double center embedded sentences in French. Data collected in a previous study were reanalyzed using native speaker judgment and automatic methods (forced alignment). Results demonstrate prosodic splitting behavior similar to that in English, in contradiction to the original study’s findings. Chapter 3 presents a number of studies examining whether syntactic ambiguity can yield different prosodic patterns, allowing humans and/or computers to resolve the ambiguity. In an experimental study, humans disambiguated sentences with prepositional phrase (PP) attachment ambiguity with 49% accuracy when presented as text, and 63% when presented as audio. Machine learning on the same data yielded an accuracy of 63-73%. A corpus study on the Switchboard corpus used both prosodic breaks and phrase lengths to predict the attachment, with an accuracy of 63.5% for PP-attachment sentences, and 71.2% for relative clause attachment. Chapter 4 aims to identify aspects of syntax that relate to prosody and use these in combination with prosodic cues to improve parsing. The aspects identified (dependency configurations) are based on dependency structure, reflecting the relative head location of two consecutive words, and are used as syntactic features in an ensemble system based on Recurrent Neural Networks, to score parse hypotheses and select the most likely parse for a given sentence. 
Using syntactic features alone, the system achieved an improvement of 1.1% absolute in Unlabelled Attachment Score (UAS) on the test set, above the best parser in the ensemble, while using syntactic features combined with prosodic features (pauses and normalized duration) led to a further improvement of 0.4% absolute. The results achieved demonstrate the relationship between syntax, syntactic phrase length, and prosody, and indicate the ability and future potential of prosody to resolve ambiguity and improve parsing.

    Weighted error minimization in assigning prosodic structure for synthetic speech

    Is cue-based memory retrieval 'good-enough'?: Agreement, comprehension, and implicit prosody in native and bilingual speakers of English

    This dissertation focuses on structural and prosodic effects during reading, examining their influence on agreement processing and comprehension in native English (L1) and Spanish-English bilingual (L2) speakers. I consolidate research from three distinct areas of inquiry (cognitive processing models, development of reading fluency, and L1/L2 processing strategies) and outline a cohesive and comprehensive processing model that can be applied to speakers regardless of language profile. This model is characterized by three critical components: a cognitive model of memory retrieval, a processing paradigm that outlines how resources may be deployed online, and the role of factors such as prosody in parsing decisions. The general framework of this integrated 'Good-enough Cue' (GC) model assumes the 'Good-Enough' Hypothesis and cue-based memory retrieval as central aspects. The 'Good-Enough' Hypothesis states that all speakers have access to two processing routes: a complete syntactic route, and a 'good enough' heuristic route (Ferreira, Bailey, & Ferraro, 2002; Ferreira, 2003). In the interest of conserving resources, speakers tend to rely more on heuristics and templates whenever the task allows, and may be required to rely on this fallback route when task demand is high. In the proposed GC model, cue-based memory retrieval (CBMR) is the instantiation of the complete syntactic route for agreement and long-distance dependencies in particular (Lewis & Vasishth, 2005; Wagers, Lau, & Phillips, 2009; Wagers, 2008). When retrieval fails using CBMR (due to cue overlap, memory trace decay, or some other factor), comprehenders may compensate by applying a 'good-enough' processing heuristic, which prioritizes general comprehension over detailed syntactic computation. 
Prosody (or implicit prosody) may reduce processing load by either facilitating syntactic processing or otherwise assisting memory retrieval, thus reducing reliance on the good-enough fallback route. This investigation explores how text presentation format interacts with these algorithmic versus heuristic processing strategies. Most specifically, it measures whether the presentation format of text affects readers' comprehension and ability to detect subject-verb agreement errors in simple and complex relative clause constructions. The experimental design manipulated text presentation to influence implicit prosody, using sentences designed to induce subject-verb agreement attraction errors. Materials included simple and embedded relative clauses with head nouns and verbs that were either matched or mismatched for number. Participants read items in one of three presentation formats: a) whole sentence, b) word-by-word, or c) phrase-by-phrase, and rated each item for grammaticality and responded to a comprehension probe. Results indicate that while overall comprehension is typically prioritized over grammatical processing (following the 'Good-Enough' Hypothesis), the effects of presentation format differ by group and by processing measure. For the L1 participants, facilitating the projection of phrasal prosody (phrase-by-phrase presentation) onto text enhances performance in syntactic and grammatical processing, while disrupting it via a word-by-word presentation decreases comprehension accuracy. For the L2 participants, however, phrase-by-phrase presentation is not significantly beneficial for grammatical processing, and even results in a decrease in comprehension accuracy. 
These differences provide insight into the interaction of cognitive task load, processing strategy selection, and the role of implicit prosody in reading fluency, building toward a comprehensive processing model for speakers of varying language profiles and proficiencies.

    A Quantitative Comparative Study of Prosodic and Discourse Units, the Case of French and Taiwan Mandarin

    No abstract