128 research outputs found

    The role of auditory perceptual gestalts on the processing of phrase structure

    Hierarchical centre embeddings (HCEs) in natural language have been taken as evidence that language is not processed as a finite state system (Chomsky, 1957). While phrase structure may be necessary to produce HCEs, finite state, sequential processing may underlie their comprehension (Frank, Bod, & Christiansen, 2012). Under this account, listeners employ surface level cues (e.g. semantic content) to determine the dependencies within an utterance, instead of processing the words in a hierarchy. The acoustic structure of speech reflects the speaker’s syntactic representation during production (Cooper, Paccia & Lapointe, 1978). In comprehension, temporal (Snedeker & Trueswell, 2003) and pitch (Watson, Tanenhaus, & Gunlogson, 2008) cues rapidly influence processing. Therefore, temporal and pitch variation in speech could contain cues to dependencies. We examine whether grouping behaviour may be driven by Gestalt principles. Temporal proximity suggests that individuals group sequential words that occur closer together in time. Pitch similarity states that individuals group sequential words that are similar in pitch. In this thesis, I examine whether these Gestalts support dependency detection in speech, providing a mechanism through which hierarchical structure can be processed non-hierarchically. In Chapter 3, we assessed whether temporal proximity and pitch similarity explicitly relate to the structure of a corpus of spontaneously produced active and passive relative clauses. This was the case for actives; the embedded clause was preceded by a lengthened pause and a large pitch reduction. For passives, a longer pause and pitch reduction occurred after the verb-phrase of the embedded clause, counter to prediction. The results for actives suggest that temporal proximity and pitch similarity cues could be used to group the phrases of the embedded clause, obviating the need to process hierarchically structured speech hierarchically. 
Two artificial grammar learning studies assessed whether pitch similarity and temporal proximity cues support the acquisition of phrase structure grammar. Chapter 4 emphasised temporal proximity cues, while Chapter 5 emphasised pitch similarity cues. In Chapter 5, pitch similarity cues improved classification performance for structures with two levels of embedding. In both studies, participants did not benefit from temporal proximity cues. However, the results of a cross-species meta-analysis of artificial grammar learning studies (Chapter 2) raised the possibility that reflection-based measures (e.g. grammaticality judgements) are not well suited for assessing processing-based learning, such as online speech processing (Christiansen, 2018). Properly assessing the role of Gestalt cues in speech processing therefore requires processing-based measures. To assess the influence of auditory Gestalts on online speech processing, in Chapter 6 we analysed participants’ gaze behaviour in response to pitch similarity and temporal proximity cues using the visual world paradigm. Participants heard speech-synthesised active-object and passive relative clauses whilst viewing four potential targets. Each sentence had a prosodic structure consistent with either syntactic form (Chapter 3), or one of two control prosodic structures. Pitch similarity results indicated that these cues facilitated processing. Temporal proximity cues consistent with syntactic structure did not facilitate processing; instead, the results suggested a general benefit of increased processing time. Overall, these studies suggest that participants can use the pitch similarity Gestalt to group together syntactically dependent phrases in hierarchical speech, offering a mechanism through which individuals could process hierarchical structures non-hierarchically. The results of Chapters 4, 5, and 6 suggest temporal proximity cues did not facilitate performance to the same extent. 
Thus, we suggest that unfilled pauses in isolation may be insufficient to facilitate groupings on the basis of temporal proximity.

    Working memory capacity and L2 speech production: an exploration study

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão. This study investigates whether there is a relationship between working memory capacity and L2 oral production, and whether this relationship is specific to the speech production task or general in nature, independent of the task being performed. The participants were 13 learners of English as a second language at the University of Minnesota. Working memory capacity was measured with the speaking span test (Daneman, 1991) and the operation-word span test (Turner & Engle, 1989), both administered in English. Two tasks were used to elicit L2 oral production: a picture description and a narrative. Four aspects of oral production were measured: fluency, accuracy, complexity, and lexical density. Statistical analyses show that working memory capacity, as measured by the speaking span test, correlates positively with fluency, accuracy, and complexity and negatively with lexical density, in both tasks. The analyses also reveal that the speaking span test can predict L2 oral performance in fluency, accuracy, and grammatical complexity, partially explaining differences in performance in these aspects. The analyses further reveal a tendency toward an interaction between pauses and hesitations, and among fluency, accuracy, complexity, and lexical density, during L2 oral production. Finally, the analyses show that the operation-word span test suffered a methodological error in its administration, thus compromising the data generated by the test. Consequently, this study does not present adequate data to determine whether the relationship between working memory capacity and L2 oral production is specific to the task in question or general in character. 
To explain the relationship between working memory capacity, as measured by the speaking span test, and L2 oral production, it is proposed that grammatical encoding is a complex sub-task in the hierarchical process of speech production that requires the control and regulation of attention.

    A computational memory and processing model for prosody

    Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts & Sciences, 1999. Includes bibliographical references (p. 209-226). This thesis links processing in working memory to prosody in speech, and links different working memory capacities to different prosodic styles. It provides a causal account of prosodic differences and an architecture for reproducing them in synthesized speech. The implemented system mediates text-based information through a model of attention and working memory. The main simulation parameter of the memory model quantifies recall. Changing its value changes what counts as given and new information in a text, and therefore determines the intonation with which the text is uttered. Other aspects of search and storage in the memory model are mapped to the remainder of the continuous and categorical features of pitch and timing, producing prosody in three different styles: for small recall values, the exaggerated and sing-song melodies of children's speech; for mid-range values, an adult expressive style; for the largest values, the prosody of a speaker who is familiar with the text, and at times sounds bored or irritated. In addition, because the storage procedure is stochastic, the prosody varies from simulation to simulation, even for identical control parameters. As with human speech, no two renditions are alike. Informal feedback indicates that the stylistic differences are recognizable and that the prosody is improved over current offerings. A comparison with natural data shows clear and predictable trends, although not at significance. However, a comparison within the natural data also did not produce results at significance. One practical contribution of this work is a text mark-up schema consisting of relational annotations to grammatical structures. Another is the product: varied and plausible prosody in synthesized speech. 
The main theoretical contribution is to show that resource-bound cognitive activity has prosodic correlates, thus providing a rationale for the individual and stylistic differences in melody and rhythm that are ubiquitous in human speech. By Janet Elizabeth Cahn. Ph.D.

    Reevaluating the Test Specifications of an Oral Proficiency Test


    The prosodic design of Modern Standard Arabic political monologues

    The aim of this study is to describe and understand the prosodic design of Modern Standard Arabic (MSA) political monologues. To work towards this aim, we compare two political monologues produced by the same speaker with a broadcast news reading produced by a news announcer. Through comparison of political monologues and broadcast news reading, we highlight linguistic strategies which could be used in any genre of speech, and also what we argue to be persuasive strategies which contribute to the political work of persuasion. We rely on a combination of prosodic, syntactic, and discourse (semantic) evidence to account for linguistic strategies, and on a similar combination of prosodic, syntactic, and discourse (semantics and pragmatics) evidence to account for persuasive strategies, but our primary contribution is highlighting the use of prosody as a persuasive political strategy. A further contribution of this work to the field of knowledge is the elaboration of a set of fine-grained prosodic, syntactic, and discourse structures proposed for broadcast MSA monologues. The prosodic, syntactic, and discourse structures are first labelled independently according to a set of criteria (set out in Chapter 4 Methods). Then, we triangulate the results of labelling the prosodic, syntactic, and discourse structures independently, in Chapters 5-6 leading up to Chapter 7 where the major contribution of this work is highlighted, that is, the use of prosody as a persuasive strategy. The main argument in this work is structured in this gradual way because of the way the process of segmentation is carried out on all three data samples. The process of segmentation starts with identification of abstract forms, and then associates functions to these abstract forms based on detailed explanations of specific linguistic phenomena drawn from the process of triangulation. 
Therefore, the methodology implemented for broadcast MSA, which can also serve as a methodology for analysing MSA political monologues, is an integral and essential part of the main argument in this thesis.

    The London–Lund corpus of spoken English : Description and research
