36 research outputs found

    Ein rhythmisch-prosodisches Modell lyrischen Sprechstils

    English title: A rhythmic-prosodic model of poetic speech. Everyone has an intuitive understanding of the rhythm of speech, because speakers always observe its underlying principles, even if unconsciously. Speech rhythm serves to structure and to highlight on all linguistic levels: it marks syllable and word boundaries as well as word stress, and it rhythmically structures syntactic phrases and semantically coherent units.
Without this rhythmic structuring, speech perception would surely be considerably harder. Besides the direct relationship between rhythm and linguistic units, there are situation-specific rhythms tied to a particular speaking style. Rapid speech presumably has a different rhythm than slow speech, and a sermon a different rhythm than a football commentary. What role speech rhythm plays in detail on the individual linguistic and paralinguistic levels, however, is still largely unclear. There is fairly broad agreement that language-specific rhythmic differences also exist; more precisely, languages are assumed to fall into so-called stress-timed and syllable-timed languages. Speech rhythm also matters for speech synthesis, i.e. the speaking computer. The most common application in this field is the conversion of text into a corresponding speech signal (text-to-speech, TTS). Even the newer corpus-based synthesis systems depend on sophisticated prosody prediction, which in most cases still needs improvement. Unfortunately, TTS systems generally target prose exclusively, so special domains such as poetry are disregarded. The present work is motivated in part by the fact that the foundations for extending the text domain of TTS systems to poetry with a fixed metre are missing and still have to be created. The prominent rhythm of poetry with a fixed metre is to be captured in two duration models. It is assumed that conclusions can be drawn from the rhythmicity of German poetry about the German language in general. In this work, three perception experiments examined the predictive power of two rhythm prediction models based on the rhythmicity of the poetic speaking style.
For this purpose, a corpus of read poetry comprising 10 poems and covering the four metres iamb, trochee, dactyl and song form was recorded and analysed, and the durations of the four metres were modelled. Actors used a much livelier prosody than lay speakers. The perception experiments showed that students of phonetics and amateur musicians were able to distinguish the speaking style of poetry from that of prose on the basis of delexicalised and monotonised stimuli. Despite the reduced stimuli, musicians were in some cases even able to perceptually distinguish the four metres from one another. These experiments could not demonstrate that isochrony, which was also modelled, is a perceptually relevant concept.

    The Phonetic Realization of Narrow Focus in English L1 and L2. Data from Production and Perception

    The typological differences between English and Italian are reflected in the strategies each language adopts to mark sentence-level prominence. While English marks focus by modulating prosodic parameters (namely pitch, duration and intensity), Italian normally resorts to word order strategies, benefitting from the freer word order admitted by its syntax. This study investigates the acquisition of the prosodic marking of narrow non-contrastive focus by Italian speakers of English L2. Its main aims were: (a) to determine and compare the prosodic cues used by English native speakers and Italian speakers of English L2 when marking narrow focus; (b) to verify whether the Italian speakers are able to acquire the English prosodic strategies for focus marking as a function of their competence in English, progressively abandoning the focus marking strategies that characterize their L1 in favor of more native-like solutions; (c) to investigate the phenomenon not only at the production level, but also from the point of view of perception. Consequently, this work consists of a production study and a perception study. The production study consisted of the acoustic analysis of native and non-native productions. The speech data were collected using a semi-spontaneous method, in which speakers recorded a set of short sentences as replies to wh- questions, with the aim of eliciting sentences with narrow focus on the subject or on the verb. Three groups of speakers were recorded: English native speakers (NS), Italian native speakers with a higher competence in English L2 (NNS1), and Italian native speakers with a lower competence in English L2 (NNS2). A similar set of Italian L1 sentences was also elicited from the Italian speakers. The acoustic analysis was performed at sentence and word level, and it was mainly based on the measurement of fundamental frequency and duration. The results confirmed that English native speakers mark narrow focus mainly by modulating pitch.
NNS1 showed progress towards the target model by making active use of pitch, although not perfectly matching the native pattern. NNS2, finally, were not able to mark focus through prosodic parameters. The analysis of the Italian L1 data set suggested that in Italian narrow non-contrastive focus is not marked prosodically. Not even duration, which in Italian is the prosodic cue normally used to mark prominence at word level, seems to play a role in signaling prominence at sentence level. The perception study was designed to verify whether the differences shown by the acoustic measurements could also have an impact on the listeners' perception. Two perception tests were designed, based on a two-alternative forced-choice paradigm, in which listeners were asked to identify narrow focus by guessing the wh- question that had triggered each sentence. Experiment 1 presented natural sentences to two groups of listeners: 22 British native speakers and 22 Italian native listeners. The Italian native listeners were also presented with an extra set of stimuli, consisting of the Italian L1 data set. The results of Experiment 1 showed that English native listeners could correctly identify narrow focus even without extra contextual information. This held for the productions by NS and NNS1, whereas the listeners could not recognize focus in the productions by NNS2. The Italian listeners could also detect focus well above chance level in the productions by NS. However, they failed to identify focus in the productions by NNS1 and NNS2. As for the Italian L1 data set, the Italian listeners failed to distinguish narrow focus, providing perceptual evidence for the hypothesis that Italians do not mark narrow focus by prosody. Experiment 2 was designed to investigate the effect of the differences in pitch modulation on the correct detection of narrow focus by English native listeners. In this case, the productions of the speakers were acoustically manipulated.
The participants were 20 British English native speakers. In general, the results of Experiment 2 confirmed that pitch plays an important role in the recognition of narrow focus from the perceptual point of view as well. This is particularly true for NS productions, while the listeners could not successfully identify focus in the modified non-native productions. The results of the production study and the perception study converged in showing that in English pitch plays an important role in the production and perception of narrow non-contrastive focus. As for non-native productions, NNS1 could approach the native model to a certain extent by modulating F0. From the perceptual point of view, their productions were effective enough to be successfully understood by English native listeners. In contrast, NNS2 had not managed to adopt the strategies of English, showing a poor prosodic characterization of the constituent in focus. As a consequence, the listeners could not identify focus in the NNS2 productions. These findings are of interest not only for research in L2 phonetics, but also for their implications for language instruction, where prosody has only recently started to be studied and taught with renewed interest and momentum.
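    The abstract reports detection "well above chance level" in a two-alternative forced-choice task, where guessing yields 50% correct. One common way to check such a claim, sketched here under assumptions, is a one-sided exact binomial test. The function name and the response counts below are hypothetical illustrations; the abstract does not state which statistical test or which counts the study used.

```python
from math import comb


def p_above_chance(correct: int, trials: int, chance: float = 0.5) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if listeners only guess.

    `chance` is 0.5 for a two-alternative forced-choice task.
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical counts for illustration only (not taken from the study).
print(p_above_chance(30, 40))  # many correct answers: small p-value
print(p_above_chance(20, 40))  # exactly half correct: no evidence against guessing
```

    A small p-value indicates that the observed number of correct identifications would be unlikely under pure guessing.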

    Prosody analysis and modeling for Cantonese text-to-speech.

    Li Yu Jia. Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references. Abstracts in English and Chinese.

    Contents:
    Chapter 1. Introduction
      1.1. TTS Technology
      1.2. Prosody
        1.2.1. What is Prosody
        1.2.2. Prosody from Different Perspectives
        1.2.3. Acoustical Parameters of Prosody
        1.2.4. Prosody in TTS
          1.2.4.1 Analysis
          1.2.4.2 Modeling
          1.2.4.3 Evaluation
      1.3. Thesis Objectives
      1.4. Thesis Outline
      Reference
    Chapter 2. Cantonese
      2.1. The Cantonese Dialect
        2.1.1. Phonology
          2.1.1.1 Initial
          2.1.1.2 Final
          2.1.1.3 Tone
        2.1.2. Phonological Constraints
      2.2. Tones in Cantonese
        2.2.1. Tone System
        2.2.2. Linguistic Significance
        2.2.3. Acoustical Realization
      2.3. Prosodic Variation in Continuous Cantonese Speech
      2.4. Cantonese Speech Corpus - CUProsody
      Reference
    Chapter 3. F0 Normalization
      3.1. F0 in Speech Production
      3.2. F0 Extraction
      3.3. Duration-normalized Tone Contour
      3.4. F0 Normalization
        3.4.1. Necessity and Motivation
        3.4.2. F0 Normalization
          3.4.2.1 Methodology
          3.4.2.2 Assumptions
          3.4.2.3 Estimation of Relative Tone Ratios
          3.4.2.4 Derivation of Phrase Curve
          3.4.2.5 Normalization of Absolute F0 Values
        3.4.3. Experiments and Discussion
      3.5. Conclusions
      Reference
    Chapter 4. Acoustical F0 Analysis
      4.1. Methodology of F0 Analysis
        4.1.1. Analysis-by-Synthesis
        4.1.2. Acoustical Analysis
      4.2. Acoustical F0 Analysis for Cantonese
        4.2.1. Analysis of Phrase Curves
        4.2.2. Analysis of Tone Contours
          4.2.2.1 Context-independent Single-tone Contours
          4.2.2.2 Contextual Variation
          4.2.2.3 Co-articulated Tone Contours of Disyllabic Word
          4.2.2.4 Cross-word Contours
          4.2.2.5 Phrase-initial Tone Contours
      4.3. Summary
      Reference
    Chapter 5. Prosody Modeling for Cantonese Text-to-Speech
      5.1. Parametric Model and Non-parametric Model
      5.2. Cantonese Text-to-Speech: Baseline System
        5.2.1. Sub-syllable Unit
        5.2.2. Text Analysis Module
        5.2.3. Acoustical Synthesis
        5.2.4. Prosody Module
      5.3. Enhanced Prosody Model
        5.3.1. Modeling Tone Contours
          5.3.1.1 Word-level F0 Contours
          5.3.1.2 Phrase-initial Tone Contours
          5.3.1.3 Tone Contours at Word Boundary
        5.3.2. Modeling Phrase Curves
        5.3.3. Generation of Continuous F0 Contours
      5.4. Summary
      Reference
    Chapter 6. Performance Evaluation
      6.1. Introduction to Perceptual Test
        6.1.1. Aspects of Evaluation
        6.1.2. Methods of Judgment Test
        6.1.3. Problems in Perceptual Test
      6.2. Perceptual Tests for Cantonese TTS
        6.2.1. Intelligibility Tests
          6.2.1.1 Method
          6.2.1.2 Results
          6.2.1.3 Analysis
        6.2.2. Naturalness Tests
          6.2.2.1 Word-level
            6.2.2.1.1 Method
            6.2.2.1.2 Results
            6.2.2.1.3 Analysis
          6.2.2.2 Sentence-level
            6.2.2.2.1 Method
            6.2.2.2.2 Results
            6.2.2.2.3 Analysis
      6.3. Conclusions
      6.4. Summary
      Reference
    Chapter 7. Conclusions and Future Work
      7.1. Conclusions
      7.2. Suggested Future Work
    Appendix
      Appendix 1 Linear Regression
      Appendix 2 36 Templates of Cross-word Contours
      Appendix 3 Word List for Word-level Tests
      Appendix 4 Syllable Occurrence in Word List of Intelligibility Test
      Appendix 5 Wrongly Identified Word List
      Appendix 6 Confusion Matrix
      Appendix 7 Unintelligible Word List
      Appendix 8 Noisy Word List
      Appendix 9 Sentence List for Naturalness Test

    Data-Driven Enhancement of State Mapping-Based Cross-Lingual Speaker Adaptation

    The thesis work was motivated by the goal of developing personalized speech-to-speech translation and focused on one of its key component techniques: cross-lingual speaker adaptation for text-to-speech synthesis. A personalized speech-to-speech translator enables a person's spoken input to be translated into spoken output in another language while maintaining that person's voice identity. Before addressing any technical issues, the thesis set out to understand human perception of speaker identity. Listening tests were conducted to determine whether people could differentiate between speakers when they spoke different languages. The results demonstrated that differentiating between speakers across languages was an achievable task. However, it was difficult for listeners to differentiate between speakers across both languages and speech types (original recordings versus synthesized samples). The underlying challenge in cross-lingual speaker adaptation is how to apply speaker adaptation techniques when the language of the adaptation data differs from that of the synthesis models. The main body of the thesis was devoted to the analysis and improvement of HMM state mapping-based cross-lingual speaker adaptation. Firstly, the effect of unsupervised cross-lingual adaptation was investigated, as it relates to the application scenario of personalized speech-to-speech translation. The comparison of paired supervised and unsupervised systems shows that the performance of unsupervised cross-lingual speaker adaptation is comparable to that of supervised adaptation, even though the average phoneme error rate of the unsupervised systems is around 75%. Secondly, the effect of the language mismatch between synthesis models and adaptation data was investigated.
The mismatch is found to transfer undesirable language information from the adaptation data to the synthesis models, thereby limiting the effectiveness of generating multiple regression class-specific transforms, of using larger quantities of adaptation data and of estimating adaptation transforms iteratively. Thirdly, in order to tackle the problems caused by the language mismatch, a data-driven adaptation framework using phonological knowledge is proposed. Its basic idea is to group HMM states according to phonological knowledge in a data-driven manner and then to map each state to a phonologically consistent counterpart in a different language. This framework is also applied to regression class tree construction for transform estimation. The proposed framework is found to alleviate the negative effect of the language mismatch and to give consistent improvement over previous state-of-the-art approaches. Finally, a two-layer hierarchical transformation framework is developed, in which one layer captures speaker characteristics and the other compensates for the language mismatch. The most appropriate means of constructing the hierarchical arrangement of transforms was investigated in an initial study. While early results show some promise, further in-depth investigation is needed to confirm the validity of this hierarchy.
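    The idea of mapping each HMM state to a counterpart in another language can be illustrated with a small sketch. The abstract does not say which distance criterion is used, so the code below assumes, purely for illustration, that every state is a single diagonal-covariance Gaussian and that each source state is mapped to the target state with the smallest symmetric Kullback-Leibler divergence. The phonological grouping proposed in the thesis is not reproduced, and all names and numbers are hypothetical.

```python
from math import log


def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(a, b):
    """Symmetric KL divergence between two (mean, variance) pairs."""
    return kl_diag(a[0], a[1], b[0], b[1]) + kl_diag(b[0], b[1], a[0], a[1])


def map_states(source_states, target_states):
    """Map each source state name to the closest target state name."""
    return {
        s_name: min(
            target_states,
            key=lambda t_name: symmetric_kl(s_gauss, target_states[t_name]),
        )
        for s_name, s_gauss in source_states.items()
    }


# Toy one-dimensional states as (mean, variance); values are made up.
source = {"src_a": ([0.0], [1.0]), "src_b": ([5.0], [1.0])}
target = {"tgt_x": ([0.1], [1.0]), "tgt_y": ([4.8], [1.2])}
print(map_states(source, target))
```

    In a framework like the one described, the candidate set for each state would presumably be restricted to phonologically consistent states before the nearest one is chosen; that restriction is left out of this sketch.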

    Dynamic aspects of speech and intonation in Brazilian Portuguese

    Advisor: Plinio Almeida Barbosa. Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem. Degree: Doctorate in Linguistics.
    Abstract: This thesis explores the relationship between intonational patterns, speech rhythm and discourse within the dynamical systems research program. The study of these relationships was based on the Dynamic Model of Speech Rhythm proposed by Barbosa (2006), on the DaTo intonational annotation system proposed by Lucente (2008), and on the Computational Model of the Structure of Discourse proposed by Grosz & Sidner (1986). The Dynamic Model of Rhythm suggests that speech rhythm results from the action of two oscillators, one accentual and one syllabic, which take information from higher linguistic levels and a gestural score as input and produce gestural duration as output. The hypothesis of this thesis is that, in parallel with these oscillators, a glottal oscillator can act to control the intonation patterns of speech. These patterns, or intonational cycles, which organize the intonation of Brazilian Portuguese (BP), emerge when related to the segmentation of stretches of spontaneous discourse. For each stretch of speech classified as spontaneous according to a criterion proposed in this thesis, the discourse is segmented in the DaTo system into linguistically structured units, which carry the purposes of communicating and attracting attention. Each of these discourse segments aligns with an intonational pattern that begins with a rising contour (LH or >LH) and ends with a falling contour (LHL) or a low boundary level (L). Speech rhythm is also aligned with this pattern formed between intonation and discourse. Adding a layer to the DaTo system for segmenting utterances into stress groups made it possible to observe that the stress group segmentation and the annotation of intonational contours align and coincide with the boundaries of discourse units.
    The observed alignment between intonation, rhythm and discourse, with the stress groups as attractors, made it possible to propose the insertion of a glottal oscillator into the Dynamic Model of Rhythm.

    The vocalisations and behaviour of chickens in anticipation of rewards


    The role of auditory perceptual gestalts on the processing of phrase structure

    Hierarchical centre embeddings (HCEs) in natural language have been taken as evidence that language is not processed as a finite state system (Chomsky, 1957). While phrase structure may be necessary to produce HCEs, finite state, sequential processing may underlie their comprehension (Frank, Bod, & Christiansen, 2012). Under this account, listeners employ surface level cues (e.g. semantic content) to determine the dependencies within an utterance, instead of processing the words in a hierarchy. The acoustic structure of speech reflects the speaker’s syntactic representation during production (Cooper, Paccia & Lapointe, 1978). In comprehension, temporal (Snedeker & Trueswell, 2003) and pitch (Watson, Tanenhaus, & Gunlogson, 2008) cues rapidly influence processing. Therefore, temporal and pitch variation in speech could contain cues to dependencies. We examine whether grouping behaviour may be driven by Gestalt principles. Temporal proximity suggests that individuals group sequential words that occur closer together in time. Pitch similarity states that individuals group sequential words that are similar in pitch. In this thesis, I examine whether these Gestalts support dependency detection in speech, providing a mechanism through which hierarchical structure can be processed non-hierarchically. In Chapter 3, we assessed whether temporal proximity and pitch similarity explicitly relate to the structure of a corpus of spontaneously produced active and passive relative clauses. This was the case for actives; the embedded clause was preceded by a lengthened pause and a large pitch reduction. For passives, a longer pause and pitch reduction occurred after the verb-phrase of the embedded clause, counter to prediction. The results for actives suggest that temporal proximity and pitch similarity cues could be used to group the phrases of the embedded clause, obviating the need to process hierarchically structured speech hierarchically. 
Two artificial grammar learning studies assessed whether pitch similarity and temporal proximity cues support the acquisition of phrase structure grammar. Chapter 4 emphasised temporal proximity cues, while Chapter 5 emphasised pitch similarity cues. In Chapter 5, pitch similarity cues improved classification performance for structures with two levels of embedding. In neither study did participants benefit from temporal proximity cues. However, the results of a cross-species meta-analysis of artificial grammar learning studies (Chapter 2) raised the possibility that reflection-based measures (e.g. grammaticality judgements) are not well suited for assessing processing-based learning, such as online speech processing (Christiansen, 2018). Properly assessing the role of Gestalt cues in speech processing therefore requires processing-based measures. To assess the influence of auditory Gestalts on online speech processing, in Chapter 6 we analysed participants' gaze behaviour in response to pitch similarity and temporal proximity cues using the visual world paradigm. Participants heard speech-synthesised active-object and passive relative clauses whilst viewing four potential targets. Each sentence had a prosodic structure consistent with either syntactic form (Chapter 3), or one of two control prosodic structures. The pitch similarity results indicated that these cues facilitated processing. Temporal proximity cues consistent with syntactic structure did not facilitate processing; instead, the results suggested a general benefit of increased processing time. Overall, these studies suggest that participants can use the pitch similarity Gestalt to group together syntactically dependent phrases in hierarchical speech, offering a mechanism through which individuals could process hierarchical structures non-hierarchically. The results of Chapters 4, 5, and 6 suggest that temporal proximity cues did not facilitate performance to the same extent.
Thus, we suggest that unfilled pauses in isolation may be insufficient to facilitate groupings on the basis of temporal proximity.

    Writing, Medium, Machine: Modern Technographies

    Writing, Medium, Machine: Modern Technographies is a collection of thirteen essays by leading scholars which explores the mutual determination of forms of writing and forms of technology in modern literature. From a variety of historical and theoretical perspectives, the essays unfold the proposition that literature is not less but more mechanical than other forms of writing: a transfigurative ideal machine. The collection breaks new ground archaeologically, unearthing representations in literature and film of a whole range of decisive technologies, from the stereopticon through census and slot machines to the stock ticker, and from the Telex to the manipulation of genetic code and the screens which increasingly mediate our access to the world and to each other. It also contributes significantly to critical and cultural theory by investigating key concepts which articulate the relation between writing and technology: number, measure, encoding, encryption, the archive, the interface. Technography is not just a modern matter, a feature of texts that happen to arise in a world full of machinery and pay attention to that machinery in various ways. But the mediation of other machines has beyond doubt assisted literature to imagine and start to become the ideal machine it is always aspiring to be. Contributors: Ruth Abbott, John Attridge, Kasia Boddy, Mark Byron, Beci Carver, Steven Connor, Esther Leslie, Robbie Moore, Julian Murphet, James Purdon, Sean Pryor, Paul Sheehan, Kristen Treen.