Development of audiovisual comprehension skills in prelingually deaf children with cochlear implants
Objective: The present study investigated the development of audiovisual comprehension skills in prelingually deaf children who received cochlear implants.
Design: We analyzed results obtained with the Common Phrases (Robbins et al., 1995) test of sentence comprehension from 80 prelingually deaf children with cochlear implants who were enrolled in a longitudinal study, from pre-implantation to 5 years after implantation.
Results: The results revealed that prelingually deaf children with cochlear implants performed better under audiovisual (AV) presentation compared with auditory-alone (A-alone) or visual-alone (V-alone) conditions. AV sentence comprehension skills were found to be strongly correlated with several clinical outcome measures of speech perception, speech intelligibility, and language. Finally, pre-implantation V-alone performance on the Common Phrases test was strongly correlated with 3-year postimplantation performance on clinical outcome measures of speech perception, speech intelligibility, and language skills.
Conclusions: The results suggest that lipreading skills and AV speech perception reflect a common source of variance associated with the development of phonological processing skills that is shared among a wide range of speech and language outcome measures.
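The correlations reported above are presumably Pearson product-moment coefficients between pre-implant scores and later outcome measures. As a reminder of the computation, here is a minimal pure-Python sketch; the data are invented for illustration, not the study's scores.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical pre-implant V-alone scores vs 3-year language scores.
v_alone = [10, 25, 40, 55, 70]
language = [32, 45, 61, 70, 88]
print(round(pearson_r(v_alone, language), 3))  # → 0.996
```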
The listening talker: A review of human and algorithmic context-induced modifications of speech
Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.
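One of the simplest talker-inspired modifications in this literature is a change in spectral tilt, with energy redistributed toward higher frequencies as in Lombard speech. The sketch below is not from the review; it is a minimal pure-Python illustration of that idea using a first-order pre-emphasis filter, with an illustrative coefficient.

```python
# Minimal sketch (not the review's algorithm): first-order pre-emphasis,
# y[n] = x[n] - a * x[n-1], a crude analogue of the flattened spectral
# tilt observed in Lombard speech. The coefficient 0.97 is illustrative.

def pre_emphasis(samples, a=0.97):
    """Boost high frequencies of a mono signal (list of floats)."""
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - a * samples[n - 1])
    return out

# A constant (purely low-frequency) signal is almost entirely removed:
# only the first sample passes, the rest shrink to 1 - a = 0.03.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
```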
Developmental and cultural factors of audiovisual speech perception in noise
The aim of this project is two-fold: 1) to investigate developmental differences in intelligibility gains from visual cues in speech perception in noise, and 2) to examine how different types of maskers modulate visual enhancement across age groups. A secondary aim of this project is to investigate whether or not bilingualism differentially modulates audiovisual integration during speech-in-noise tasks. To that end, both child and adult, monolingual and bilingual participants completed speech perception in noise tasks across three within-subject variables: (1) masker type: pink noise or two-talker babble; (2) modality: audio-only (AO) and audiovisual (AV); and (3) signal-to-noise ratio (SNR): 0 dB, -4 dB, -8 dB, -12 dB, and -16 dB. The findings revealed that, although both children and adults benefited from visual cues in speech-in-noise tasks, adults showed greater benefit at lower SNRs. Moreover, although child monolingual and bilingual participants performed comparably across all conditions, monolingual adults outperformed simultaneous bilingual adult participants. These results may indicate that the divergent use of visual cues in speech perception between bilingual and monolingual speakers occurs later in development.
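Mixing a target and a masker at a prescribed SNR, as in the 0 to -16 dB conditions above, amounts to scaling the masker so that 10*log10(P_target/P_masker) hits the desired value. A minimal sketch; the function name and signals are mine, not the study's materials.

```python
import math

def mix_at_snr(target, masker, snr_db):
    """Return target + scaled masker, with SNR(dB) = 10*log10(Pt/Pm)."""
    p_t = sum(x * x for x in target) / len(target)   # target power
    p_m = sum(x * x for x in masker) / len(masker)   # masker power
    gain = math.sqrt(p_t / (p_m * 10 ** (snr_db / 10)))
    return [t + gain * m for t, m in zip(target, masker)]

# Verify the achieved SNR for one of the study's conditions (-8 dB).
target = [math.sin(0.1 * n) for n in range(1000)]
masker = [math.sin(0.37 * n) for n in range(1000)]
mixed = mix_at_snr(target, masker, -8)
scaled = [m - t for m, t in zip(mixed, target)]
power = lambda s: sum(x * x for x in s) / len(s)
print(round(10 * math.log10(power(target) / power(scaled)), 6))  # → -8.0
```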
Seeing a talking face matters to infants, children and adults: behavioural and neurophysiological studies
Everyday conversations typically occur face-to-face. Over and above auditory information, visual information from a speaker's face (e.g., lips, eyebrows) contributes to speech perception and comprehension. The facilitation that visual speech cues bring, termed the visual speech benefit, is experienced by infants, children and adults. Even so, studies on speech perception have largely focused on auditory-only speech, leaving a relative paucity of research on the visual speech benefit. Central to this thesis are the behavioural and neurophysiological manifestations of the visual speech benefit. As the visual speech benefit assumes that a listener is attending to a speaker's talking face, the investigations are conducted in relation to the possible modulating effects of gaze behaviour. Three investigations were conducted. Collectively, these studies demonstrate that visual speech information facilitates speech perception, and this has implications for individuals who do not have clear access to the auditory speech signal. The results, for instance the enhancement of 5-month-olds' cortical tracking by visual speech cues, and the effect of idiosyncratic differences in gaze behaviour on speech processing, expand knowledge of auditory-visual speech processing and provide firm bases for new directions in this burgeoning and important area of research.
Unsupervised syntactic chunking with acoustic cues: Computational models for prosodic bootstrapping
Learning to group words into phrases without supervision is a hard task for NLP systems, but infants routinely accomplish it. We hypothesize that infants use acoustic cues to prosody, which NLP systems typically ignore. To evaluate the utility of prosodic information for phrase discovery, we present an HMM-based unsupervised chunker that learns from only transcribed words and raw acoustic correlates to prosody. Unlike previous work on unsupervised parsing and chunking, we use neither gold standard part-of-speech tags nor punctuation in the input. Evaluated on the Switchboard corpus, our model outperforms several baselines that exploit either lexical or prosodic information alone, and, despite producing a flat structure, performs competitively with a state-of-the-art unsupervised lexicalized parser, with a substantial advantage in precision. Our results support the hypothesis that acoustic-prosodic cues provide useful evidence about syntactic phrases for language-learning infants.
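The chunker described above is an HMM decoded with dynamic programming. The toy sketch below is not the paper's model; it only illustrates how a Viterbi decoder over begin/continue chunk tags can exploit one acoustic-prosodic cue (the pause before a word) alongside the word sequence. All probabilities are invented for illustration.

```python
import math

# States: "B" begins a new chunk, "I" continues the current one.
STATES = ("B", "I")
LOG_TRANS = {("B", "B"): math.log(0.3), ("B", "I"): math.log(0.7),
             ("I", "B"): math.log(0.4), ("I", "I"): math.log(0.6)}

def log_emit(state, obs):
    _, pause_ms = obs
    # A long pre-word pause is more likely at a chunk boundary.
    return math.log(0.8 if (pause_ms > 100) == (state == "B") else 0.2)

def viterbi(observations):
    """Most likely B/I tag sequence; sequences must start with 'B'."""
    v = [{s: log_emit(s, observations[0]) + (0.0 if s == "B" else -1e9)
          for s in STATES}]
    back = []
    for obs in observations[1:]:
        col, ptr = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda q: v[-1][q] + LOG_TRANS[(q, s)])
            col[s] = v[-1][prev] + LOG_TRANS[(prev, s)] + log_emit(s, obs)
            ptr[s] = prev
        v.append(col)
        back.append(ptr)
    tags = [max(STATES, key=lambda s: v[-1][s])]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    return tags[::-1]

# (word, pause-before-word in ms): long pauses before "the" and "at".
obs = [("the", 250), ("dog", 20), ("barked", 30), ("at", 180), ("night", 15)]
print(viterbi(obs))  # → ['B', 'I', 'I', 'B', 'I']
```

The decoded tags segment the utterance into [the dog barked] [at night], driven entirely by the pause cue in this toy setting.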
The phylogenetic origin and mechanism of sound symbolism - the role of action-perception circuits
As opposed to the classic Saussurean view on the arbitrary relationship between linguistic form and meaning, non-arbitrariness is a pervasive feature in human language. Sound symbolism, namely the intrinsic relationship between meaningless speech sounds and visual shapes, is a typical case of non-arbitrariness. A demonstration of sound symbolism is the "maluma-takete" effect, in which immanent links are observed between meaningless "round" or "sharp" speech sounds (e.g., maluma vs. takete) and round or sharp abstract visual shapes, respectively. An extensive amount of empirical work suggests that these mappings are shared by humans and play a distinct role in the emergence and acquisition of language. However, important questions are still pending on the origins and mechanism of sound symbolic processing. Those questions are addressed in the present work.
The first part of this dissertation focuses on the validation of sound symbolic effects in a forced choice task, and on the interaction of sound symbolism with two crossmodal mappings shared by humans. To address this question, human subjects were tested with a forced choice task on sound symbolic mappings crossed with two crossmodal audiovisual mappings (pitch-shape and pitch-spatial position). Subjects performed significantly above chance only for the sound symbolic associations but not for the other two mappings. Sound symbolic effects were thus replicated, while the other two crossmodal mappings, involving low-level audiovisual properties such as pitch and spatial position, did not emerge.
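"Significantly above chance" in a two-alternative forced choice task is commonly assessed against a binomial null with p = 0.5. A minimal sketch of such an exact one-sided test follows; the trial counts are illustrative, not the study's data.

```python
from math import comb

def binom_p_above_chance(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Illustrative numbers: 41 congruent choices out of 60 trials,
# well above the chance mean of 30.
print(binom_p_above_chance(41, 60) < 0.01)  # → True
```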
The second issue examined in the present dissertation is the phylogenetic origin of sound symbolic associations. Human subjects and a group of touchscreen-trained great apes were tested with a forced choice task on sound symbolic mappings. Only humans were able to process and/or infer the links between meaningless speech sounds and abstract shapes. These results reveal, for the first time, the specificity of humans' sound symbolic ability, which can be related to neurobiological findings on the distinct development and connectivity of the human language network.
The last part of the dissertation investigates whether action knowledge, and knowledge of the perceptual outputs of our actions, can provide a possible explanation of sound symbolic mappings. In a series of experiments, human subjects performed sound symbolic mappings, and mappings of "round" or "sharp" hand-action sounds with the shapes produced by these hand actions. In addition, the auditory and visual stimuli of both conditions were crossed. Subjects detected congruencies significantly for all mappings, and, most importantly, a positive correlation was observed in their performance across conditions. The physical acoustic and visual similarities between the audiovisual by-products of our hand actions and the sound symbolic pseudowords and shapes indicate that the link between meaningless speech sounds and abstract visual shapes is grounded in action knowledge. From a neurobiological perspective, the link between actions and their audiovisual by-products is also in accordance with distributed action-perception circuits in the human brain. Action-perception circuits, supported by the human neuroanatomical connectivity between auditory, visual, and motor cortices, emerge under associative learning and carry the perceptual and motor knowledge of our actions. These findings give a novel explanation of how symbolic communication is linked to our sensorimotor experiences.
To sum up, the present dissertation (i) validates the presence of sound symbolic effects in a forced choice task, (ii) shows that sound symbolic ability is specific to humans, and (iii) shows that action knowledge can provide the mechanistic glue for mapping meaningless speech sounds to abstract shapes. Overall, the present work contributes to a better understanding of the phylogenetic origins and mechanism of sound symbolic ability in humans.
Neural pathways for visual speech perception
This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread, diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.
Multi-Sensoriality In Language Acquisition: The Relationship Between Selective Visual Attention Towards The Adult's Face And Language Skills
Introduction: Speech is the result of multimodal or multi-sensorial processes. The auditory and visual components of language provide the child with information crucial to the processing of speech. The language acquisition process is influenced by the child's ability to integrate information from multimodal (audio and visual) sources and to focus attention on the relevant cues in the environment; this is selective visual attention. This dissertation explores the relationship between children's selective visual attention and their early language skills. Several recent studies with infant populations have hypothesised or tested the relationship between children's selective visual attention towards specific regions of the talking face (i.e., the eyes or the mouth) and their language skills. These studies have tried to show how concomitant or longitudinal language skills can explain looking behaviours. In most cases, these studies have speculated on how this relationship is mediated by the child's level of language expertise (this is known as the language expertise hypothesis). However, no studies until now, to the best of our knowledge, have investigated the child's linguistic skills using spontaneous language measures. Aims: The dissertation has one broad aim, within which there are three particular aims. The broad aim is to examine the phenomenon of selective visual attention toward the face in both a laboratory and a naturalistic setting, and its relationship with language development. The three particular aims are as follows.
The first aim is to synthesise and analyse the factors that might determine different looking patterns in infants during audiovisual tasks using dynamic faces, and to describe how the literature explains these patterns in relation to aspects of language development. The second aim is to experimentally investigate the child's selective visual attention towards specific regions of the adult's face (the eyes and the mouth) in a task using the eye-tracking method. In particular, the study explores two questions: First, how do age and language condition (exposure to native vs non-native speech) affect looking behaviour in children? Second, are a child's looking behaviours related to vocal production at the time of the experiment and to vocabulary rates three months later, and if so, how? The third aim is to understand whether selective attention towards the face or other parts of the visual scene (i.e., the object or elsewhere) is influenced or explained by the child's vocal skills at the time of the task, and whether the episodes of fixation towards the adult's face can be predicted by specific phonological and semantic properties (i.e., pre-canonical vocalisations, babbling, words) of the child's speech. Method: For the first study, a systematic review of the literature was conducted, exploring four bibliographic databases and using specific inclusion criteria to select the records. For the second study, eye movements towards a dynamic face (on a screen), speaking in the child's native language (Italian) and a non-native language (English), were tracked using an eye-tracker in 26 infants between 6 and 14 months. Two groups were created based on age (G1, M = 7 months, N = 15 infants; G2, M = 12 months, N = 11 infants). Each child's language skill was assessed twice: at the time of the experiment (through direct observation, Time 1) and three months later (through the MB-CDI, Time 2).
Two groups were created based on the child's vocal production (Time 1, latent class cluster analysis): a high class (higher percentage of babbling and words) vs a low class (higher percentage of pre-canonical vocalisations). For the third study, the looking behaviour of 29 children between 12 and 19 months was tracked, using both a stationary video camera and a head-mounted camera on the mother's head during a single object task. During the task, children were exposed to a set of audiovisual stimuli, real words and non-words, chosen based on the parents' reports and their MB-CDI answers. The child's looking behaviour was coded offline second-by-second for a total of 116 sessions. The coding relates to specific areas of interest, i.e., the face, the object or elsewhere. The vocal production of each child was quantified using a LENA device, and their speech during a play period with their mothers was transcribed phonetically. Results: The systematic search of the literature (Chapter 2) identified 19 papers. Some tried to clarify the role played by audiovisual factors in support of speech perception (provided by looking towards the eyes or the mouth of a talking face). Others related selective visual attention towards specific areas of the adult's face to the child's competence in terms of linguistic or social skills, leading to correspondingly different lines of interpretation. The first empirical study (Chapter 3) shows that Italian children older than 12 months displayed a greater interest in the mouth area, especially when they were exposed to their native language. This accords with the more recent literature but contrasts with the language expertise hypothesis. The second significant result of Chapter 3 is that children who had a higher level of production in terms of babbling and words at the time of the experiment looked more towards the mouth area.
The study reported in Chapter 3 also demonstrated a positive association between the child's looking to the mouth and their expressive vocabulary as measured (using the MB-CDI) three months after the experiment. The second empirical study (Chapter 4) shows a significant difference in the looking time towards the adult's face between children with low and high vocal production in a naturalistic setting. More specifically, from this study we find two things. Firstly, the children who produced more advanced vocal forms (a higher amount of babbling and word production) looked more towards the adult's face, especially when exposed to non-words. Secondly, a significant relationship exists between the episodes of fixation towards the adult's face and the child's vocal skills (i.e., pre-canonical vocalisations, babbling, words); babbling productions predicted the episodes of face fixation in the task as a whole, for both words and non-words. Conclusion: Linguistic and social-based hypotheses attempting to explain the differences in the selective visual attention phenomenon emerged from the literature review. The empirical studies presented in this thesis bring two original contributions to this research field. First, our findings reinforce the idea that the mouth and, more generally, the face provide crucial visual cues when acquiring a language. Secondly, our results demonstrate that language knowledge and language skills at the time the child was observed significantly help to explain different looking behaviours. In other words, we can conclude that each child's attention to faces is shaped by their own linguistic characteristics.