
6 research outputs found

Audiovisual integration in speech perception: a multi-stage process

Authors: Andersen Tobias, Eskelund Kasper, Tuomainen Jyrki
Publication date: 01/01/2011

Evidence of correlation between acoustic and visual features of speech

Authors: F Berthommier, J P Barker
Publication date: 01/01/1999

This paper examines the degree of correlation between lip and jaw configuration and speech acoustics. The lip and jaw positions are characterised by a system of measurements taken from video images of the speaker's face and profile, and the acoustics are represented using line spectral pair parameters and a measure of RMS energy. A correlation is found between the measured acoustic parameters and a linear estimate of the acoustics recovered from the visual data. This correlation exists despite the simplicity of the visual representation and is in rough agreement with correlations measured in earlier work by Yehia et al. using different techniques. However, analysis of the estimation errors suggests that the visual information, as parameterised in our experiment, offers only a weak constraint on the acoustics. Results are discussed from the perspective of models of early audio-visual integration.

CiteSeerX

The effect of word meaning on the unity assumption in multisensory perception

Author: Boura Evangelia (Μπούρα Ευαγγελία)
Publication date: 01/01/2022

Signals across different modalities are often unified into a single percept under the influence of stimulus-driven factors, such as temporal synchrony and spatial location, and higher-level cognitive factors, such as the unity assumption. Audiovisual speech is a typical example of such an event, consisting of a visual stimulus (lip and facial movements) and an auditory stimulus (voice). What, then, determines maximal integration: the consistency of the audiovisual signals or their common semantic content? In the present study we examined the effect of words and pseudowords on the unity assumption. The words and pseudowords shared a common temporal structure (two syllables long) and common phonology. Visual and auditory stimuli were matched (video and audio of the same word or pseudoword) and presented at a range of stimulus onset asynchronies. Participants performed a temporal order judgement or a simultaneity judgement task. Our aim was to compare the strength of audiovisual integration between words (which carry meaning) and pseudowords (which do not). We hypothesised that words would induce stronger binding of the audiovisual signals and that participants would therefore be less accurate in both tasks for words than for pseudowords. The hypothesis was confirmed for the simultaneity judgement task. This finding supports the influence of the unity assumption in multisensory integration, showing that spatiotemporal coincidence is not enough for the optimal binding of complex audiovisual stimuli such as speech.

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens

Electrophysiological assessment of audiovisual integration in speech perception

Author: Eskelund Kasper
Publication venue: Technical University of Denmark
Publication date: 01/01/2014

Online Research Database In Technology